date:20130409

Re: Make change_decl_assembler_name functional with inline clones

2013-04-09 Thread Bernhard Reutner-Fischer


On 8 April 2013 22:08:54 Jan Hubicka hubi...@ucw.cz wrote:

Hi,
this patch makes change_decl_assembler_name to do the right thing with inline
clones.  My original plan was to remove inline clones from assembler_name_hash,
but it hits the problem that we currently need to make them unique for purposes
of LTO sreaming.

It is not hard to walk the clone tree and update it.
Later we can reorg streaming to not rely on uniqueness of symbol names of 
function

bodies not associated with a real symbol and perhaps simplify this somewhat.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

PR lto/54095
* symtab.c (insert_to_assembler_name_hash): Handle clones.
(unlink_from_assembler_name_hash): Likewise.
(symtab_prevail_in_asm_name_hash, symtab_register_node,
symtab_unregister_node, symtab_initialize_asm_name_hash,
change_decl_assembler_name): Update.

Index: symtab.c
===
*** symtab.c(revision 197551)
--- symtab.c(working copy)
*** eq_assembler_name (const void *p1, const
*** 102,108 
  /* Insert NODE to assembler name hash.  */

  static void
! insert_to_assembler_name_hash (symtab_node node)
  {
if (is_a varpool_node (node)  DECL_HARD_REGISTER (node-symbol.decl))
  return;
--- 102,108 
  /* Insert NODE to assembler name hash.  */

  static void
! insert_to_assembler_name_hash (symtab_node node, bool with_clones)
  {
if (is_a varpool_node (node)  DECL_HARD_REGISTER (node-symbol.decl))
  return;
*** insert_to_assembler_name_hash (symtab_no
*** 111,116 
--- 111,119 
if (assembler_name_hash)
  {
void **aslot;
+   struct cgraph_node *cnode;
+   tree decl = node-symbol.decl;
+ tree name = DECL_ASSEMBLER_NAME (node-symbol.decl);


Maybe use decl here?
Thanks,


Sent with AquaMail for Android
http://www.aqua-mail.com

Re: FW: [PATCH] Avoid a few find_base_term calls in alias.c

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 9:42 AM, Igor Zamyatin izamya...@gmail.com wrote:
 Richard,

 Any plans to commit your new fix?

I did already.

Richard.


 Thanks,
 Igor

 On Thu, Apr 4, 2013 at 3:33 PM, Yuri Rumyantsev ysrum...@gmail.com
 wrote:
  Hi Richard,
 
  You slightly change behavior of find_base_term, namely before your fix
  we have the following code:
 
  if (REG_P (tmp1)  REG_POINTER (tmp1))  {
 rtx base = find_base_term (tmp1);
 if (base)
   return base;
   }
 
  but in your fix you added the following checks on base:
 
  tmp1 = find_base_term (tmp1);
  if (tmp1 != NULL_RTX
  ((REG_P (tmp1)  REG_POINTER (tmp1))
  || known_base_value_p (tmp1)))
   return tmp1;
  Why you added these checks since function arguments may not considered
  as bases.

 If you continue down the original function there is (quoted fully)

 /* If either operand is known to be a pointer, then use it
to determine the base term.  */
 if (REG_P (tmp1)  REG_POINTER (tmp1))
   {
 rtx base = find_base_term (tmp1);
 if (base)
   return base;
   }

 if (REG_P (tmp2)  REG_POINTER (tmp2))
   {
 rtx base = find_base_term (tmp2);
 if (base)
   return base;
   }

 /* Neither operand was known to be a pointer.  Go ahead and find
 the
base term for both operands.  */
 tmp1 = find_base_term (tmp1);
 tmp2 = find_base_term (tmp2);

 so we call find_base_term on tmp1 anyway, and after the modification

 tmp1 = find_base_term (tmp1);
 if (tmp1 != NULL_RTX
  ((REG_P (tmp1)  REG_POINTER (tmp1))
  || known_base_value_p (tmp1)))
   return tmp1;

 we take it if it is non-NULL and ... ah, I see.  We now check
 REG_P/REG_POINTER on the new value.  That was indeed unintended.

 I'll fix it like below.

 Richard.

 2013-04-04  Richard Biener  rguent...@suse.de

 * alias.c (find_base_term): Fix thinko in previous change.

 Index: gcc/alias.c
 ===
 --- gcc/alias.c (revision 197480)
 +++ gcc/alias.c (working copy)
 @@ -1687,16 +1687,16 @@ find_base_term (rtx x)
term is from a pointer or is a named object or a special
 address
(like an argument or stack reference), then use it for the
base term.  */
 -   tmp1 = find_base_term (tmp1);
 -   if (tmp1 != NULL_RTX
 +   rtx base = find_base_term (tmp1);
 +   if (base != NULL_RTX
  ((REG_P (tmp1)  REG_POINTER (tmp1))
 -|| known_base_value_p (tmp1)))
 - return tmp1;
 -   tmp2 = find_base_term (tmp2);
 -   if (tmp2 != NULL_RTX
 +|| known_base_value_p (base)))
 + return base;
 +   base = find_base_term (tmp2);
 +   if (base != NULL_RTX
  ((REG_P (tmp2)  REG_POINTER (tmp2))
 -|| known_base_value_p (tmp2)))
 - return tmp2;
 +|| known_base_value_p (base)))
 + return base;

 /* We could not determine which of the two operands was the
base register and which was the index.  So we can determine

[PATCH, trivial] Make type_hash_lookup, type_hash_add static

2013-04-09 Thread Maxim Kuvyrkov

This patch makes type_hash_lookup and type_hash_add in tree.c static.  These 
functions are used only within tree.c, and I don't think there is any reason to 
export low-level interface to type hashing anyway.

Will commit in 2 days unless someone complains.  Bootstrapped on 
x86_64-linux-gnu.

Thanks,

--
Maxim Kuvyrkov
KugelWorks




gcc-static-type-hash.ChangeLog
Description: Binary data


gcc-static-type-hash.patch
Description: Binary data

[PATCH] Remove unsave_expr_now

2013-04-09 Thread Richard Biener


In propagate_tree_value and replace_exp_1 operating on SSA form
are the only users of unsave_expr_now where really unsharing
matters.  Thus, use unshare_expr and remove unsave_expr_now.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-04-09  Richard Biener  rguent...@suse.de

* tree.h (unsave_expr_now): Remove.
* tree-inline.c (mark_local_for_remap_r): Remove.
(unsave_expr_1): Likewise.
(unsave_r): Likewise.
(unsave_expr_now): Likewise.
* tree-ssa-copy.c (replace_exp_1): Use unshare_expr.
(propagate_tree_value): Likewise.

Index: gcc/tree.h
===
*** gcc/tree.h  (revision 197574)
--- gcc/tree.h  (working copy)
*** extern void indent_to (FILE *, int);
*** 6018,6024 
  extern bool debug_find_tree (tree, tree);
  /* This is in tree-inline.c since the routine uses
 data structures from the inliner.  */
- extern tree unsave_expr_now (tree);
  extern tree build_duplicate_type (tree);
  
  /* In calls.c */
--- 6018,6023 
Index: gcc/tree-inline.c
===
*** gcc/tree-inline.c   (revision 197574)
--- gcc/tree-inline.c   (working copy)
*** eni_weights eni_time_weights;
*** 114,122 
  static tree declare_return_variable (copy_body_data *, tree, tree, 
basic_block);
  static void remap_block (tree *, copy_body_data *);
  static void copy_bind_expr (tree *, int *, copy_body_data *);
- static tree mark_local_for_remap_r (tree *, int *, void *);
- static void unsave_expr_1 (tree);
- static tree unsave_r (tree *, int *, void *);
  static void declare_inline_vars (tree, tree);
  static void remap_save_expr (tree *, void *, int *);
  static void prepend_lexical_block (tree current_block, tree new_block);
--- 114,119 
*** remap_save_expr (tree *tp, void *st_, in
*** 4465,4601 
*tp = t;
  }
  
- /* Called via walk_tree.  If *TP points to a DECL_STMT for a local label,
-copies the declaration and enters it in the splay_tree in DATA (which is
-really an `copy_body_data *').  */
- 
- static tree
- mark_local_for_remap_r (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED,
-   void *data)
- {
-   copy_body_data *id = (copy_body_data *) data;
- 
-   /* Don't walk into types.  */
-   if (TYPE_P (*tp))
- *walk_subtrees = 0;
- 
-   else if (TREE_CODE (*tp) == LABEL_EXPR)
- {
-   tree decl = TREE_OPERAND (*tp, 0);
- 
-   /* Copy the decl and remember the copy.  */
-   insert_decl_map (id, decl, id-copy_decl (decl, id));
- }
- 
-   return NULL_TREE;
- }
- 
- /* Perform any modifications to EXPR required when it is unsaved.  Does
-not recurse into EXPR's subtrees.  */
- 
- static void
- unsave_expr_1 (tree expr)
- {
-   switch (TREE_CODE (expr))
- {
- case TARGET_EXPR:
-   /* Don't mess with a TARGET_EXPR that hasn't been expanded.
-  It's OK for this to happen if it was part of a subtree that
-  isn't immediately expanded, such as operand 2 of another
-  TARGET_EXPR.  */
-   if (TREE_OPERAND (expr, 1))
-   break;
- 
-   TREE_OPERAND (expr, 1) = TREE_OPERAND (expr, 3);
-   TREE_OPERAND (expr, 3) = NULL_TREE;
-   break;
- 
- default:
-   break;
- }
- }
- 
- /* Called via walk_tree when an expression is unsaved.  Using the
-splay_tree pointed to by ST (which is really a `splay_tree'),
-remaps all local declarations to appropriate replacements.  */
- 
- static tree
- unsave_r (tree *tp, int *walk_subtrees, void *data)
- {
-   copy_body_data *id = (copy_body_data *) data;
-   struct pointer_map_t *st = id-decl_map;
-   tree *n;
- 
-   /* Only a local declaration (variable or label).  */
-   if ((TREE_CODE (*tp) == VAR_DECL  !TREE_STATIC (*tp))
-   || TREE_CODE (*tp) == LABEL_DECL)
- {
-   /* Lookup the declaration.  */
-   n = (tree *) pointer_map_contains (st, *tp);
- 
-   /* If it's there, remap it.  */
-   if (n)
-   *tp = *n;
- }
- 
-   else if (TREE_CODE (*tp) == STATEMENT_LIST)
- gcc_unreachable ();
-   else if (TREE_CODE (*tp) == BIND_EXPR)
- copy_bind_expr (tp, walk_subtrees, id);
-   else if (TREE_CODE (*tp) == SAVE_EXPR
-  || TREE_CODE (*tp) == TARGET_EXPR)
- remap_save_expr (tp, st, walk_subtrees);
-   else
- {
-   copy_tree_r (tp, walk_subtrees, NULL);
- 
-   /* Do whatever unsaving is required.  */
-   unsave_expr_1 (*tp);
- }
- 
-   /* Keep iterating.  */
-   return NULL_TREE;
- }
- 
- /* Copies everything in EXPR and replaces variables, labels
-and SAVE_EXPRs local to EXPR.  */
- 
- tree
- unsave_expr_now (tree expr)
- {
-   copy_body_data id;
- 
-   /* There's nothing to do for NULL_TREE.  */
-   if (expr == 0)
- return expr;
- 
-   /* Set up ID.  */
-   memset (id, 0, sizeof (id));
-   id.src_fn = current_function_decl;
-   id.dst_fn =

Re: [PATCH] Don't forwprop into clobbers in some cases (PR tree-optimization/56854)

2013-04-09 Thread Richard Biener

On Mon, 8 Apr 2013, Jakub Jelinek wrote:

 Hi!
 
 lhs ={v} {CLOBBER};
 stmts right now allow only VAR_DECL or MEM_REF lhs, but the forwprop code
 below on the attached testcase attempts to propagate an ARRAY_REF (of
 MEM_REF) into it.  Fixed by not propagating in that case, allowing arbitrary
 memory lhs is IMHO unnecessary and such lhs's wouldn't be very useful for
 DSE anyway.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2013-04-08  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/56854
   * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Don't
   forward into clobber stmts if it would change MEM_REF lhs into
   non-MEM_REF.
 
   * g++.dg/torture/pr56854.C: New test.
 
 --- gcc/tree-ssa-forwprop.c.jj2013-02-25 23:51:21.0 +0100
 +++ gcc/tree-ssa-forwprop.c   2013-04-08 16:12:37.0 +0200
 @@ -826,7 +826,11 @@ forward_propagate_addr_expr_1 (tree name
   integer_zerop (TREE_OPERAND (lhs, 1))
   useless_type_conversion_p
   (TREE_TYPE (TREE_OPERAND (def_rhs, 0)),
 -  TREE_TYPE (gimple_assign_rhs1 (use_stmt
 +  TREE_TYPE (gimple_assign_rhs1 (use_stmt)))
 +/* Don't forward anything into clobber stmts if it would result
 +   in the lhs no longer being a MEM_REF.  */
 + (!gimple_clobber_p (use_stmt)
 +|| TREE_CODE (TREE_OPERAND (def_rhs, 0)) == MEM_REF))
   {
 tree *def_rhs_basep = TREE_OPERAND (def_rhs, 0);
 tree new_offset, new_base, saved, new_lhs;
 --- gcc/testsuite/g++.dg/torture/pr56854.C.jj 2013-04-08 18:03:37.978009666 
 +0200
 +++ gcc/testsuite/g++.dg/torture/pr56854.C2013-04-08 18:03:09.0 
 +0200
 @@ -0,0 +1,24 @@
 +// PR tree-optimization/56854
 +// { dg-do compile }
 +
 +inline void *
 +operator new (__SIZE_TYPE__, void *p) throw ()
 +{
 +  return p;
 +}
 +
 +struct A
 +{
 +  int a;
 +  A () : a (0) {}
 +  ~A () {}
 +  A operator= (const A v) { this-~A (); new (this) A (v); return *this; }
 +};
 +A b[4], c[4];
 +
 +void
 +foo ()
 +{
 +  for (int i = 0; i  4; ++i)
 +c[i] = b[i];
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: Make lto-symtab to ignore conflicts in static functions

2013-04-09 Thread Richard Biener

On Mon, 8 Apr 2013, Jan Hubicka wrote:

 Hi,
 currently lto-symtab is trying to resolve all duplicated declarations,
 including static variables where such duplicates should not happen.
 This conflicts with the plan to solve PR54095 by postponning renaming to
 the partitioning.  This patch adds lto_symtab_symbol_p that disable merging
 on statics and keeps duplicate entries for a given asm name.
 
 Boostrapped/regtested x86_64-linux, OK?

Ok.

Thanks,
Richard.

 Honza
 
   PR lto/54095
   lto-symtab.c (lto_symtab_symbol_p): New function.
   (lto_symtab_resolve_can_prevail_p, lto_symtab_resolve_symbols,
   lto_symtab_resolve_symbols, lto_symtab_merge_decls_2,
   lto_symtab_merge_decls_1, lto_symtab_merge_cgraph_nodes_1):
   Skip static symbols.
   
 Index: lto-symtab.c
 ===
 *** lto-symtab.c  (revision 197551)
 --- lto-symtab.c  (working copy)
 *** lto_symtab_resolve_replaceable_p (symtab
 *** 226,237 
 return false;
   }
   
   /* Return true if the symtab entry E can be the prevailing one.  */
   
   static bool
   lto_symtab_resolve_can_prevail_p (symtab_node e)
   {
 !   if (!symtab_real_symbol_p (e))
   return false;
   
 /* The C++ frontend ends up neither setting TREE_STATIC nor
 --- 226,249 
 return false;
   }
   
 + /* Return true, if the symbol E should be resolved by lto-symtab.
 +Those are all real symbols that are not static (we handle renaming
 +of static later in partitioning).  */
 + 
 + static bool
 + lto_symtab_symbol_p (symtab_node e)
 + {
 +   if (!TREE_PUBLIC (e-symbol.decl))
 + return false;
 +   return symtab_real_symbol_p (e);
 + }
 + 
   /* Return true if the symtab entry E can be the prevailing one.  */
   
   static bool
   lto_symtab_resolve_can_prevail_p (symtab_node e)
   {
 !   if (!lto_symtab_symbol_p (e))
   return false;
   
 /* The C++ frontend ends up neither setting TREE_STATIC nor
 *** lto_symtab_resolve_symbols (symtab_node
 *** 261,267 
   
 /* Always set e-node so that edges are updated to reflect decl merging. 
 */
 for (e = first; e; e = e-symbol.next_sharing_asm_name)
 ! if (symtab_real_symbol_p (e)
(e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY
   || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP
   || e-symbol.resolution == LDPR_PREVAILING_DEF))
 --- 273,279 
   
 /* Always set e-node so that edges are updated to reflect decl merging. 
 */
 for (e = first; e; e = e-symbol.next_sharing_asm_name)
 ! if (lto_symtab_symbol_p (e)
(e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY
   || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP
   || e-symbol.resolution == LDPR_PREVAILING_DEF))
 *** lto_symtab_resolve_symbols (symtab_node
 *** 275,281 
   {
 /* Assert it's the only one.  */
 for (e = prevailing-symbol.next_sharing_asm_name; e; e = 
 e-symbol.next_sharing_asm_name)
 ! if (symtab_real_symbol_p (e)
(e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY
   || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP
   || e-symbol.resolution == LDPR_PREVAILING_DEF))
 --- 287,293 
   {
 /* Assert it's the only one.  */
 for (e = prevailing-symbol.next_sharing_asm_name; e; e = 
 e-symbol.next_sharing_asm_name)
 ! if (lto_symtab_symbol_p (e)
(e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY
   || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP
   || e-symbol.resolution == LDPR_PREVAILING_DEF))
 *** lto_symtab_resolve_symbols (symtab_node
 *** 310,317 
 /* Do a second round choosing one from the replaceable prevailing decls.  
 */
 for (e = first; e; e = e-symbol.next_sharing_asm_name)
   {
 !   if (!lto_symtab_resolve_can_prevail_p (e)
 !   || !symtab_real_symbol_p (e))
   continue;
   
 /* Choose the first function that can prevail as prevailing.  */
 --- 322,328 
 /* Do a second round choosing one from the replaceable prevailing decls.  
 */
 for (e = first; e; e = e-symbol.next_sharing_asm_name)
   {
 !   if (!lto_symtab_resolve_can_prevail_p (e))
   continue;
   
 /* Choose the first function that can prevail as prevailing.  */
 *** lto_symtab_merge_decls_2 (symtab_node fi
 *** 365,375 
 /* Try to merge each entry with the prevailing one.  */
 for (e = prevailing-symbol.next_sharing_asm_name;
  e; e = e-symbol.next_sharing_asm_name)
 ! {
 !   if (!lto_symtab_merge (prevailing, e)
 !!diagnosed_p)
 ! mismatches.safe_push (e-symbol.decl);
 ! }
 if (mismatches.is_empty ())
   return;
   
 --- 376,387 
 /* Try to merge each entry with the prevailing one.  */
 for (e = prevailing-symbol.next_sharing_asm_name;

Re: [PATCH, combine] Fix host-specific behavior in simplify_compare_const()

2013-04-09 Thread Eric Botcazou

 The attached patch has passed bootstrap/regression test
 on x86_64-unknown-linux-gnu with current main trunk.
 A plaintext gcc/ChangeLog is as below:
 
 
 2013-04-06  Chung-Ju Wu  jasonw...@gmail.com
 
 * combine.c (simplify_compare_const): Use GET_MODE_MASK to filter
 out unnecessary bits in the constant power of two case.
 
 
 On behalf of Andes Technology Co., we have signed FSF agreement.
 However, so far I don't have svn write access yet.
 Would you please help to commit this patch?

Jeff has kindly offered to sponsor you for write access, so you should be able 
to install it yourself.  Otherwise I'll be happy to do it for you.

Thanks for fixing this problem in the combiner.

-- 
Eric Botcazou

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Richard Biener

On Mon, Apr 8, 2013 at 5:10 PM, Joern Rennecke
joern.renne...@embecosm.com wrote:
 This is basically the same patch as attached to the PR, except that I
 have changed the goto-loop into a do-while loop with a new comment;
 this caused the need for a lot of reformatting.

Can you please include a testcase that shows the effect of this?

 bootstrapped  regtested on i686-pc-linux-gnu.

 2013-04-08  Joern Rennecke  joern.renne...@embecosm.com

 * tree-ssa-math-opts.c (mult_to_fma_pass): New file static struct.
 (convert_mult_to_fma): In first pass, don't use an fms construct
 when we don't have an fms operation, but fmna.

it's fnma I believe.

 (execute_optimize_widening_mul): Add a second pass if
 convert_mult_to_fma requests it.

 Index: gcc/tree-ssa-math-opts.c
 ===
 --- gcc/tree-ssa-math-opts.c(revision 197578)
 +++ gcc/tree-ssa-math-opts.c(working copy)
 @@ -2461,6 +2461,12 @@ convert_plusminus_to_widen (gimple_stmt_
return true;
  }

 +static struct
 +{
 +  bool second_pass;
 +  bool retry_request;
 +} mult_to_fma_pass;
 +
  /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2
 with uses in additions and subtractions to form fused multiply-add
 operations.  Returns true if successful and MUL_STMT should be removed.
 */
 @@ -2570,6 +2576,22 @@ convert_mult_to_fma (gimple mul_stmt, tr
   return false;
 }

 +  /* If the subtrahend (gimple_assign_rhs2 (use_stmt)) is computed
 +by a MULT_EXPR that we'll visit later, we might be able to
 +get a more profitable match with fnma.
 +OTOH, if we don't, a negate / fma pair has likely lower latency
 +that a mult / subtract pair.  */

This makes it sound that this is purely an ordering issue, thus for

  x = a * b;
  y = c * d;
  z = x - y;

instead of

  y = c * d;
  z = a * b + (-y);

you want to generate

  x = a * b;
  z = -(c * d + (-x));

I fail to see why you need two passes for this rather than considering the
case that the immediate use stmt of the multiplication we start from
combines another multiplication with a MINUS_EXPR.  That is ...

 +  if (use_code == MINUS_EXPR  !negate_p
 +  gimple_assign_rhs1 (use_stmt) == result
 +  optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing
 +  optab_handler (fnma_optab, TYPE_MODE (type)) !=
 CODE_FOR_nothing
 +  mult_to_fma_pass.second_pass == false)

see if that is the case here and simply not do the transform.  A second pass
will not recover from that without destroying the fnma pattern (testcase?)

Richard.

 +   {
 + /* ??? Could make setting of retry_request dependent on some
 +rtx_cost measure we evaluate beforehand.  */
 + mult_to_fma_pass.retry_request = true;
 + return false;
 +   }
/* We can't handle a * b + a * b.  */
if (gimple_assign_rhs1 (use_stmt) == gimple_assign_rhs2 (use_stmt))
 return false;
 @@ -2657,76 +2679,89 @@ execute_optimize_widening_mul (void)

memset (widen_mul_stats, 0, sizeof (widen_mul_stats));

 -  FOR_EACH_BB (bb)
 -{
 -  gimple_stmt_iterator gsi;

 -  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
 -{
 - gimple stmt = gsi_stmt (gsi);
 - enum tree_code code;
 +  /* We may run one or two passes.  In the first pass, if have fnma,
 + but not fms, we don't synthesize fms so that we can get the maximum
 + matches for fnma.  If we have therefore skipped opportunities to
 + synthesize fms, we'll run a second pass where we use any such
 + opportunities that still remain.  */
 +  mult_to_fma_pass.retry_request = false;
 +  do
 +{
 +  mult_to_fma_pass.second_pass = mult_to_fma_pass.retry_request;
 +  FOR_EACH_BB (bb)
 +   {
 + gimple_stmt_iterator gsi;

 - if (is_gimple_assign (stmt))
 + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
 {
 - code = gimple_assign_rhs_code (stmt);
 - switch (code)
 + gimple stmt = gsi_stmt (gsi);
 + enum tree_code code;
 +
 + if (is_gimple_assign (stmt))
 {
 -   case MULT_EXPR:
 - if (!convert_mult_to_widen (stmt, gsi)
 -  convert_mult_to_fma (stmt,
 - gimple_assign_rhs1 (stmt),
 - gimple_assign_rhs2 (stmt)))
 + code = gimple_assign_rhs_code (stmt);
 + switch (code)
 {
 - gsi_remove (gsi, true);
 - release_defs (stmt);
 - continue;
 -   }
 - break;
 -
 -   case PLUS_EXPR:
 -   case MINUS_EXPR:
 - convert_plusminus_to_widen (gsi, stmt, code);
 -

Re: useless cast blocking some optimization in gcc 4.7.3

2013-04-09 Thread Richard Biener

On Mon, Apr 8, 2013 at 9:13 PM, Laurent Alfonsi laurent.alfo...@st.com wrote:
 Hello,

 I have identified a big performance regression between 4.6 and 4.7. (I have
 enclosed a pathological test).

 After investigation, it is because of the += statement applied on 2 signed
 chars.
   - It is now type-promoted to int when it is written result += foo().
 (since 4.7)
   - it is type promoted to unsigned char when it is written result =
 result + foo().

 The char-int-char cast is blocking some optimizations in later phases.
 Anyway, this doesn't look wrong, so I extended fold optimization in order to
 catch this case. (patch enclosed)
 The patch basically transforms :
 (TypeA)  ( (TypeB) a1 + (TypeB) a2 )/* with a1 and a2 of
 the signed type TypeA */
 into :
 a1 + a2

 I believe this is legal for any licit a1/a2 input values (no overflow on
 signed char).
 No new failure on the two tested targets : sh-superh-elf and
 x86_64-unknown-linux-gnu.
 Should I enter a bugzilla to track this ? Is it ok for trunk ?

Please open a bugzilla.  No, the patch is not ok, as said by Marc.
It's possible to shorten the operation by performing it using
unsigned arithmetic, that is (signed)((unsigned)a1 + (unsigned)a2).
But if really casts cause the issue you are seeing that will not
help you.

Richard.

 2013-04-08  Laurent Alfonsi  laurent.alfo...@st.com

* fold-const.c (fold_unary_loc): Suppress useless type promotion.


 Thanks,
 Laurent

Re: [i386] Replace builtins with vector extensions

2013-04-09 Thread Richard Biener

On Mon, Apr 8, 2013 at 10:47 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Sun, 7 Apr 2013, Marc Glisse wrote:

  extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__,
 __artificial__))
  _mm_slli_epi16 (__m128i __A, int __B)
  {
 -  return (__m128i)__builtin_ia32_psllwi128 ((__v8hi)__A, __B);
 +  return (__m128i) ((__v8hi)__A  __B);
  }


 Actually, I believe I have to keep using the builtins for shifts, because
 the intrinsics have well defined behavior for large __B whereas  and 
 don't.

I seem to remember discussion in the PR(s) that the intrinsics should
(and do for other compilers) expand to the desired instructions even when
the corresponding instruction set is disabled.  Using vector extension
makes that harder to achieve.  Other than that I am all for using the
vector extensions, but I think you need carefully wrapped __extension__
markers so that with -std=c89 -pedantic you still can compile programs
using the intrinsics?

Richard.

 --
 Marc Glisse

Re: [patch][sparc] define_c_enum for UNSPEC/UNSPECV

2013-04-09 Thread Eric Botcazou

 * config/sparc/sparc.md: Use define_c_enum for unspec and
 unspecv.

OK, thanks.

-- 
Eric Botcazou

Re: Comments on the suggestion to use infinite precision math for wide int.

2013-04-09 Thread Richard Biener

On Mon, Apr 8, 2013 at 10:39 PM, Lawrence Crowl cr...@google.com wrote:
 On 4/8/13, Kenneth Zadeck zad...@naturalbridge.com wrote:
 The other problem, which i invite you to use the full power of
 your c++ sorcery on, is the one where defining an operator so
 that wide-int + unsigned hwi is either rejected or properly
 zero extended.  If you can do this, I will go along with
 your suggestion that the internal rep should be sign extended.
 Saying that constants are always sign extended seems ok, but there
 are a huge number of places where we convert unsigned hwis as
 the second operand and i do not want that to be a trap.  I went
 thru a round of this, where i did not post the patch because i
 could not make this work.  And the number of places where you
 want to use an hwi as the second operand dwarfs the number of
 places where you want to use a small integer constant.

 You can use overloading, as in the following, which actually ignores
 handling the sign in the representation.

 class number {
 unsigned int rep1;
 int representation;
 public:
 number(int arg) : representation(arg) {}
 number(unsigned int arg) : representation(arg) {}
 friend number operator+(number, int);
 friend number operator+(number, unsigned int);
 friend number operator+(int, number);
 friend number operator+(unsigned int, number);
 };

 number operator+(number n, int si) {
 return n.representation + si;
 }

 number operator+(number n, unsigned int ui) {
 return n.representation + ui;
 }

 number operator+(int si, number n) {
 return n.representation + si;
 }

 number operator+(unsigned int ui, number n) {
 return n.representation + ui;
 }

That does not work for types larger than int/unsigned int as HOST_WIDE_INT
usually is (it's long / unsigned long).  When you pass an int or unsigned int
to

number operator+(unsigned long ui, number n);
number operator+(long ui, number n)

you get an ambiguity.  You can fix that by relying on template argument
deduction and specialization instead of on overloading and integer conversion
rules.

 If the argument type is of a template type parameter, then
 you can test the template type via

 if (std::is_signedT::value)
    // sign extend
 else
    // zero extend

 See http://www.cplusplus.com/reference/type_traits/is_signed/.

Yes, if we want to use the standard library.  For what integer types
is std::is_signed required to be implemented in C++98 (long long?)?
Consider non-GCC host compilers.

Richard.

 If you want to handle non-builtin types that are asigne dor unsigned,
 then you need to add a specialization for is_signed.

 --
 Lawrence Crowl

Re: [RFA][PATCH] Improve VRP of COND_EXPR_CONDs -- v2

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 3:54 AM, Jeff Law l...@redhat.com wrote:

 This incorporates the concrete suggestions from Steven  Richi -- it doesn't
 do any refactoring of the VRP code.  There's still stuff I'm looking at that
 might directly lead to some refactoring.  In the mean time I'm submitting
 the obvious small improvements.

 Bootstrapped and regression tested on x86_64-unknown-linux-gnu.

 OK for trunk?

Ok.

Thanks,
Richard.

 Jeff

 commit d6d1e36561b9022bbcdf157a886895f5bb0ef2ae
 Author: Jeff Law l...@redhat.com
 Date:   Sat Apr 6 06:46:58 2013 -0600

* tree-vrp.c (simplify_cond_using_ranges): Simplify test of
 boolean
when the boolean was created by converting a wider object which
had a boolean range.

 * gcc.dg/tree-ssa/vrp87.c: New test

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 6ee7d9c..110f61e 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -16,8 +16,14 @@

  2013-04-08  Jeff Law  l...@redhat.com

 +   * tree-vrp.c (simplify_cond_using_ranges): Simplify test of boolean
 +   when the boolean was created by converting a wider object which
 +   had a boolean range.
 +
 +2013-04-08  Jeff Law  l...@redhat.com
 +
 * gimple.c (canonicalize_cond_expr_cond): Rewrite x ^ y into x != y.
 -
 +
  2013-04-08  Richard Biener  rguent...@suse.de

 * gimple-pretty-print.c (debug_gimple_stmt): Do not print
 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c
 b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c
 new file mode 100644
 index 000..7feff81
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c
 @@ -0,0 +1,81 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-vrp2-details -fdump-tree-cddce2-details }
 */
 +
 +struct bitmap_head_def;
 +typedef struct bitmap_head_def *bitmap;
 +typedef const struct bitmap_head_def *const_bitmap;
 +
 +
 +typedef unsigned long BITMAP_WORD;
 +typedef struct bitmap_element_def
 +{
 +  struct bitmap_element_def *next;
 +  unsigned int indx;
 +  BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))];
 +} bitmap_element;
 +
 +
 +
 +
 +
 +
 +typedef struct bitmap_head_def
 +{
 +  bitmap_element *first;
 +
 +} bitmap_head;
 +
 +
 +
 +static __inline__ unsigned char
 +bitmap_elt_ior (bitmap dst, bitmap_element * dst_elt,
 +   bitmap_element * dst_prev, const bitmap_element * a_elt,
 +   const bitmap_element * b_elt, unsigned char changed)
 +{
 +
 +  if (a_elt)
 +{
 +
 +  if (!changed  dst_elt)
 +   {
 + changed = 1;
 +   }
 +}
 +  else
 +{
 +  changed = 1;
 +}
 +  return changed;
 +}
 +
 +unsigned char
 +bitmap_ior_into (bitmap a, const_bitmap b)
 +{
 +  bitmap_element *a_elt = a-first;
 +  const bitmap_element *b_elt = b-first;
 +  bitmap_element *a_prev = ((void *) 0);
 +  unsigned char changed = 0;
 +
 +  while (b_elt)
 +{
 +
 +  if (!a_elt || a_elt-indx == b_elt-indx)
 +   changed = bitmap_elt_ior (a, a_elt, a_prev, a_elt, b_elt, changed);
 +  else if (a_elt-indx  b_elt-indx)
 +   changed = 1;
 +  b_elt = b_elt-next;
 +
 +
 +}
 +
 +  return changed;
 +}
 +
 +/* Verify that VRP simplified an if statement.  */
 +/* { dg-final { scan-tree-dump Folded into: if.* vrp2} } */
 +/* Verify that DCE after VRP2 eliminates a dead conversion
 +   to a (Bool).  */
 +/* { dg-final { scan-tree-dump Deleting.*_Bool.*; cddce2} } */
 +/* { dg-final { cleanup-tree-dump vrp2 } } */
 +/* { dg-final { cleanup-tree-dump cddce2 } } */
 +
 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index 250a506..4520c89 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -8584,6 +8584,45 @@ simplify_cond_using_ranges (gimple stmt)
 }
  }

 +  /* If we have a comparison of a SSA_NAME boolean against
 + a constant (which obviously must be [0..1]), see if the
 + SSA_NAME was set by a type conversion where the source
 + of the conversion is another SSA_NAME with a range [0..1].
 +
 + If so, we can replace the SSA_NAME in the comparison with
 + the RHS of the conversion.  This will often make the type
 + conversion dead code which DCE will clean up.  */
 +  if (TREE_CODE (op0) == SSA_NAME
 +   (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
 + || (INTEGRAL_TYPE_P (TREE_TYPE (op))
 +  TYPE_PRECISION (TREE_TYPE (op0)) == 1))
 +   TREE_CODE (op1) == INTEGER_CST)
 +{
 +  gimple def_stmt = SSA_NAME_DEF_STMT (op0);
 +  tree innerop;
 +
 +  if (!is_gimple_assign (def_stmt)
 + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
 +   return false;
 +
 +  innerop = gimple_assign_rhs1 (def_stmt);
 +
 +  if (TREE_CODE (innerop) == SSA_NAME)
 +   {
 + value_range_t *vr = get_value_range (innerop);
 +
 + if (range_int_cst_p (vr)
 +  operand_equal_p (vr-min, integer_zero_node, 0)
 +  operand_equal_p (vr-max, integer_one_node, 0))
 +   {
 + tree newconst = fold_convert

Re: [PATCH v3]IPA: fixing inline fail report caused by overwritable functions.

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 5:40 AM, Zhouyi Zhou zhouzho...@gmail.com wrote:
 On Mon, Apr 8, 2013 at 5:48 PM, Richard Biener richard.guent...@gmail.com 
 wrote:
Can you trigger this message to show up with -Winline before/after the patch?
Can you please add a testcase then?
 Thanks Richard for reviewing, from my point of view about gcc and my invoking 
 of gcc, -Winline
 only works on callees that be declared inline, but if the callee is declared
 inline, it will be AVAIL_AVAILABLE in function can_inline_edge_p, thus out 
 of the
 range of my patch.

Ah, indeed ...

 So I only add a testcase for fixing the tree dump, are there any thing more I 
 can do?

No.  Patch is ok.

Thanks,
Richard.

 Regtested/bootstrapped on x86_64-linux

 ChangeLog:
 2013-04-08 Zhouyi Zhou yizhouz...@ict.ac.cn
* cif-code.def (OVERWRITABLE): correct the comment for overwritable
 function
* ipa-inline.c (can_inline_edge_p): let dump mechanism report the 
 inline
fail caused by overwritable functions.
* gcc.dg/tree-ssa/inline-11.c: New test

 Index: gcc/cif-code.def
 ===
 --- gcc/cif-code.def(revision 197549)
 +++ gcc/cif-code.def(working copy)
 @@ -48,7 +48,7 @@ DEFCIFCODE(REDEFINED_EXTERN_INLINE,
  /* Function is not inlinable.  */
  DEFCIFCODE(FUNCTION_NOT_INLINABLE, N_(function not inlinable))

 -/* Function is not overwritable.  */
 +/* Function is overwritable.  */
  DEFCIFCODE(OVERWRITABLE, N_(function body can be overwritten at link time))

  /* Function is not an inlining candidate.  */
 Index: gcc/testsuite/gcc.dg/tree-ssa/inline-11.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/inline-11.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/inline-11.c   (working copy)
 @@ -0,0 +1,13 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-einline } */
 +int w;
 +int bar (void) __attribute__ ((weak));
 +int bar (){
 +  w++;
 +}
 +void foo()
 +{
 +  bar();
 +}
 +/* { dg-final { scan-tree-dump-times function body can be overwritten at 
 link time 1 einline } } */
 +/* { dg-final { cleanup-tree-dump einline } } */
 Index: gcc/ipa-inline.c
 ===
 --- gcc/ipa-inline.c(revision 197549)
 +++ gcc/ipa-inline.c(working copy)
 @@ -266,7 +266,7 @@ can_inline_edge_p (struct cgraph_edge *e
else if (avail = AVAIL_OVERWRITABLE)
  {
e-inline_failed = CIF_OVERWRITABLE;
 -  return false;
 +  inlinable = false;
  }
else if (e-call_stmt_cannot_inline_p)
  {

Re: [patch][sparc] remove sparc machine_reorg, add machine specifc pass after delayed-branch scheduling

2013-04-09 Thread Eric Botcazou

 Bootstrapped and tested on sparc64-unknown-linux-gnu.
 OK for trunk?

OK modulo a couple of nits:

@@ -804,6 +801,136 @@ char sparc_hard_reg_printed[8];
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
+/* We use the machine specific reorg pass to enable workarounds for errata.
+   We need to have the (essentially) final form of the insn stream in order
+   to properly detect the various hazards.  Therefore, this machine specific
+   pass runs as late as possible.  The pass is inserted in the pass pipeline
+   at the end of sparc_options_override().  */

We use a machine specific pass to enable...  No () for functions in comments.

-- 
Eric Botcazou

Re: [PATCH, trivial] Make type_hash_lookup, type_hash_add static

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 10:20 AM, Maxim Kuvyrkov ma...@kugelworks.com wrote:
 This patch makes type_hash_lookup and type_hash_add in tree.c static.  These 
 functions are used only within tree.c, and I don't think there is any reason 
 to export low-level interface to type hashing anyway.

 Will commit in 2 days unless someone complains.  Bootstrapped on 
 x86_64-linux-gnu.

Ok.

Thanks,
Richard.

 Thanks,

 --
 Maxim Kuvyrkov
 KugelWorks

Re: [i386] Replace builtins with vector extensions

2013-04-09 Thread Marc Glisse


On Tue, 9 Apr 2013, Richard Biener wrote:


On Mon, Apr 8, 2013 at 10:47 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Sun, 7 Apr 2013, Marc Glisse wrote:


 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
 _mm_slli_epi16 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_psllwi128 ((__v8hi)__A, __B);
+  return (__m128i) ((__v8hi)__A  __B);
 }



Actually, I believe I have to keep using the builtins for shifts, because
the intrinsics have well defined behavior for large __B whereas  and 
don't.


I seem to remember discussion in the PR(s) that the intrinsics should
(and do for other compilers) expand to the desired instructions even when
the corresponding instruction set is disabled.


emmintrin.h starts with:
#ifndef __SSE2__
# error SSE2 instruction set not enabled

The closest thing I can think of is issues with -mfpmath=387, but that 
shouldn't matter for full vector ops.



Using vector extension
makes that harder to achieve.  Other than that I am all for using the
vector extensions, but I think you need carefully wrapped __extension__
markers so that with -std=c89 -pedantic you still can compile programs
using the intrinsics?


The *intrin.h files already use __extension__ to create vectors, like:
  return __extension__ (__m128d){ __F, 0.0 };
but even when I remove it it does not warn with -std=c89 -pedantic.

--
Marc Glisse

Re: [patch] Hash table changes from cxx-conversion branch

2013-04-09 Thread Richard Biener

On Mon, Apr 8, 2013 at 11:45 PM, Lawrence Crowl cr...@googlers.com wrote:
 Ping?

You didn't commit the ones I already approved?  I don't want to go over
them again ...

Richard.

 On 3/31/13, Lawrence Crowl cr...@googlers.com wrote:
 On 3/28/13, Richard Biener richard.guent...@gmail.com wrote:
 On Mar 27, 2013 Lawrence Crowl cr...@googlers.com wrote:
  On 3/27/13, Richard Biener richard.guent...@gmail.com wrote:
   On Mar 23, 2013 Lawrence Crowl cr...@googlers.com wrote:
This patch is a consolodation of the hash_table patches to
the cxx-conversion branch.
   
Update various hash tables from htab_t to hash_table.
Modify types and calls to match.
  
   Ugh.  Can you split it up somewhat ... like split target bits
   away at least?  Targets may prefer to keep the old hashes for
   ease of branch maintainance.
 
  I will do that.
 
* tree-ssa-live.c'var_map_base_init::tree_to_index
   
New struct tree_int_map_hasher.
  
   I think this wants to be generalized - we have the common
   tree_map/tree_decl_map and tree_int_map maps in tree.h -
   those (and its users) should be tackled in a separate patch
   by providing common hashtable trails implementations.
 
  I will investigate for a separate patch.
 
Remove unused:
   
htab_t scop::original_pddrs
SCOP_ORIGINAL_PDDRS
   
Remove unused:
   
insert_loop_close_phis
insert_guard_phis
debug_ivtype_map
ivtype_map_elt_info
new_ivtype_map_elt
  
   Unused function/type removal are obvious changes.
  
Remove unused:
dse.c bitmap clear_alias_sets
dse.c bitmap disqualified_clear_alias_sets
dse.c alloc_pool clear_alias_mode_pool
dse.c dse_step2_spill
dse.c dse_step5_spill
graphds.h htab_t graph::indices
  
   See above.
 
  It wasn't obvious that the functions could be removed.  :-)
 
  Are you saying you don't want these notations in the description?

 No, I was saying that removal of unused functions / types should be
 committed separately and do not need approval as they are obvious.
 If they are not obvious (I didn't look at that patch part),
 then posting separately still helps ;)

 I've split out the removals to separate patches.  The remaining
 work is in two independent pieces.  The changes within the config
 directory and the changes outside that directory.  The descriptions
 and patch are attached compressed due to mailer size issues.

 Okay for trunk?

 --
 Lawrence Crowl



 --
 Lawrence Crowl

Re: [i386] Replace builtins with vector extensions

2013-04-09 Thread Jakub Jelinek

On Tue, Apr 09, 2013 at 11:08:38AM +0200, Marc Glisse wrote:
 The *intrin.h files already use __extension__ to create vectors, like:
   return __extension__ (__m128d){ __F, 0.0 };
 but even when I remove it it does not warn with -std=c89 -pedantic.

Even with -Wsystem-headers ?

Jakub

Re: [i386] Replace builtins with vector extensions

2013-04-09 Thread Marc Glisse


On Tue, 9 Apr 2013, Jakub Jelinek wrote:


On Tue, Apr 09, 2013 at 11:08:38AM +0200, Marc Glisse wrote:

The *intrin.h files already use __extension__ to create vectors, like:
  return __extension__ (__m128d){ __F, 0.0 };
but even when I remove it it does not warn with -std=c89 -pedantic.


Even with -Wsystem-headers ?


Oups ;-)

Ok, removing the existing __extension__ causes warnings (note that it can 
easily be worked around by initializing a variable instead of this 
compound literal, so it isn't vectors that pedantic complains about), but 
my changes do not warn.


--
Marc Glisse

Re: Comments on the suggestion to use infinite precision math for wide int.

2013-04-09 Thread Florian Weimer


On 04/09/2013 01:47 AM, Robert Dewar wrote:

Well the back end has all the information to figure this out I think!
But anyway, for Ada, the current situation is just fine, and has
the advantage that the -gnatG expanded code listing clearly shows in
Ada source form, what is going on.


Isn't this a bit optimistic, considering that run-time overflow checking 
currently does not use existing hardware support?


--
Florian Weimer / Red Hat Product Security Team

Re: [PATCH] Fix PR48762

2013-04-09 Thread Eric Botcazou

 Alternatively, we can bump the minimum of that param, as usual ;)

Let's do that and bump it to 1, my understanding is that 0 and 1 are 
equivalent for this param.

-- 
Eric Botcazou

Re: [Fortran, RFC patch] Document naming and argument passing convention

2013-04-09 Thread Janne Blomqvist

On Mon, Apr 8, 2013 at 5:38 PM, Tobias Burnus bur...@net-b.de wrote:
 Dear all,

 attached is an updated version of the patch, which address the raised issues
 and some minor problems and omissions I found.

 OK for the trunk?

+For Boolean (@code{LOGICAL}) arguments, please note that GCC expects
+only the integer value 0 and 1.  If a GNU Fortran @code{LOGICAL}
+variable contains another integer value, the result is undefined.
+As some other Fortran compilers use @math{-1} for @code{.TRUE.},
+extra care has to be taken -- such as passing the value as
+@code{INTEGER}.  (The same value restriction also applies to other
+front ends of GCC, e.g. to GCC's C99 compiler for @code{_Bool}
+or GCC's Ada compiler for @code{Complex}.)

Presumably you meant Ada's @code{Bool} (or whatever the Ada boolean
type is called)?

Ok with that change.

--
Janne Blomqvist

Re: [Fortran, RFC patch] Document naming and argument passing convention

2013-04-09 Thread Tobias Burnus


Janne Blomqvist wrote:

+For Boolean (@code{LOGICAL}) arguments, please note that GCC expects
+only the integer value 0 and 1.  If a GNU Fortran @code{LOGICAL}
+variable contains another integer value, the result is undefined.
+As some other Fortran compilers use @math{-1} for @code{.TRUE.},
+extra care has to be taken -- such as passing the value as
+@code{INTEGER}.  (The same value restriction also applies to other
+front ends of GCC, e.g. to GCC's C99 compiler for @code{_Bool}
+or GCC's Ada compiler for @code{Complex}.)

Presumably you meant Ada's @code{Bool} (or whatever the Ada boolean
type is called)?
Ok with that change.


Boolean. Committed as Rev. 197624.

Thanks for the review!

Tobias

Re: [patch libgcc]: Adjust cygming-crtbegin code to use weak

2013-04-09 Thread Kai Tietz

2013/4/9 Dave Korn dave.korn.cyg...@gmail.com:
 On 22/03/2013 08:44, Kai Tietz wrote:

 2013-03-22  Kai Tietz  kti...@redhat.com

   * config/i386/cygming-crtbegin.c (__register_frame_info): Make weak.
   (__deregister_frame_info): Likewise.

 Hi Kai,

   I read your explanation of the problem relating to x86-64 memory models over
 on the Cygwin dev list, and that explained your motivation for making this
 change; I see why it's not easy to get an *ABS* 0 reference there.  So,
 providing dummy versions of the functions makes perfect sense to me, and
 certainly won't cause problems for i686.  (I did a lot of testing, and the
 only problem I found is that a weak definition has to be provided on the
 linker command line *after* the file that contains the weak-with-zero-default
 definition if it is to override that; in the case here however we're going to
 be overriding the weak-with-default by a strong function declaration, so that
 issue does not arise.)

   I still have a comment or two about the patch itself:

 Index: libgcc/config/i386/cygming-crtbegin.c
 ===
 --- libgcc/config/i386/cygming-crtbegin.c (Revision 196898)
 +++ libgcc/config/i386/cygming-crtbegin.c (Arbeitskopie)
 @@ -46,15 +46,33 @@ see the files COPYING3 and COPYING.RUNTIME respect
  #define LIBGCJ_SONAME libgcj_s.dll
  #endif

 -
 +#if DWARF2_UNWIND_INFO
  /* Make the declarations weak.  This is critical for
 _Jv_RegisterClasses because it lives in libgcj.a  */
 -extern void __register_frame_info (const void *, struct object *)
 +extern void __register_frame_info (__attribute__((unused)) const void *,
 +__attribute__((unused)) struct object *)
  TARGET_ATTRIBUTE_WEAK;
 -extern void *__deregister_frame_info (const void *)
 +extern void *__deregister_frame_info (__attribute__((unused)) const void *)
 TARGET_ATTRIBUTE_WEAK;
 -extern void _Jv_RegisterClasses (const void *) TARGET_ATTRIBUTE_WEAK;
 +TARGET_ATTRIBUTE_WEAK void
 +__register_frame_info (__attribute__((unused)) const void *p,
 +__attribute__((unused)) struct object *o)
 +{}

   Braces should go on separate lines I think.

 +TARGET_ATTRIBUTE_WEAK void *
 +__deregister_frame_info (__attribute__((unused)) const void *p)
 +{ return (void*) 0; }

   Certainly here.

 +#endif /* DWARF2_UNWIND_INFO */
 +
 +#if TARGET_USE_JCR_SECTION
 +extern void _Jv_RegisterClasses (__attribute__((unused)) const void *)
 +  TARGET_ATTRIBUTE_WEAK;
 +
 +TARGET_ATTRIBUTE_WEAK void
 +_Jv_RegisterClasses (__attribute__((unused)) const void *p)
 +{}
 +#endif /* TARGET_USE_JCR_SECTION */
 +
  #if defined(HAVE_LD_RO_RW_SECTION_MIXING)
  # define EH_FRAME_SECTION_CONST const
  #else

   Also, now that you've provided a default weak definition of the functions in
 the file itself, it's no longer possible for the function pointer variables
 (register_frame_fn, register_class_fn, deregister_frame_fn) to be zero, so you
 should remove the if () tests on them and just call them unconditionally.

Hmm, well in standard-case you are right.  But well there is still a
chance that GetProcAddress returns NULL-pointer ...  I think we should
keep the if-check.

 cheers,
   DaveK


Kai

[PATCH] Random cleanups

2013-04-09 Thread Richard Biener


from my local tree.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-04-09  Richard Biener  rguent...@suse.de

java/
* expr.c (build_java_binop): Pass a type to build_int_cst.

* tree-ssa-loop-manip.c (rewrite_into_loop_closed_ssa): Avoid
work that is not necessary.

Index: gcc/java/expr.c
===
--- gcc/java/expr.c (revision 197621)
+++ gcc/java/expr.c (working copy)
@@ -1531,7 +1531,7 @@ build_java_binop (enum tree_code op, tre
   }
 case LSHIFT_EXPR:
 case RSHIFT_EXPR:
-  mask = build_int_cst (NULL_TREE,
+  mask = build_int_cst (int_type_node,
TYPE_PRECISION (TREE_TYPE (arg1)) - 1);
   arg2 = fold_build2 (BIT_AND_EXPR, int_type_node, arg2, mask);
   break;
Index: gcc/tree-ssa-loop-manip.c
===
--- gcc/tree-ssa-loop-manip.c   (revision 197621)
+++ gcc/tree-ssa-loop-manip.c   (working copy)
@@ -489,7 +489,6 @@ find_uses_to_rename (bitmap changed_bbs,
 void
 rewrite_into_loop_closed_ssa (bitmap changed_bbs, unsigned update_flag)
 {
-  bitmap *loop_exits;
   bitmap *use_blocks;
   bitmap names_to_rename;
 
@@ -505,11 +504,6 @@ rewrite_into_loop_closed_ssa (bitmap cha
 
   names_to_rename = BITMAP_ALLOC (loop_renamer_obstack);
 
-  /* An array of bitmaps where LOOP_EXITS[I] is the set of basic blocks
- that are the destination of an edge exiting loop number I.  */
-  loop_exits = XNEWVEC (bitmap, number_of_loops ());
-  get_loops_exits (loop_exits);
-
   /* Uses of names to rename.  We don't have to initialize this array,
  because we know that we will only have entries for the SSA names
  in NAMES_TO_RENAME.  */
@@ -518,17 +512,26 @@ rewrite_into_loop_closed_ssa (bitmap cha
   /* Find the uses outside loops.  */
   find_uses_to_rename (changed_bbs, use_blocks, names_to_rename);
 
-  /* Add the PHI nodes on exits of the loops for the names we need to
- rewrite.  */
-  add_exit_phis (names_to_rename, use_blocks, loop_exits);
+  if (!bitmap_empty_p (names_to_rename))
+{
+  /* An array of bitmaps where LOOP_EXITS[I] is the set of basic blocks
+that are the destination of an edge exiting loop number I.  */
+  bitmap *loop_exits = XNEWVEC (bitmap, number_of_loops ());
+  get_loops_exits (loop_exits);
+
+  /* Add the PHI nodes on exits of the loops for the names we need to
+rewrite.  */
+  add_exit_phis (names_to_rename, use_blocks, loop_exits);
+
+  free (loop_exits);
+
+  /* Fix up all the names found to be used outside their original
+loops.  */
+  update_ssa (TODO_update_ssa);
+}
 
   bitmap_obstack_release (loop_renamer_obstack);
   free (use_blocks);
-  free (loop_exits);
-
-  /* Fix up all the names found to be used outside their original
- loops.  */
-  update_ssa (TODO_update_ssa);
 }
 
 /* Check invariants of the loop closed ssa form for the USE in BB.  */

[patch] print SEQUENCE of insns in sched-vis.c

2013-04-09 Thread Steven Bosscher

Hello,

sched-vis.c would print insns in a SEQUENCE as a pattern, in effect
printing only GET_CODE(insn). Fixed with this patch.

Bootstrappedtested on powerpc64-unknown-linux-gnu and
sparc64-unknown-linux-gnu with some dump inspections to make sure
everything looks as expected now.
OK for trunk?

Ciao!
Steven
* sched-vis.c (print_pattern): Print SEQUENCE of insns as insns.

Index: sched-vis.c
===
--- sched-vis.c (revision 197610)
+++ sched-vis.c (working copy)
@@ -51,6 +51,9 @@ along with GCC; see the file COPYING3.  If not see
 static bool rtl_slim_pp_initialized = false;
 static pretty_printer rtl_slim_pp;
 
+/* For insns we print patterns, and for some patterns we print insns...  */
+static void print_insn_with_notes (pretty_printer *, const_rtx);
+
 /* This recognizes rtx'en classified as expressions.  These are always
represent some action on values or results of other expression, that
may be stored in objects representing values.  */
@@ -562,14 +565,32 @@ print_pattern (pretty_printer *pp, const_rtx x, in
   break;
 case SEQUENCE:
   {
-   int i;
-
pp_string (pp, sequence{);
-   for (i = 0; i  XVECLEN (x, 0); i++)
+   if (INSN_P (XVECEXP (x, 0, 0)))
  {
-   print_pattern (pp, XVECEXP (x, 0, i), verbose);
-   pp_character (pp, ';');
+   /* Print the sequence insns indented.  */
+   const char * save_print_rtx_head = print_rtx_head;
+   char indented_print_rtx_head[32];
+
+   pp_newline (pp);
+   gcc_assert (strlen (print_rtx_head)  sizeof 
(indented_print_rtx_head) - 4);
+   snprintf (indented_print_rtx_head,
+ sizeof (indented_print_rtx_head),
+ %s , print_rtx_head);
+   print_rtx_head = indented_print_rtx_head;
+   for (int i = 0; i  XVECLEN (x, 0); i++)
+ print_insn_with_notes (pp, XVECEXP (x, 0, i));
+   pp_printf (pp, %s  , save_print_rtx_head);
+   print_rtx_head = save_print_rtx_head;
  }
+   else
+ {
+   for (int i = 0; i  XVECLEN (x, 0); i++)
+ {
+   print_pattern (pp, XVECEXP (x, 0, i), verbose);
+   pp_character (pp, ';');
+ }
+ }
pp_character (pp, '}');
   }
   break;

Re: [patch] print SEQUENCE of insns in sched-vis.c

2013-04-09 Thread Eric Botcazou

 Bootstrappedtested on powerpc64-unknown-linux-gnu and
 sparc64-unknown-linux-gnu with some dump inspections to make sure
 everything looks as expected now.
 OK for trunk?

Yes, thanks.

-- 
Eric Botcazou

Re: [PATCH] Fix PR48762

2013-04-09 Thread Marek Polacek

On Tue, Apr 09, 2013 at 11:45:04AM +0200, Eric Botcazou wrote:
  Alternatively, we can bump the minimum of that param, as usual ;)
 
 Let's do that and bump it to 1, my understanding is that 0 and 1 are 
 equivalent for this param.

Alright.  So ok to apply this one (trunk/4.8)?

2013-04-09  Marek Polacek  pola...@redhat.com

PR tree-optimization/48762
* params.def (PARAM_MAX_CSE_INSNS): Increase the minimum
value to 1.

--- gcc/params.def.mp   2013-04-09 12:52:07.838278693 +0200
+++ gcc/params.def  2013-04-09 12:52:28.705349278 +0200
@@ -451,7 +451,7 @@ DEFPARAM(PARAM_MAX_GOTO_DUPLICATION_INSN
 DEFPARAM(PARAM_MAX_CSE_PATH_LENGTH,
 max-cse-path-length,
 The maximum length of path considered in cse,
-10, 0, 0)
+10, 1, 0)
 DEFPARAM(PARAM_MAX_CSE_INSNS,
 max-cse-insns,
 The maximum instructions CSE process before flushing,

Marek

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Joern Rennecke


Quoting Richard Biener richard.guent...@gmail.com:


On Mon, Apr 8, 2013 at 5:10 PM, Joern Rennecke
joern.renne...@embecosm.com wrote:

This is basically the same patch as attached to the PR, except that I
have changed the goto-loop into a do-while loop with a new comment;
this caused the need for a lot of reformatting.


Can you please include a testcase that shows the effect of this?


Attached.
FWIW, the assembler is simpler to understand if you add the compilation
option -mfp-mode=round-nearest.
Without the patch, you then get:

_f:
fmul r1,r1,r3
mov r3, %low(#-2147483648)
movt r3, %high(#-2147483648)
eor r1,r3,r1
fmadd r1,r0,r2
mov r0,r1
rts

And with the patch:

_f:
fmul r0,r0,r2
fmsub r0,r1,r3
rts



2013-04-08  Joern Rennecke  joern.renne...@embecosm.com

* tree-ssa-math-opts.c (mult_to_fma_pass): New file static struct.
(convert_mult_to_fma): In first pass, don't use an fms construct
when we don't have an fms operation, but fmna.


it's fnma I believe.


Oops, yes, typo.  The patch itself has the right spelling.
2013-03-09  Joern Rennecke joern.renne...@embecosm.com

* gcc.target/epiphany/fnma-1.c: New test.

Index: gcc.target/epiphany/fnma-1.c
===
--- gcc.target/epiphany/fnma-1.c(revision 0)
+++ gcc.target/epiphany/fnma-1.c(working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+/* { dg-final { scan-assembler-times fmsub\[ \ta-zA-Z0-9\]*, 1 } } */
+
+float
+f (float ar, float ai, float br, float bi)
+{
+  return ar * br - ai * bi;
+}

[PATCH] More vectorizer TLC

2013-04-09 Thread Richard Biener


This gets rid of slp_void_p and moves vect_get_place_in_interleaving_chain
next to its only user.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-04-09  Richard Biener  rguent...@suse.de

* tree-vectorizer.h (slp_void_p): Remove.
(slp_tree): Typedef before _slp_tree declaration.
(struct _slp_tree): Use a vector of slp_tree as children.
(vect_get_place_in_interleaving_chain): Remove.
* tree-vect-data-refs.c (vect_get_place_in_interleaving_chain):
Move ...
* tree-vect-slp.c (vect_get_place_in_interleaving_chain): ... here
and make static.
(vect_free_slp_tree, vect_print_slp_tree, vect_mark_slp_stmts,
vect_mark_slp_stmts_relevant, vect_slp_rearrange_stmts,
vect_detect_hybrid_slp_stmts, vect_slp_analyze_node_operations,
vect_schedule_slp_instance, vect_remove_slp_scalar_calls):
Use slp_node instead of slp_void_p and adjust.

Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 197621)
--- gcc/tree-vect-data-refs.c   (working copy)
*** vect_get_smallest_scalar_type (gimple st
*** 129,159 
  }
  
  
- /* Find the place of the data-ref in STMT in the interleaving chain that 
starts
-from FIRST_STMT.  Return -1 if the data-ref is not a part of the chain.  */
- 
- int
- vect_get_place_in_interleaving_chain (gimple stmt, gimple first_stmt)
- {
-   gimple next_stmt = first_stmt;
-   int result = 0;
- 
-   if (first_stmt != GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)))
- return -1;
- 
-   while (next_stmt  next_stmt != stmt)
- {
-   result++;
-   next_stmt = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next_stmt));
- }
- 
-   if (next_stmt)
- return result;
-   else
- return -1;
- }
- 
- 
  /* Check if data references pointed by DR_I and DR_J are same or
 belong to same interleaving group.  Return FALSE if drs are
 different, otherwise return TRUE.  */
--- 129,134 
Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 197621)
--- gcc/tree-vect-slp.c (working copy)
*** static void
*** 67,79 
  vect_free_slp_tree (slp_tree node)
  {
int i;
!   slp_void_p child;
  
if (!node)
  return;
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! vect_free_slp_tree ((slp_tree) child);
  
SLP_TREE_CHILDREN (node).release ();
SLP_TREE_SCALAR_STMTS (node).release ();
--- 67,79 
  vect_free_slp_tree (slp_tree node)
  {
int i;
!   slp_tree child;
  
if (!node)
  return;
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! vect_free_slp_tree (child);
  
SLP_TREE_CHILDREN (node).release ();
SLP_TREE_SCALAR_STMTS (node).release ();
*** vect_free_oprnd_info (vecslp_oprnd_info
*** 168,173 
--- 168,198 
  }
  
  
+ /* Find the place of the data-ref in STMT in the interleaving chain that 
starts
+from FIRST_STMT.  Return -1 if the data-ref is not a part of the chain.  */
+ 
+ static int
+ vect_get_place_in_interleaving_chain (gimple stmt, gimple first_stmt)
+ {
+   gimple next_stmt = first_stmt;
+   int result = 0;
+ 
+   if (first_stmt != GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)))
+ return -1;
+ 
+   do
+ {
+   if (next_stmt == stmt)
+   return result;
+   result++;
+   next_stmt = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next_stmt));
+ }
+   while (next_stmt);
+ 
+   return -1;
+ }
+ 
+ 
  /* Get the defs for the rhs of STMT (collect them in OPRNDS_INFO), check that
 they are of a valid type and that they match the defs of the first stmt of
 the SLP group (stored in OPRNDS_INFO).  */
*** vect_print_slp_tree (int dump_kind, slp_
*** 991,997 
  {
int i;
gimple stmt;
!   slp_void_p child;
  
if (!node)
  return;
--- 1016,1022 
  {
int i;
gimple stmt;
!   slp_tree child;
  
if (!node)
  return;
*** vect_print_slp_tree (int dump_kind, slp_
*** 1005,1011 
dump_printf (dump_kind, \n);
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! vect_print_slp_tree (dump_kind, (slp_tree) child);
  }
  
  
--- 1030,1036 
dump_printf (dump_kind, \n);
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! vect_print_slp_tree (dump_kind, child);
  }
  
  
*** vect_mark_slp_stmts (slp_tree node, enum
*** 1019,1025 
  {
int i;
gimple stmt;
!   slp_void_p child;
  
if (!node)
  return;
--- 1044,1050 
  {
int i;
gimple stmt;
!   slp_tree child;
  
if (!node)
  return;
*** vect_mark_slp_stmts (slp_tree node, enum
*** 1029,1035 
STMT_SLP_TYPE (vinfo_for_stmt (stmt)) = mark;
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! vect_mark_slp_stmts ((slp_tree) child, mark, j);
  }

Re: Comments on the suggestion to use infinite precision math for wide int.

2013-04-09 Thread Robert Dewar


On 4/9/2013 5:39 AM, Florian Weimer wrote:

On 04/09/2013 01:47 AM, Robert Dewar wrote:

Well the back end has all the information to figure this out I think!
But anyway, for Ada, the current situation is just fine, and has
the advantage that the -gnatG expanded code listing clearly shows in
Ada source form, what is going on.


Isn't this a bit optimistic, considering that run-time overflow checking
currently does not use existing hardware support?


Not clear what you mean here, we don't rely on the back end for run-time
overflow checking. What is over-optimistic here?

BTW, existing hardware support can be a dubious thing, you have
to be careful to evaluate costs, for instance you don't want to
use INTO on modern x86 targets!

Re: [PATCH] Fix PR48762

2013-04-09 Thread Jeff Law


On 04/09/2013 05:13 AM, Marek Polacek wrote:

On Tue, Apr 09, 2013 at 11:45:04AM +0200, Eric Botcazou wrote:

Alternatively, we can bump the minimum of that param, as usual ;)


Let's do that and bump it to 1, my understanding is that 0 and 1 are
equivalent for this param.


Alright.  So ok to apply this one (trunk/4.8)?

2013-04-09  Marek Polacek  pola...@redhat.com

PR tree-optimization/48762
* params.def (PARAM_MAX_CSE_INSNS): Increase the minimum
value to 1.

OK.
Jeff

Re: [PATCH][Backport 4.7][ARM] Fix PR 56720

2013-04-09 Thread Richard Earnshaw


On 05/04/13 16:35, Kyrylo Tkachov wrote:

Hi all,

This patch is a backport of the fix for PR 56720 where we would ICE on
arm-*-* when trying to
expand vcond with a floating point unorderd comparison cases. The patch is
almost identical
to the trunk patch at:
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00652.html
except that it adds explicit handling of the LTGT case which does not
require explicit
handling in 4.8 and onwards.

PR 56720 itself has been fixed on trunk and 4.8 but is present in 4.7 and
4.6

Ok for 4.7 now or when it reopens?



Is this a regression?  If so, from what?

My feeling is that unless this has fixed a regression, it's sufficiently 
complex that it's not really appropriate to back-port it to a stable branch.


R.


Tested arm-none-eabi on qemu.

Thanks,
Kyrill

gcc/ChangeLog
2013-04-05  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR target/56720
* config/arm/iterators.md (v_cmp_result): New mode attribute.
* config/arm/neon.md (vcondmodemode): Handle unordered cases.


gcc/testsuite/ChangeLog
2013-04-05  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR target/56720
* gcc.target/arm/neon-vcond-gt.c: New test.
* gcc.target/arm/neon-vcond-ltgt.c: Likewise.
* gcc.target/arm/neon-vcond-unordered.c: Likewise.=


neon-vcond-4.7.patch

[PATCH] color diagnostics markers

2013-04-09 Thread Jakub Jelinek

On Mon, Apr 08, 2013 at 03:23:01PM +0200, Jakub Jelinek wrote:
 Anyway, I've kept the default as never for now.

Here is an updated patch with %r/%R, fully bootstrapped/regtested now
on x86_64-linux and i686-linux, tested on a couple of small testcases
with -fdiagnostics-color{,=never,=auto,=always}, without GCC_COLORS
in environment, with GCC_COLORS= or with
GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01;33:quote=01;34'
and both on my default gnome-terminal settings (Linux console colors with
white on black) and on default configured xterm (i.e. xterm colors, black on
white).  The default is still -fdiagnostics-color=never, can be changed
later on.

Ok for trunk?

2013-04-09  Manuel López-Ibáñez  m...@gcc.gnu.org
Jakub Jelinek  ja...@redhat.com

* opts.c: Include diagnostic-color.h.
(common_handle_option): Handle OPT_fdiagnostics_color_.
* Makefile.in (OBJS-libcommon): Add diagnostic-color.o.
(diagnostic.o, opts.o, pretty-print.o): Depend on diagnostic-color.h.
(diagnostic-color.o): New.
* common.opt (fdiagnostics-color, fdiagnostics-color=): New options.
(diagnostic_color_rule): New enum.
* dwarf2out.c (gen_producer_string): Don't print -fdiagnostics-color*.
* langhooks.c (lhd_print_error_function): Add %r locus and %R around
the location string.
* diagnostic.def: Add 3rd argument to DEFINE_DIAGNOSTIC_KIND macros,
either NULL, or color kind.
* diagnostic-color.c: New file.
* diagnostic-color.h: New file.
* diagnostic-core.h (DEFINE_DIAGNOSTIC_KIND): Adjust macro for 3
arguments.
* doc/invoke.texi (-fdiagnostics-color): Document.
* pretty-print.h (pp_show_color): Define.
(struct pretty_print_info): Add show_color field.
* diagnostic.c: Include diagnostic-color.h.
(diagnostic_build_prefix): Adjust for 3 argument DEFINE_DIAGNOSTIC_KIND
macros.  Colorize error:, warning: etc. strings and also the location
string.
(diagnostic_show_locus): Colorize the caret line.
* pretty-print.c: Include diagnostic-color.h.
(pp_base_format): Handle %r and %R format specifiers.  Colorize strings
inside of % % quotes or quoted through q format modifier.
c-family/
* c-format.c (gcc_diag_char_table, gcc_tdiag_char_table,
gcc_cdiag_char_table, gcc_cxxdiag_char_table): Add %r and %R format
specifiers.
cp/
* error.c (cp_print_error_function,
print_instantiation_partial_context_line,
maybe_print_constexpr_context): Colorize locus strings.

--- gcc/opts.c.jj   2013-04-09 08:34:52.577866059 +0200
+++ gcc/opts.c  2013-04-09 08:35:02.879844677 +0200
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
 #include flags.h
 #include params.h
 #include diagnostic.h
+#include diagnostic-color.h
 #include opts-diagnostic.h
 #include insn-attr-common.h
 #include common/common-target.h
@@ -1497,6 +1498,11 @@ common_handle_option (struct gcc_options
   dc-show_caret = value;
   break;
 
+case OPT_fdiagnostics_color_:
+  pp_show_color (dc-printer)
+   = colorize_init ((diagnostic_color_rule_t) value);
+  break;
+
 case OPT_fdiagnostics_show_option:
   dc-show_option_requested = value;
   break;
--- gcc/Makefile.in.jj  2013-04-09 08:34:52.554866101 +0200
+++ gcc/Makefile.in 2013-04-09 08:35:02.881844586 +0200
@@ -1465,7 +1465,7 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o 
version.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
@@ -2668,11 +2668,12 @@ fold-const.o : fold-const.c $(CONFIG_H)
$(GIMPLE_H) realmpfr.h $(TREE_FLOW_H)
 diagnostic.o : diagnostic.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
version.h $(DEMANGLE_H) $(INPUT_H) intl.h $(BACKTRACE_H) $(DIAGNOSTIC_H) \
-   diagnostic.def
+   diagnostic.def diagnostic-color.h
+diagnostic-color.o : diagnostic-color.c $(CONFIG_H) $(SYSTEM_H) 
diagnostic-color.h
 opts.o : opts.c $(OPTS_H) $(OPTIONS_H) $(DIAGNOSTIC_CORE_H) $(CONFIG_H) 
$(SYSTEM_H) \
coretypes.h $(DUMPFILE_H) $(TM_H) \
$(DIAGNOSTIC_H) insn-attr-common.h intl.h $(COMMON_TARGET_H) \
-   $(FLAGS_H) $(PARAMS_H) opts-diagnostic.h
+   $(FLAGS_H) $(PARAMS_H) opts-diagnostic.h diagnostic-color.h
 opts-global.o : opts-global.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(DIAGNOSTIC_H) $(OPTS_H) $(FLAGS_H) $(GGC_H) $(TREE_H) langhooks.h \
$(TM_H) $(RTL_H) $(DBGCNT_H) debug.h $(LTO_STREAMER_H) output.h \
@@ -3434,7 +3435,8 @@ params.o : params.c $(CONFIG_H) $(SYSTEM
$(PARAMS_H) $(DIAGNOSTIC_CORE_H)
 pointer-set.o: pointer-set.c pointer-set.h $(CONFIG_H) $(SYSTEM_H)
 hooks.o: hooks.c $(CONFIG_H)

RE: [PATCH][Backport 4.7][ARM] Fix PR 56720

2013-04-09 Thread Kyrylo Tkachov

 -Original Message-
 From: Richard Earnshaw
 Sent: 09 April 2013 14:12
 To: Kyrylo Tkachov
 Cc: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan; 'Jakub Jelinek'
 Subject: Re: [PATCH][Backport 4.7][ARM] Fix PR 56720

 On 05/04/13 16:35, Kyrylo Tkachov wrote:
  Hi all,

  This patch is a backport of the fix for PR 56720 where we would ICE
 on
  arm-*-* when trying to
  expand vcond with a floating point unorderd comparison cases. The
 patch is
  almost identical
  to the trunk patch at:
  http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00652.html
  except that it adds explicit handling of the LTGT case which does not
  require explicit
  handling in 4.8 and onwards.

  PR 56720 itself has been fixed on trunk and 4.8 but is present in 4.7
 and
  4.6

  Ok for 4.7 now or when it reopens?

 Is this a regression?  If so, from what?

I've been able to reproduce it in 4.6 and onwards but have not tried any
earlier releases.

 My feeling is that unless this has fixed a regression, it's
 sufficiently
 complex that it's not really appropriate to back-port it to a stable
 branch.

Fair enough, I don't think there's any urgency to get this fixed on earlier
releases.
We can drop this backport then.

Thanks,
Kyrill

 R.

  Tested arm-none-eabi on qemu.

  Thanks,
  Kyrill

  gcc/ChangeLog
  2013-04-05  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/56720
  * config/arm/iterators.md (v_cmp_result): New mode attribute.
  * config/arm/neon.md (vcondmodemode): Handle unordered cases.

  gcc/testsuite/ChangeLog
  2013-04-05  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/56720
  * gcc.target/arm/neon-vcond-gt.c: New test.
  * gcc.target/arm/neon-vcond-ltgt.c: Likewise.
  * gcc.target/arm/neon-vcond-unordered.c: Likewise.=

  neon-vcond-4.7.patch

[PATCH] Fix omp loop ICEs if iterators are addressable

2013-04-09 Thread Jakub Jelinek

Hi!

The omp for loop iterator can be addressable e.g. because of shared use of
it in a 2x nested parallel (in which case we can't use copy-in/out).

The following patch tweaks expand_omp_for* to allow the iterators to be
addressable.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
Will backport to 4.8 soon (and to 4.7 after 4.7.3 is released).

2013-04-09  Jakub Jelinek  ja...@redhat.com

PR middle-end/56883
* omp-low.c (expand_omp_for_generic, expand_omp_for_static_nochunk,
expand_omp_for_static_chunk): Use simple_p = true in
force_gimple_operand_gsi calls when assigning to addressable decls.

* c-c++-common/gomp/pr56883.c: New test.

--- gcc/omp-low.c.jj2013-02-07 08:59:48.0 +0100
+++ gcc/omp-low.c   2013-04-09 11:09:42.197181925 +0200
@@ -3920,8 +3920,10 @@ expand_omp_for_generic (struct omp_regio
   if (POINTER_TYPE_P (type))
 t = fold_convert (signed_type_for (type), t);
   t = fold_convert (type, t);
-  t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
-   false, GSI_CONTINUE_LINKING);
+  t = force_gimple_operand_gsi (gsi, t,
+   DECL_P (fd-loop.v)
+TREE_ADDRESSABLE (fd-loop.v),
+   NULL_TREE, false, GSI_CONTINUE_LINKING);
   stmt = gimple_build_assign (fd-loop.v, t);
   gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
 
@@ -3952,8 +3954,11 @@ expand_omp_for_generic (struct omp_regio
t = fold_build_pointer_plus (fd-loops[i].n1, t);
  else
t = fold_build2 (PLUS_EXPR, itype, fd-loops[i].n1, t);
- t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
-   false, GSI_CONTINUE_LINKING);
+ t = force_gimple_operand_gsi (gsi, t,
+   DECL_P (fd-loops[i].v)
+TREE_ADDRESSABLE (fd-loops[i].v),
+   NULL_TREE, false,
+   GSI_CONTINUE_LINKING);
  stmt = gimple_build_assign (fd-loops[i].v, t);
  gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
  if (i != 0)
@@ -3981,12 +3986,15 @@ expand_omp_for_generic (struct omp_regio
t = fold_build_pointer_plus (vmain, fd-loop.step);
   else
t = fold_build2 (PLUS_EXPR, type, vmain, fd-loop.step);
-  t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
-   true, GSI_SAME_STMT);
+  t = force_gimple_operand_gsi (gsi, t,
+   DECL_P (vback)  TREE_ADDRESSABLE (vback),
+   NULL_TREE, true, GSI_SAME_STMT);
   stmt = gimple_build_assign (vback, t);
   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
 
-  t = build2 (fd-loop.cond_code, boolean_type_node, vback, iend);
+  t = build2 (fd-loop.cond_code, boolean_type_node,
+ DECL_P (vback)  TREE_ADDRESSABLE (vback) ? t : vback,
+ iend);
   stmt = gimple_build_cond_empty (t);
   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
 
@@ -4011,8 +4019,12 @@ expand_omp_for_generic (struct omp_regio
  e-probability = REG_BR_PROB_BASE / 8;
 
  t = fd-loops[i + 1].n1;
- t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
-   false, GSI_CONTINUE_LINKING);
+ t = force_gimple_operand_gsi (gsi, t,
+   DECL_P (fd-loops[i + 1].v)
+TREE_ADDRESSABLE
+   (fd-loops[i + 1].v),
+   NULL_TREE, false,
+   GSI_CONTINUE_LINKING);
  stmt = gimple_build_assign (fd-loops[i + 1].v, t);
  gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
}
@@ -4026,8 +4038,11 @@ expand_omp_for_generic (struct omp_regio
  else
t = fold_build2 (PLUS_EXPR, vtype, fd-loops[i].v,
 fd-loops[i].step);
- t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
-   false, GSI_CONTINUE_LINKING);
+ t = force_gimple_operand_gsi (gsi, t,
+   DECL_P (fd-loops[i].v)
+TREE_ADDRESSABLE 
(fd-loops[i].v),
+   NULL_TREE, false,
+   GSI_CONTINUE_LINKING);
  stmt = gimple_build_assign (fd-loops[i].v, t);
  gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING);
 
@@ -4036,8 +4051,12 @@ expand_omp_for_generic (struct omp_regio
  t = fd-loops[i].n2;
  t =

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 1:25 PM, Joern Rennecke
joern.renne...@embecosm.com wrote:
 Quoting Richard Biener richard.guent...@gmail.com:

 On Mon, Apr 8, 2013 at 5:10 PM, Joern Rennecke
 joern.renne...@embecosm.com wrote:

 This is basically the same patch as attached to the PR, except that I
 have changed the goto-loop into a do-while loop with a new comment;
 this caused the need for a lot of reformatting.


 Can you please include a testcase that shows the effect of this?


 Attached.
 FWIW, the assembler is simpler to understand if you add the compilation
 option -mfp-mode=round-nearest.
 Without the patch, you then get:

 _f:
 fmul r1,r1,r3
 mov r3, %low(#-2147483648)
 movt r3, %high(#-2147483648)
 eor r1,r3,r1
 fmadd r1,r0,r2
 mov r0,r1
 rts

 And with the patch:

 _f:
 fmul r0,r0,r2
 fmsub r0,r1,r3
 rts

This means the iteration doesn't help here.



 2013-04-08  Joern Rennecke  joern.renne...@embecosm.com

 * tree-ssa-math-opts.c (mult_to_fma_pass): New file static
 struct.
 (convert_mult_to_fma): In first pass, don't use an fms construct
 when we don't have an fms operation, but fmna.


 it's fnma I believe.


 Oops, yes, typo.  The patch itself has the right spelling.

 2013-03-09  Joern Rennecke joern.renne...@embecosm.com

 * gcc.target/epiphany/fnma-1.c: New test.

 Index: gcc.target/epiphany/fnma-1.c
 ===
 --- gcc.target/epiphany/fnma-1.c(revision 0)
 +++ gcc.target/epiphany/fnma-1.c(working copy)
 @@ -0,0 +1,9 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 } */
 +/* { dg-final { scan-assembler-times fmsub\[ \ta-zA-Z0-9\]*, 1 } } */
 +
 +float
 +f (float ar, float ai, float br, float bi)
 +{
 +  return ar * br - ai * bi;
 +}

Re: Comments on the suggestion to use infinite precision math for wide int.

2013-04-09 Thread Florian Weimer


On 04/09/2013 02:41 PM, Robert Dewar wrote:

On 4/9/2013 5:39 AM, Florian Weimer wrote:

On 04/09/2013 01:47 AM, Robert Dewar wrote:

Well the back end has all the information to figure this out I think!
But anyway, for Ada, the current situation is just fine, and has
the advantage that the -gnatG expanded code listing clearly shows in
Ada source form, what is going on.


Isn't this a bit optimistic, considering that run-time overflow checking
currently does not use existing hardware support?


Not clear what you mean here, we don't rely on the back end for run-time
overflow checking. What is over-optimistic here?

BTW, existing hardware support can be a dubious thing, you have
to be careful to evaluate costs, for instance you don't want to
use INTO on modern x86 targets!


It's not aobut INTO, just access to the overflow flag.

A simple function which adds its two Integer arguments compiles to this 
machine code (with -gnato -O2):


.cfi_startproc
movslq  %esi, %rax
movslq  %edi, %rdx
addq%rax, %rdx
movl$2147483648, %eax
addq%rax, %rdx
movl$4294967295, %eax
cmpq%rax, %rdx
ja  .L5
leal(%rsi,%rdi), %eax
ret
.L5:
pushq   %rax
.cfi_def_cfa_offset 16
movl$3, %esi
movl$.LC0, %edi
xorl%eax, %eax
call__gnat_rcheck_10
.cfi_endproc

While it could be like this:

.cfi_startproc
movl%edi, %eax
addl%esi, %eax
jo  .L5
ret
.L5:
pushq   %rax
.cfi_def_cfa_offset 16
movl$3, %esi
movl$.LC0, %edi
xorl%eax, %eax
call__gnat_rcheck_10
.cfi_endproc

Admittedly, I haven't benchmarked it, but at least from the code size 
aspect, it's a significant improvement.


--
Florian Weimer / Red Hat Product Security Team

[PATCH] Vectorizer TLC, split SLP discovery and cost calculation

2013-04-09 Thread Richard Biener


This splits discovering of a SLP opportunity (vect_build_slp_tree)
and calculating the cost of a SLP instance (now vect_analyze_slp_cost).
The immediate need for me is to support handling of mismatches
in the SLP tree due to commutative operand order mismatch, one more
cleanup patch for this reason will follow.  In the end cost calculation
should be delayed even more, but that's for later.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-04-09  Richard Biener  rguent...@suse.de

* tree-vect-slp.c (vect_get_and_check_slp_defs): Remove code
dealing with cost.
(vect_build_slp_tree): Likewise.
(vect_analyze_slp_cost_1, vect_analyze_slp_cost): New functions
calculating the cost of a SLP instance.
(vect_analyze_slp_instance): Use it from here, after building
the SLP tree.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 197629)
+++ gcc/tree-vect-slp.c (working copy)
@@ -199,11 +199,8 @@ vect_get_place_in_interleaving_chain (gi
 
 static bool
 vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
- slp_tree slp_node, gimple stmt,
-int ncopies_for_cost, bool first,
- vecslp_oprnd_info *oprnds_info,
-stmt_vector_for_cost *prologue_cost_vec,
-stmt_vector_for_cost *body_cost_vec)
+ gimple stmt, bool first,
+ vecslp_oprnd_info *oprnds_info)
 {
   tree oprnd;
   unsigned int i, number_of_oprnds;
@@ -211,8 +208,6 @@ vect_get_and_check_slp_defs (loop_vec_in
   gimple def_stmt;
   enum vect_def_type dt = vect_uninitialized_def;
   enum vect_def_type dt_op0 = vect_uninitialized_def;
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  tree lhs = gimple_get_lhs (stmt);
   struct loop *loop = NULL;
   enum tree_code rhs_code;
   bool different_types = false;
@@ -344,22 +339,6 @@ vect_get_and_check_slp_defs (loop_vec_in
{
  def_op0 = def;
  dt_op0 = dt;
- /* Analyze costs (for the first stmt of the group only).  */
- if (REFERENCE_CLASS_P (lhs))
-   /* Store.  */
-vect_model_store_cost (stmt_info, ncopies_for_cost, false,
-  dt, slp_node, prologue_cost_vec,
-  body_cost_vec);
- else
-   {
- enum vect_def_type dts[2];
- dts[0] = dt;
- dts[1] = vect_uninitialized_def;
- /* Not memory operation (we don't call this function for
-loads).  */
- vect_model_simple_cost (stmt_info, ncopies_for_cost, dts,
- prologue_cost_vec, body_cost_vec);
-   }
}
}
   else
@@ -479,13 +458,11 @@ vect_get_and_check_slp_defs (loop_vec_in
 
 static bool
 vect_build_slp_tree (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
- slp_tree *node, unsigned int group_size, int 
*outside_cost,
- int ncopies_for_cost, unsigned int *max_nunits,
+ slp_tree *node, unsigned int group_size,
+ unsigned int *max_nunits,
  vecint *load_permutation,
  vecslp_tree *loads,
- unsigned int vectorization_factor, bool *loads_permuted,
-stmt_vector_for_cost *prologue_cost_vec,
-stmt_vector_for_cost *body_cost_vec)
+ unsigned int vectorization_factor, bool *loads_permuted)
 {
   unsigned int i;
   vecgimple stmts = SLP_TREE_SCALAR_STMTS (*node);
@@ -750,11 +727,8 @@ vect_build_slp_tree (loop_vec_info loop_
  if (REFERENCE_CLASS_P (lhs))
{
  /* Store.  */
- if (!vect_get_and_check_slp_defs (loop_vinfo, bb_vinfo, *node,
-   stmt, ncopies_for_cost,
-   (i == 0), oprnds_info,
-   prologue_cost_vec,
-   body_cost_vec))
+ if (!vect_get_and_check_slp_defs (loop_vinfo, bb_vinfo,
+   stmt, (i == 0), oprnds_info))
{
  vect_free_oprnd_info (oprnds_info);
  return false;
@@ -863,11 +837,6 @@ vect_build_slp_tree (loop_vec_info loop_
  vect_free_oprnd_info (oprnds_info);
   return false;
 }
-
-  /* Analyze costs (for the first stmt in the group).  */
-  vect_model_load_cost (vinfo_for_stmt (stmt),
-ncopies_for_cost,

[PATCH SH] Error: unaligned opcodes detected in executable segment

2013-04-09 Thread Christian Bruel

Hello,

This patch fixes label alignments after a ADDR_DIFF_VEC with byte
offsets. The bug occurs with building the libgcc for a sh-elf target.

See a reduced case here. The funny thing is that the assembly error
given in Subject appears only on a debug section when compiled with -g,
thus the emitted code without -g:

 .align 2
.L4:
.byte   .L3-.L5
.byte   .L10-.L5
.byte   .L7-.L5
.byte   .L8-.L5
.byte   .L9-.L5
.L3:
 mov.l   .L13,r1

assembles silently, although unaligned...

OK for trunk ? The failure occurs with the libgcc, so tested by allowing
the sh-elf build to complete.

Thanks,

Christian


2013-04-09  Christian Bruel  christian.br...@st.com

	* config/sh/sh.md (barrier_align): Use next/prev_active_insn instead
	of next/prev_real_insn.

Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 197633)
+++ gcc/config/sh/sh.c	(working copy)
@@ -5842,7 +5842,7 @@ fixup_addr_diff_vecs (rtx first)
 int
 barrier_align (rtx barrier_or_label)
 {
-  rtx next = next_real_insn (barrier_or_label), pat, prev;
+  rtx next = next_active_insn (barrier_or_label), pat, prev;
 
   if (! next)
 return 0;
@@ -5856,7 +5856,7 @@ barrier_align (rtx barrier_or_label)
 /* This is a barrier in front of a constant table.  */
 return 0;
 
-  prev = prev_real_insn (barrier_or_label);
+  prev = prev_active_insn (barrier_or_label);
   if (GET_CODE (PATTERN (prev)) == ADDR_DIFF_VEC)
 {
   pat = PATTERN (prev);
int a;

int
foo (int sw)
{
  int r=0;

  switch (sw)
{
case 1:
  r=a;
  break;
case 2:
  r=2;
  break;
case 3:
  r=3;
  break;
case 4:
  r=4;
  break;
case 5:
  r=5;
  break;
}

  return r;

}

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Joern Rennecke


Quoting Richard Biener richard.guent...@gmail.com:

Oops, I missed your final comment when I wrote my first reply.


I fail to see why you need two passes for this rather than considering the
case that the immediate use stmt of the multiplication we start from
combines another multiplication with a MINUS_EXPR.  That is ...


+  if (use_code == MINUS_EXPR  !negate_p
+  gimple_assign_rhs1 (use_stmt) == result
+  optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing
+  optab_handler (fnma_optab, TYPE_MODE (type)) !=
CODE_FOR_nothing
+  mult_to_fma_pass.second_pass == false)


see if that is the case here and simply not do the transform.


In other words, I would have to take gimple_assign_rh2 of use_stmt, then
find the statement that sets that value, check if this is a multiply,
and that it is not expanded as a widening multiply (thus duplicating most of
the code in execute_optimize_widening_mul), and also check that if there are
any other uses of the multiply result, that these are likewise reduced to
fma somehow.  At which point we hit recursion.  This could easily double
the size of this source file.  Also, we are alternating between forward
and backward searches, which could lead to quadratic complexity where
we had linear otherwise.


A second pass
will not recover from that without destroying the fnma pattern (testcase?)


The second pass is there to generate the fms where we didn't generate fnma.
If fnma (as negate (fma)) has been generated, it will use up the MINUS_EXPR,
so that fms cannot be formed anymore.
This arrangement is a lot simpler than going for a depth-first decision.

Re: [PATCH] color diagnostics markers

2013-04-09 Thread Jason Merrill


Looks good to me if nobody else has comments.

Jason

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Richard Biener

On Tue, Apr 9, 2013 at 4:53 PM, Joern Rennecke
joern.renne...@embecosm.com wrote:
 Quoting Richard Biener richard.guent...@gmail.com:

 Oops, I missed your final comment when I wrote my first reply.


 I fail to see why you need two passes for this rather than considering the
 case that the immediate use stmt of the multiplication we start from
 combines another multiplication with a MINUS_EXPR.  That is ...

 +  if (use_code == MINUS_EXPR  !negate_p
 +  gimple_assign_rhs1 (use_stmt) == result
 +  optab_handler (fms_optab, TYPE_MODE (type)) ==
 CODE_FOR_nothing
 +  optab_handler (fnma_optab, TYPE_MODE (type)) !=
 CODE_FOR_nothing
 +  mult_to_fma_pass.second_pass == false)


 see if that is the case here and simply not do the transform.


 In other words, I would have to take gimple_assign_rh2 of use_stmt, then
 find the statement that sets that value, check if this is a multiply,
 and that it is not expanded as a widening multiply (thus duplicating most of
 the code in execute_optimize_widening_mul), and also check that if there are
 any other uses of the multiply result, that these are likewise reduced to
 fma somehow.  At which point we hit recursion.  This could easily double
 the size of this source file.  Also, we are alternating between forward
 and backward searches, which could lead to quadratic complexity where
 we had linear otherwise.

I don't see that.  It's merely a complication of optimal handling of
a * b +- c * d vs. just a * b +- c.  The pass does simple pattern matching
only, not doing a global optimal transform, so adding another special-case
is reasonable.  Special-casing just for single-use 2nd multiplication
simplifies the cases for example.

Richard.

 A second pass
 will not recover from that without destroying the fnma pattern (testcase?)


 The second pass is there to generate the fms where we didn't generate fnma.
 If fnma (as negate (fma)) has been generated, it will use up the MINUS_EXPR,
 so that fms cannot be formed anymore.
 This arrangement is a lot simpler than going for a depth-first decision.

Re: [PATCH v3]IPA: fixing inline fail report caused by overwritable functions.

2013-04-09 Thread Zhouyi Zhou

Hi Richard,
I do not have write access to GCC SVN repository, can you commit it for me?

Thanks alot

Cheers


On Tue, Apr 9, 2013 at 5:04 PM, Richard Biener
richard.guent...@gmail.com wrote:

 On Tue, Apr 9, 2013 at 5:40 AM, Zhouyi Zhou zhouzho...@gmail.com wrote:
  On Mon, Apr 8, 2013 at 5:48 PM, Richard Biener richard.guent...@gmail.com 
  wrote:
 Can you trigger this message to show up with -Winline before/after the 
 patch?
 Can you please add a testcase then?
  Thanks Richard for reviewing, from my point of view about gcc and my 
  invoking of gcc, -Winline
  only works on callees that be declared inline, but if the callee is 
  declared
  inline, it will be AVAIL_AVAILABLE in function can_inline_edge_p, thus 
  out of the
  range of my patch.

 Ah, indeed ...

  So I only add a testcase for fixing the tree dump, are there any thing more 
  I can do?

 No.  Patch is ok.

 Thanks,
 Richard.

  Regtested/bootstrapped on x86_64-linux
 
  ChangeLog:
  2013-04-08 Zhouyi Zhou yizhouz...@ict.ac.cn
 * cif-code.def (OVERWRITABLE): correct the comment for 
  overwritable
  function
 * ipa-inline.c (can_inline_edge_p): let dump mechanism report 
  the inline
 fail caused by overwritable functions.
 * gcc.dg/tree-ssa/inline-11.c: New test
 
  Index: gcc/cif-code.def
  ===
  --- gcc/cif-code.def(revision 197549)
  +++ gcc/cif-code.def(working copy)
  @@ -48,7 +48,7 @@ DEFCIFCODE(REDEFINED_EXTERN_INLINE,
   /* Function is not inlinable.  */
   DEFCIFCODE(FUNCTION_NOT_INLINABLE, N_(function not inlinable))
 
  -/* Function is not overwritable.  */
  +/* Function is overwritable.  */
   DEFCIFCODE(OVERWRITABLE, N_(function body can be overwritten at link 
  time))
 
   /* Function is not an inlining candidate.  */
  Index: gcc/testsuite/gcc.dg/tree-ssa/inline-11.c
  ===
  --- gcc/testsuite/gcc.dg/tree-ssa/inline-11.c   (revision 0)
  +++ gcc/testsuite/gcc.dg/tree-ssa/inline-11.c   (working copy)
  @@ -0,0 +1,13 @@
  +/* { dg-do compile } */
  +/* { dg-options -O2 -fdump-tree-einline } */
  +int w;
  +int bar (void) __attribute__ ((weak));
  +int bar (){
  +  w++;
  +}
  +void foo()
  +{
  +  bar();
  +}
  +/* { dg-final { scan-tree-dump-times function body can be overwritten at 
  link time 1 einline } } */
  +/* { dg-final { cleanup-tree-dump einline } } */
  Index: gcc/ipa-inline.c
  ===
  --- gcc/ipa-inline.c(revision 197549)
  +++ gcc/ipa-inline.c(working copy)
  @@ -266,7 +266,7 @@ can_inline_edge_p (struct cgraph_edge *e
 else if (avail = AVAIL_OVERWRITABLE)
   {
 e-inline_failed = CIF_OVERWRITABLE;
  -  return false;
  +  inlinable = false;
   }
 else if (e-call_stmt_cannot_inline_p)
   {

Re: useless cast blocking some optimization in gcc 4.7.3

2013-04-09 Thread Laurent Alfonsi


I understand, Thanks for your answer.
Looking at the standard, I was thinking the example you pointed was 
undefined.
I have created a bugzilla 
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56894) to track this big 
performance regression.


My investigations pointed on the scev_const_prop optimization.
The scalar evolution of result variable in the loop cannot be computed 
because of the cast :


type integer_type 0xf7de32a0 char sizes-gimplified public string-flag
type_6 QI[...]
arg 0 polynomial_chrec 0xf7ecd180
type integer_type 0xf7de33c0 int sizes-gimplified public type_6
SI[...]
arg 0 integer_cst 0xf7dcbe70 constant 2
arg 1 integer_cst 0xf7dcba80 constant 0
arg 2 integer_cst 0xf7dcba9c constant 1

I don't know how to fix the case. But I believe it should be looked at, 
as it is a quite big perf regression in some testcases.

please advice.

Also, I didn't find the bugzilla that led to this change of the += 
operation. I would be interested to have a look at it, if you can find it.


Thanks
Laurent

On 09.04.2013 10:50, Richard Biener wrote:

On Mon, Apr 8, 2013 at 9:13 PM, Laurent Alfonsi laurent.alfo...@st.com wrote:

 Hello,

I have identified a big performance regression between 4.6 and 4.7. (I have
enclosed a pathological test).

After investigation, it is because of the += statement applied on 2 signed
chars.
   - It is now type-promoted to int when it is written result += foo().
(since 4.7)
   - it is type promoted to unsigned char when it is written result =
result + foo().

The char-int-char cast is blocking some optimizations in later phases.
Anyway, this doesn't look wrong, so I extended fold optimization in order to
catch this case. (patch enclosed)
The patch basically transforms :
 (TypeA)  ( (TypeB) a1 + (TypeB) a2 )/* with a1 and a2 of
the signed type TypeA */
into :
 a1 + a2

I believe this is legal for any licit a1/a2 input values (no overflow on
signed char).
No new failure on the two tested targets : sh-superh-elf and
x86_64-unknown-linux-gnu.
Should I enter a bugzilla to track this ? Is it ok for trunk ?

Please open a bugzilla.  No, the patch is not ok, as said by Marc.
It's possible to shorten the operation by performing it using
unsigned arithmetic, that is (signed)((unsigned)a1 + (unsigned)a2).
But if really casts cause the issue you are seeing that will not
help you.

Richard.


2013-04-08  Laurent Alfonsi  laurent.alfo...@st.com

* fold-const.c (fold_unary_loc): Suppress useless type promotion.


Thanks,
Laurent


.

Re: [PATCH][ARM] minmax_arithsi for non-canonical operand order with MINUS operator

2013-04-09 Thread Richard Earnshaw

On 02/04/13 17:06, Kyrylo Tkachov wrote:

From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
Sent: 02 April 2013 11:10
To: Kyrylo Tkachov
Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
Subject: Re: [PATCH][ARM] minmax_arithsi for non-canonical operand
order with MINUS operator

On Thu, Mar 21, 2013 at 6:09 PM, Kyrylo Tkachov
kyrylo.tkac...@arm.com wrote:

Hi all,

This patch adds a splitter variant of the minmax_arithsi pattern for

when

the operator
is non-commutative (MINUS) and the ordering of the operands is not
canonical.

That is, it will trigger for:
#define MAX(a, b) (a  b ? a : b)
int
foo (int a, int b, int c)
{
   return c - MAX (a,b);
}

and will generate:
  cmp r1, r0
  rsbge   r0, r1, r2
  rsblt   r0, r0, r2

instead of the current:
 cmp r0, r1
 movlt   r0, r1
 rsb r0, r0, r2

No regressions on arm-none-eabi.

Ok for trunk?

Split after reload please into cond-exec or use if_then_else instead
if you are splitting before reload - I originally thought it to be
safe when you asked me, but then have gone back and corrected myself.

Read this thread . http://patches.linaro.org/6469/ .

Hi Ramana, thanks for the review.
How about this? We split after reload now.
Using if_then_else got me a lot of unrecognisable instruction ICEs and
delaying the split till after reload seemed like a cleaner approach.

Tested arm-none-eabi on qemu.

gcc/
2013-04-02  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.md (minmax_arithsi_non_canon): New pattern.

gcc/testsuite
2013-04-02  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/arm/minmax_minus.c: New test.

OK.

R.

Re: [PATCH,ARM][6/n] Split min and max patterns

2013-04-09 Thread Richard Earnshaw


On 18/02/13 18:44, Greta Yorsh wrote:

Convert define_insn into define_insn_and_split for various min and max
patterns that output multiple assembly instructions. Use movsicc to emit
RTL. A separate patch will split movsicc.

gcc/

2013-02-14  Greta Yorsh  greta.yo...@arm.com

 * config/arm/arm.md (arm_smax_insn): Convert define_insn into
   define_insn_and_split.
   (arm_smin_insn,arm_umaxsi3,arm_uminsi3): Likewise.




OK.

R.

Re: [PATCH,ARM][3/n] Split various patterns

2013-04-09 Thread Richard Earnshaw


On 18/02/13 18:38, Greta Yorsh wrote:

Convert define_insn into define_insn_and_split for various patterns that
output multiple assembly instructions.

It appears that preparation statements in define_insn_and_split sometimes
are called with which_alternative set to -1 even after reload. Therefore,
preparation statements use conditions on the operands instead of
which_alternative.

gcc/

2013-02-14  Greta Yorsh  greta.yo...@arm.com

   * config/arm/arm.md (andsi_iorsi3_notsi): Convert define_insn into
   define_insn_and_split.
   (arm_negdi2,arm_abssi2,arm_neg_abssi2): Likewise.
   (arm_cmpdi_insn,arm_cmpdi_unsigned): Likewise.



OK.

R.

Re: Fill more delay slots in conditional returns

2013-04-09 Thread Eric Botcazou

 Ah, OK, yea, it makes perfect sense now.  Approved.

Thanks.

 If you wanted to get ambitious, rewriting reorg.c to compute and use
 dependency links to drive its candidate selection instead of the insane
 tracking code it uses would be a huge step forward, both in terms of
 cleanliness.  It'd also help compile-time performance; we do a lot of
 work to track resources that would be totally unnecessary.

Yes, I agree that it's quite convoluted but, as Steven already said, rewriting 
it to use a more modern framework is hard because of SEQUENCEs and, frankly, I 
have neither the real incentive nor the time to start such an endeavor.

Instead I can raise the following couple of points that I also ran into with 
the private port I'm working on:

 1. When fill_simple_delay_slots is scanning backwards to find insns to put 
into slots, it calls update_reg_dead_note which collects REG_DEAD notes and 
puts them on the insn to be moved.  But emit_delay_sequence will later drop 
them on the floor.  I have one case where preserving such a REG_DEAD note is 
correct and makes a difference (slot filled vs empty).

 2. In resource.c, when mark_target_live_regs is doing its liveness analysis, 
it extracts the insns wrapped in USEs by reorg.c:

  /* If this insn is a USE made by update_block, we care about the
 underlying insn.  */
  if (code == INSN  GET_CODE (PATTERN (insn)) == USE
   INSN_P (XEXP (PATTERN (insn), 0)))
  real_insn = XEXP (PATTERN (insn), 0);

and proceeds with the liveness analysis without further ado.  Now, the story 
is different in find_dead_or_set_registers, which does:

  if (GET_CODE (PATTERN (insn)) == USE)
{
  /* If INSN is a USE made by update_block, we care about the
 underlying insn.  Any registers set by the underlying insn
 are live since the insn is being done somewhere else.  */
  if (INSN_P (XEXP (PATTERN (insn), 0)))
mark_set_resources (XEXP (PATTERN (insn), 0), res, 0,
MARK_SRC_DEST_CALL);

and thus pessimizes the analysis by making registers unnecessary live.  Again 
I have one case where not pessimizing is correct and makes a difference.

Experimental patch attached, clean on the C testsuite for the private port.

What do you think?

-- 
Eric BotcazouIndex: reorg.c
===
--- reorg.c	(revision 197617)
+++ reorg.c	(working copy)
@@ -549,6 +549,11 @@ emit_delay_sequence (rtx insn, rtx list,
 	  remove_note (tem, note);
 	  break;
 
+	case REG_DEP_TRUE:
+	  /* This was a REG_DEAD note that we want to preserve.  */
+	  PUT_REG_NOTE_KIND (note, REG_DEAD);
+	  break;
+
 	case REG_LABEL_OPERAND:
 	case REG_LABEL_TARGET:
 	  /* Keep the label reference count up to date.  */
@@ -1806,8 +1811,10 @@ update_reg_dead_notes (rtx insn, rtx del
 
 	if (reg_referenced_p (XEXP (link, 0), PATTERN (insn)))
 	  {
-	/* Move the REG_DEAD note from P to INSN.  */
+	/* Move the REG_DEAD note from P to INSN but smuggle it so that
+	   emit_delay_sequence doesn't drop it on the floor.  */
 	remove_note (p, link);
+	PUT_REG_NOTE_KIND (link, REG_DEP_TRUE);
 	XEXP (link, 1) = REG_NOTES (insn);
 	REG_NOTES (insn) = link;
 	  }
Index: resource.c
===
--- resource.c	(revision 197617)
+++ resource.c	(working copy)
@@ -425,6 +425,7 @@ find_dead_or_set_registers (rtx target,
   for (insn = target; insn; insn = next)
 {
   rtx this_jump_insn = insn;
+  rtx real_insn = insn;
 
   next = NEXT_INSN (insn);
 
@@ -443,7 +444,6 @@ find_dead_or_set_registers (rtx target,
 	  AND_COMPL_HARD_REG_SET (pending_dead_regs, needed.regs);
 	  AND_COMPL_HARD_REG_SET (res-regs, pending_dead_regs);
 	  CLEAR_HARD_REG_SET (pending_dead_regs);
-
 	  continue;
 
 	case BARRIER:
@@ -454,14 +454,12 @@ find_dead_or_set_registers (rtx target,
 	  if (GET_CODE (PATTERN (insn)) == USE)
 	{
 	  /* If INSN is a USE made by update_block, we care about the
-		 underlying insn.  Any registers set by the underlying insn
-		 are live since the insn is being done somewhere else.  */
+		 underlying insn.  */
 	  if (INSN_P (XEXP (PATTERN (insn), 0)))
-		mark_set_resources (XEXP (PATTERN (insn), 0), res, 0,
-MARK_SRC_DEST_CALL);
-
-	  /* All other USE insns are to be ignored.  */
-	  continue;
+		real_insn = XEXP (PATTERN (insn), 0);
+	  else
+		/* All other USE insns are to be ignored.  */
+		continue;
 	}
 	  else if (GET_CODE (PATTERN (insn)) == CLOBBER)
 	continue;
@@ -582,8 +580,8 @@ find_dead_or_set_registers (rtx target,
 	}
 	}
 
-  mark_referenced_resources (insn, needed, true);
-  mark_set_resources (insn, set, 0, MARK_SRC_DEST_CALL);
+  mark_referenced_resources (real_insn, needed, true);
+  mark_set_resources

Re: RFA: Fix tree-optimization/55524

2013-04-09 Thread Joern Rennecke


Quoting Richard Biener richard.guent...@gmail.com:


I don't see that.  It's merely a complication of optimal handling of
a * b +- c * d vs. just a * b +- c.  The pass does simple pattern matching
only, not doing a global optimal transform, so adding another special-case
is reasonable.  Special-casing just for single-use 2nd multiplication
simplifies the cases for example.


I have attached a version of the patch that uses this simpler test.
Currently bootstrapping / regtesting on i686-pc-linux-gnu .

gcc:
2013-04-09  Joern Rennecke joern.renne...@embecosm.com

PR tree-optimization/55524
* tree-ssa-math-opts.c
(convert_mult_to_fma): Don't use an fms construct
when we don't have an fms operation, but fnma, and it looks
likely that we'll be able to use the latter.

gcc/testsuite:
2013-04-09  Joern Rennecke joern.renne...@embecosm.com

PR tree-optimization/55524
* gcc.target/epiphany/fnma-1.c: New test.

Index: testsuite/gcc.target/epiphany/fnma-1.c
===
--- testsuite/gcc.target/epiphany/fnma-1.c  (revision 0)
+++ testsuite/gcc.target/epiphany/fnma-1.c  (working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+/* { dg-final { scan-assembler-times fmsub\[ \ta-zA-Z0-9\]*, 1 } } */
+
+float
+f (float ar, float ai, float br, float bi)
+{
+  return ar * br - ai * bi;
+}
Index: tree-ssa-math-opts.c
===
--- tree-ssa-math-opts.c(revision 197578)
+++ tree-ssa-math-opts.c(working copy)
@@ -2570,6 +2570,24 @@ convert_mult_to_fma (gimple mul_stmt, tr
  return false;
}
 
+  /* If the subtrahend (gimple_assign_rhs2 (use_stmt)) is computed
+by a MULT_EXPR that we'll visit later, we might be able to
+get a more profitable match with fnma.
+OTOH, if we don't, a negate / fma pair has likely lower latency
+that a mult / subtract pair.  */
+  if (use_code == MINUS_EXPR  !negate_p
+  gimple_assign_rhs1 (use_stmt) == result
+  optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing
+  optab_handler (fnma_optab, TYPE_MODE (type)) != CODE_FOR_nothing)
+   {
+ tree rhs2 = gimple_assign_rhs2 (use_stmt);
+ gimple stmt2 = SSA_NAME_DEF_STMT (rhs2);
+
+ if (has_single_use (rhs2)
+  gimple_assign_rhs_code (stmt2) == MULT_EXPR)
+   return false;
+   }
+
   /* We can't handle a * b + a * b.  */
   if (gimple_assign_rhs1 (use_stmt) == gimple_assign_rhs2 (use_stmt))
return false;

Re: Fill more delay slots in conditional returns

2013-04-09 Thread Steven Bosscher

On Tue, Apr 9, 2013 at 6:15 PM, Eric Botcazou wrote:
 If you wanted to get ambitious, rewriting reorg.c to compute and use
 dependency links to drive its candidate selection instead of the insane
 tracking code it uses would be a huge step forward, both in terms of
 cleanliness.  It'd also help compile-time performance; we do a lot of
 work to track resources that would be totally unnecessary.

 Yes, I agree that it's quite convoluted but, as Steven already said, rewriting
 it to use a more modern framework is hard because of SEQUENCEs and, frankly, I
 have neither the real incentive nor the time to start such an endeavor.

I've actually picked up the idea again. This is just last week-end's
work so don't expect much of it yet ;-)

But I'm near the point where I want to see if I can actually make it
fill some slots from the containing basic block and pushing it through
verify_flow_info somehow.

After that, the hard parts: branches and annulling.

Ciao!
Steven

Index: reorg.c
===
--- reorg.c (revision 197610)
+++ reorg.c (working copy)
@@ -3847,6 +3847,10 @@
   free (uid_to_ruid);
   crtl-dbr_scheduled_p = true;
 }
+
+#include sched-deps-graph.c
+#include sched-dbr.c
+
 #endif /* DELAY_SLOTS */

 static bool
@@ -3865,6 +3869,7 @@
 rest_of_handle_delay_slots (void)
 {
 #ifdef DELAY_SLOTS
+  fill_delay_slots ();
   dbr_schedule (get_insns ());
 #endif
   return 0;
#include sched-int.h

/* ??? That means dbr_schedule would come to depend on INSN_SCHEDULING?!  */
//#ifdef INSN_SCHEDULING

/* Bah, sched-deps needs this.  That's a bug, but what the heck...  */
static struct haifa_sched_info sched_dbr_info =
{
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL, NULL,
  NULL, NULL,
  0, 0,
  NULL, NULL, NULL, NULL,
  NULL, NULL,
  0
};

static struct sched_deps_info_def dbr_sched_deps_info =
{
  NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
  0, 0, 0
};

static struct common_sched_info_def dbr_common_sched_info;

static void
fill_delay_slots_from_bb (basic_block bb)
{
  rtx head, tail;
  struct deps_desc tmp_deps;

  /* Compute dependencies.  */
  get_ebb_head_tail (bb, bb, head, tail);
  init_deps_global ();
  init_deps (tmp_deps, false);
  sched_analyze (tmp_deps, head, tail);
  free_deps (tmp_deps);
  finish_deps_global ();

  /* Print dependency information.  */
  if (dump_file  sched_dump)
{
  fprintf (sched_dump, ;;   --- forward dependences:  \n);
  fprintf (sched_dump, \n;;   --- DBR Dependences --- for bb%d \n, bb-index);
  debug_dependencies (head, tail);
}

  print_dependencies_graph (stderr, cfun, bb);

  rtx insn;
  FOR_BB_INSNS (bb, insn)
{
  if (! NONDEBUG_INSN_P (insn)
	  || (GET_CODE (PATTERN (insn)) == USE
	  || GET_CODE (PATTERN (insn)) == CLOBBER))
	continue;

  INSN_FROM_TARGET_P (insn) = 0;
  if (JUMP_P (insn))
	INSN_ANNULLED_BRANCH_P (insn) = 0;
  int slots_to_fill = num_delay_slots (insn);
  if (slots_to_fill  0)
{
	  int flags;
	  if (JUMP_P (insn))
	flags = get_jump_flags (insn, JUMP_LABEL (insn));
	  else
	flags = get_jump_flags (insn, NULL_RTX);
	  fprintf (stderr,
		   // insn %d has %d delay slots to fill\n,
		   INSN_UID (insn), slots_to_fill);
	  for (rtx cand_insn = PREV_INSN (insn);
	   cand_insn != BB_HEAD (bb);
	   cand_insn = PREV_INSN (cand_insn))
	{
	  sd_iterator_def sd_it;
	  dep_t dep;
	  int n_forw_deps = 0;
	  bool cand_feeds_call_insn = false;
	  bool cand_deps_cross_insn = false;

	  /* TRIAL must be a CALL_INSN or INSN.  Skip USE and CLOBBER.  */
	  if (! NONDEBUG_INSN_P (cand_insn)
		  || (GET_CODE (PATTERN (cand_insn)) == USE
		  || GET_CODE (PATTERN (cand_insn)) == CLOBBER))
		continue;

	  if (! eligible_for_delay (insn, 0, cand_insn, flags))
		{
		  fprintf (stderr,
			   //\tinsn %d not eligible for delay slots of insn %d\n,
			   INSN_UID (cand_insn), INSN_UID (insn));
		  continue;
		}

	  rtx set = single_set (cand_insn);
	  if (! set || multiple_sets (cand_insn))
		continue;

	  FOR_EACH_DEP (cand_insn, SD_LIST_FORW, sd_it, dep)
		{
		  n_forw_deps++;
		  if (DF_INSN_LUID (insn)  DF_INSN_LUID (DEP_CON (dep)))
		cand_deps_cross_insn = true;
		  else if (DF_INSN_LUID (insn) == DF_INSN_LUID (DEP_CON (dep))
			CALL_P (insn)  df_reg_used (insn, SET_DEST (set)))
		cand_feeds_call_insn = true;
		  else // not a deps leaf, not a call arg setter
		{
		  fprintf (stderr,
			   //\tinsn %d is not suitable, blocked by insn %d\n,
			   INSN_UID (cand_insn), INSN_UID (DEP_CON (dep)));
		  n_forw_deps = -1;
		  break;
		}
		}

	  if (n_forw_deps  0)
		continue;

	  if (n_forw_deps == 0)
		{
		  fprintf (stderr,
			   //\tinsn %d is an obvious candidate, dependencies leaf\n,
			   INSN_UID (cand_insn));
		  fprintf (stderr,
			   \tfn_%d_insn_uid_%d:e -

Re: [google gcc-4_7] offline profile merge (patchset 2) (issue8508048)

2013-04-09 Thread Xinliang David Li

Ok for google branches.

thanks,

David

On Mon, Apr 8, 2013 at 3:44 PM, Rong Xu x...@google.com wrote:
 Revised copyright info.

 -Rong

 2013-04-08  Rong Xu  x...@google.com

 * contrib/profile_merge.py: An offline profile merge tool.

 Index: contrib/profile_merge.py
 ===
 --- contrib/profile_merge.py(revision 0)
 +++ contrib/profile_merge.py(revision 0)
 @@ -0,0 +1,1320 @@
 +#!/usr/bin/python2.7
 +#
 +#Copyright (C) 2013
 +#Free Software Foundation, Inc.
 +#
 +# This file is part of GCC.
 +#
 +# GCC is free software; you can redistribute it and/or modify
 +# it under the terms of the GNU General Public License as published by
 +# the Free Software Foundation; either version 3, or (at your option)
 +# any later version.
 +#
 +# GCC is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with GCC; see the file COPYING3.  If not see
 +# http://www.gnu.org/licenses/.
 +#
 +
 +
 +Merge two or more gcda profile.
 +
 +
 +__author__ = 'Seongbae Park, Rong Xu'
 +__author_email__ = 'sp...@google.com, x...@google.com'
 +
 +import array
 +from optparse import OptionGroup
 +from optparse import OptionParser
 +import os
 +import struct
 +import zipfile
 +
 +new_histogram = None
 +
 +
 +class Error(Exception):
 +  Exception class for profile module.
 +
 +
 +def ReadAllAndClose(path):
 +  Return the entire byte content of the specified file.
 +
 +  Args:
 +path: The path to the file to be opened and read.
 +
 +  Returns:
 +The byte sequence of the content of the file.
 +  
 +  data_file = open(path, 'rb')
 +  data = data_file.read()
 +  data_file.close()
 +  return data
 +
 +
 +def MergeCounters(objs, index, multipliers):
 +  Accumulate the counter at index from all counters objs.
 +  val = 0
 +  for j in xrange(len(objs)):
 +val += multipliers[j] * objs[j].counters[index]
 +  return val
 +
 +
 +class DataObject(object):
 +  Base class for various datum in GCDA/GCNO file.
 +
 +  def __init__(self, tag):
 +self.tag = tag
 +
 +
 +class Function(DataObject):
 +  Function and its counters.
 +
 +  Attributes:
 +length: Length of the data on the disk
 +ident: Ident field
 +line_checksum: Checksum of the line number
 +cfg_checksum: Checksum of the control flow graph
 +counters: All counters associated with the function
 +file: The name of the file the function is defined in. Optional.
 +line: The line number the function is defined at. Optional.
 +
 +  Function object contains other counter objects and block/arc/line objects.
 +  
 +
 +  def __init__(self, reader, tag, n_words):
 +Read function record information from a gcda/gcno file.
 +
 +Args:
 +  reader: gcda/gcno file.
 +  tag: funtion tag.
 +  n_words: length of function record in unit of 4-byte.
 +
 +DataObject.__init__(self, tag)
 +self.length = n_words
 +self.counters = []
 +
 +if reader:
 +  pos = reader.pos
 +  self.ident = reader.ReadWord()
 +  self.line_checksum = reader.ReadWord()
 +  self.cfg_checksum = reader.ReadWord()
 +
 +  # Function name string is in gcno files, but not
 +  # in gcda files. Here we make string reading optional.
 +  if (reader.pos - pos)  n_words:
 +reader.ReadStr()
 +
 +  if (reader.pos - pos)  n_words:
 +self.file = reader.ReadStr()
 +self.line_number = reader.ReadWord()
 +  else:
 +self.file = ''
 +self.line_number = 0
 +else:
 +  self.ident = 0
 +  self.line_checksum = 0
 +  self.cfg_checksum = 0
 +  self.file = None
 +  self.line_number = 0
 +
 +  def Write(self, writer):
 +Write out the function.
 +
 +writer.WriteWord(self.tag)
 +writer.WriteWord(self.length)
 +writer.WriteWord(self.ident)
 +writer.WriteWord(self.line_checksum)
 +writer.WriteWord(self.cfg_checksum)
 +for c in self.counters:
 +  c.Write(writer)
 +
 +  def EntryCount(self):
 +Return the number of times the function called.
 +return self.ArcCounters().counters[0]
 +
 +  def Merge(self, others, multipliers):
 +Merge all functions in others into self.
 +
 +Args:
 +  others: A sequence of Function objects
 +  multipliers: A sequence of integers to be multiplied during merging.
 +
 +for o in others:
 +  assert self.ident == o.ident
 +  assert self.line_checksum == o.line_checksum
 +  assert self.cfg_checksum == o.cfg_checksum
 +
 +for i in xrange(len(self.counters)):
 +  self.counters[i].Merge([o.counters[i] for o in others], multipliers)
 +
 +  def Print(self):
 +Print all the attributes in full detail.
 +print 'function: ident %d length %d line_chksum %x

Re: [patch] PR middle-end/43631

2013-04-09 Thread Steven Bosscher

Premature ping?
Also bootstrappedtested on ia64-unknown-linux-gnu now.

On Sun, Apr 7, 2013 at 12:40 AM, Steven Bosscher wrote:
 Hello,

 In this PR43631, var-tracking notes are inserted on basic block
 boundaries and BB_HEAD/BB_END end up pointing to these notes. This
 breaks verify_flow_info.

 The problem is actually bigger than just the var-tracking notes. The
 general problem is that there are no rules for whether or not notes
 are part of a basic block or not. Some notes never appear inside a
 basic block, some notes always must appear inside a basic block, and
 some can appear virtually anywhere.

 With this patch I've chosen to maintain the invariant that BB_END must
 be an INSN, and never be a NOTE or BARRIER. The result is that
 NOTE_INSN_EH_REGION_BEG notes can be inside a basic block while the
 pairing NOTE_INSN_EH_REGION_END is outside the basic block. I don't
 think this is a problem: The same thing already happens with jump
 table data, barriers, etc.

 The nice thing is that with this patch, *finally* I have
 verify_flow_info pass after var-tracking. That's important because
 officially the CFG is freed after this pass (pass_free_cfg runs after
 pass_variable_tracking) but because verify_flow_info didn't pass after
 var-tracking the CFG was already invalidated before it was freed
 (BB_HEAD and BB_END would be incorrect). That has the effect that some
 of the machine-reorg passes that use the CFG never really had a
 correct CFG at all!

 With this patch and a hack to call compute_bb_for_insn, I now can call
 verify_flow_info in every pass up to final for the i386 port :-)

 Bootstrappedtested on x86_64-unknown-linux-gnu and 
 powerpc64-unknown-linux-gnu.
 OK for trunk?

 Ciao!
 Steven

Re: [patch libgcc]: Adjust cygming-crtbegin code to use weak

2013-04-09 Thread Dave Korn

On 09/04/2013 11:37, Kai Tietz wrote:

 Hmm, well in standard-case you are right.  But well there is still a
 chance that GetProcAddress returns NULL-pointer ...  

  How would that actually happen?  Removing any of those functions from libgcc
or libjava would be a very serious ABI breakage; we're never going to do that.
 (And even if we do, the version number of the DLL would have to change and
LoadLibrary wouldn't return anything, unless we changed the shared lib string,
which would be the appropriate time to re-add a check.)  It's not critical,
but it's wasted code.

cheers,
  DaveK

Re: [google gcc-4_7] offline profile merge tool (issue 8508048)

2013-04-09 Thread martint


Have not had a chance to look at the whole patch yet, but here are some
initial comments.


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py
File contrib/profile_merge.py (right):

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode53
contrib/profile_merge.py:53: data_file = open(path, 'rb')
How about simpler version:

with open(file, 'rb') as f:
  data = f.read()

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode59
contrib/profile_merge.py:59: def MergeCounters(objs, index,
multipliers):
Perhaps use function name that shows that it returns the value.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode78
contrib/profile_merge.py:78: length: Length of the data on the disk
Minor style. Make sure format is coherent (ends with . or not, starts
with capital letter or not).

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode119
contrib/profile_merge.py:119: self.ident = 0
It is ever useful to have a function object with no reader?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode138
contrib/profile_merge.py:138: return self.ArcCounters().counters[0]
This will throw an exception if no counters exists. Is that expected?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode209
contrib/profile_merge.py:209: class Lines(DataObject):
Where is this used?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1088
contrib/profile_merge.py:1088: dir: A path to the directory containing
the gcda/gcno.
A cleaner way could perhaps be to have a base profile archive class and
then classes for ProfileArchieveDir and ProfileArchiveZip?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1153
contrib/profile_merge.py:1153: os.chdir(direc)
How about using absolute paths to avoid the chdir logic?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1199
contrib/profile_merge.py:1199: outfile_path = output_path + '/' + f
os.path.join() instead of +, here and other places where the pattern is
used.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1213
contrib/profile_merge.py:1213: def Merge(self, archives, multipliers):
The code seems to handle both the case when 'self' is inside archives
and when it is not. Does both happen when running the code?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1233
contrib/profile_merge.py:1233: for i, a in enumerate(archives):
It's hard to see what i is in this case. I would prefer to have
information about arguments in the documentation at the top of the
function, but more descriptive variable names and comments would also
help.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1288
contrib/profile_merge.py:1288: group = OptionGroup(parser, 'Arguments',
You don't seem to add any options to the group? It seems not necessary
to use a group at all in this program.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1313
contrib/profile_merge.py:1313: input_profiles[0].Merge(input_profiles,
profile_multipliers)
Did you conside creating a new object instead of modifying the existing?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1317
contrib/profile_merge.py:1317: input_profiles[0].Write(output_profile)
Warn before overwriting?

https://codereview.appspot.com/8508048/

Re: RFC: add some static probes to libstdc++

2013-04-09 Thread Tom Tromey

 Jonathan == Jonathan Wakely jwakely@gmail.com writes:

Marc I thought you were going to suggest enhancing the configure test
Marc so it fails on old systemtap (detects it as absent).

Jonathan Ah yes, that's a much better idea!

Here's a patch to do that.

I tested it on x86-64 Fedora 18.  I tested the failing path by using
an old sdt.h.  This required a number of iterations to ensure that the
test failed for the right reasons.

Ok?  I think it should go on the 4.8 branch as well.

Tom

2013-04-09  Tom Tromey  tro...@redhat.com

* configure, config.h.in: Rebuild.
* configure.ac: Use GLIBCXX_CHECK_SDT_H.  Don't check for
sys/sdt.h.
* acinclude.m4 (GLIBCXX_CHECK_SDT_H): New defun.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 4d06b20..619fff0 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3660,6 +3660,36 @@ AC_DEFUN([GLIBCXX_ENABLE_WERROR], [
 ])
 
 
+dnl
+dnl Check to see if sys/sdt.h exists and that it is suitable for use.
+dnl Some versions of sdt.h were not compatible with C++11.
+dnl
+AC_DEFUN([GLIBCXX_CHECK_SDT_H], [
+  AC_MSG_RESULT([for suitable sys/sdt.h])
+  # Note that this test has to be run with the C language.
+  # Otherwise, sdt.h will try to include some headers from
+  # libstdc++ itself.
+  AC_LANG_SAVE
+  AC_LANG_C
+  AC_CACHE_VAL(glibcxx_cv_sys_sdt_h, [
+# Because we have to run the test in C, we use grep rather
+# than the compiler to check for the bug.  The bug is that
+# were strings without trailing whitespace, causing g++
+# to look for operator.  The pattern searches for the fixed
+# output.
+AC_EGREP_CPP([ \,\ ], [
+  #include sys/sdt.h
+  int f() { STAP_PROBE(hi, bob); }
+], [glibcxx_cv_sys_sdt_h=yes], [glibcxx_cv_sys_sdt_h=no])
+  ])
+  AC_LANG_RESTORE
+  if test $glibcxx_cv_sys_sdt_h = yes; then
+AC_DEFINE(HAVE_SYS_SDT_H, 1,
+  [Define to 1 if you have a suitable sys/sdt.h header file])
+  fi
+  AC_MSG_RESULT($glibcxx_cv_sys_sdt_h)
+])
+
 # Macros from the top-level gcc directory.
 m4_include([../config/gc++filt.m4])
 m4_include([../config/tls.m4])
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index de66406..73d430a 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -211,12 +211,13 @@ GLIBCXX_CHECK_SC_NPROCESSORS_ONLN
 GLIBCXX_CHECK_SC_NPROC_ONLN
 GLIBCXX_CHECK_PTHREADS_NUM_PROCESSORS_NP
 GLIBCXX_CHECK_SYSCTL_HW_NCPU
+GLIBCXX_CHECK_SDT_H
 
 # Check for available headers.
 AC_CHECK_HEADERS([endian.h execinfo.h float.h fp.h ieeefp.h inttypes.h \
 locale.h machine/endian.h machine/param.h nan.h stdint.h stdlib.h string.h \
 strings.h sys/ipc.h sys/isa_defs.h sys/machine.h sys/param.h \
-sys/resource.h sys/sdt.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \
+sys/resource.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \
 wchar.h wctype.h])
 
 # Only do link tests if native. Else, hardcode.

Re: Fill more delay slots in conditional returns

2013-04-09 Thread Jeff Law


On 04/09/2013 10:15 AM, Eric Botcazou wrote:


Yes, I agree that it's quite convoluted but, as Steven already said, rewriting
it to use a more modern framework is hard because of SEQUENCEs and, frankly, I
have neither the real incentive nor the time to start such an endeavor.

I understand completely.




Instead I can raise the following couple of points that I also ran into with
the private port I'm working on:

  1. When fill_simple_delay_slots is scanning backwards to find insns to put
into slots, it calls update_reg_dead_note which collects REG_DEAD notes and
puts them on the insn to be moved.  But emit_delay_sequence will later drop
them on the floor.  I have one case where preserving such a REG_DEAD note is
correct and makes a difference (slot filled vs empty).
I don't recall ever working on this aspect of reorg.  The obvious worry 
is that with reorg moving stuff around those notes may not be valid 
anymore in the general case.


This is further complicated by the fact that reorg mucks up the CFG 
enough that correct, incremental updates of life information might be 
exceedingly hard.






  2. In resource.c, when mark_target_live_regs is doing its liveness analysis,
it extracts the insns wrapped in USEs by reorg.c:

  /* If this insn is a USE made by update_block, we care about the
 underlying insn.  */
  if (code == INSN  GET_CODE (PATTERN (insn)) == USE
   INSN_P (XEXP (PATTERN (insn), 0)))
  real_insn = XEXP (PATTERN (insn), 0);

and proceeds with the liveness analysis without further ado.  Now, the story
is different in find_dead_or_set_registers, which does:
Just a note, there's a formatting error in this code.  The real_insn = 
statement is indented too far.  Feel free to fix that :-)


This looks like the safe/conservative thing for this code.  We're 
marking when regs become live.  Marking something live too early doesn't 
hurt here (from a correctness standpoint) as far as I can tell.


Note that deaths are deferred until the next label.





  if (GET_CODE (PATTERN (insn)) == USE)
{
  /* If INSN is a USE made by update_block, we care about the
 underlying insn.  Any registers set by the underlying insn
 are live since the insn is being done somewhere else.  */
  if (INSN_P (XEXP (PATTERN (insn), 0)))
mark_set_resources (XEXP (PATTERN (insn), 0), res, 0,
MARK_SRC_DEST_CALL);

and thus pessimizes the analysis by making registers unnecessary live.  Again
I have one case where not pessimizing is correct and makes a difference.
Seems reasonable.  This might just be an oversight by the original 
author.  Effectively we're going to start marking the referenced 
resources with your patch.  We still defer killing as far as I can tell.






Experimental patch attached, clean on the C testsuite for the private port.

What do you think?
If you want to run with it, I won't object.  My worry would be we might 
not see the fallout for a long time as the number of things that have to 
fall into place for an observable failure here are very high.



Jeff

[google] Avoid warning when unused attribute applied to C++ member variables (issue8580044)

2013-04-09 Thread Teresa Johnson

This patch is pending on trunk, but I would like to get this into google
branches now as it is causing spurious warnings.

Google ref b/8496800.

This patch allows the unused attribute to be used without warning
on C++ class members, which are of type FIELD_DECL. This is for
compatibility with clang, which allows the attribute to be specified on
class members and struct fields. It looks like more work would need to
be done to implement the actual unused variable detection and warning
on FIELD_DECLs, but this change will at least avoid the warning on the
code that uses the unused attribute in these cases. The documentation at
http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html also doesn't
seem to preclude its use on C++ member variables.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for google branches?

2013-04-09  Teresa Johnson  tejohn...@google.com

* c-family/c-common.c (handle_unused_attribute):

Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 197640)
+++ c-family/c-common.c (working copy)
@@ -6320,6 +6320,7 @@ handle_unused_attribute (tree *node, tree name, tr
 
   if (TREE_CODE (decl) == PARM_DECL
  || TREE_CODE (decl) == VAR_DECL
+ || TREE_CODE (decl) == FIELD_DECL
  || TREE_CODE (decl) == FUNCTION_DECL
  || TREE_CODE (decl) == LABEL_DECL
  || TREE_CODE (decl) == TYPE_DECL)

--
This patch is available for review at http://codereview.appspot.com/8580044

C++ PATCH for c++/25466 (runtime SEGV with typeid)

2013-04-09 Thread Jason Merrill

The standard is somewhat unclear here, but I think what makes sense is 
to always check whether the address is null rather than confine the 
check to when the immediate operand of typeid is an INDIRECT_REF.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 43d5a64843e3e18a74e59b1b368cd129d5f1de88
Author: Jason Merrill ja...@redhat.com
Date:   Tue Apr 9 14:11:01 2013 -0400

	PR c++/25466
	* rtti.c (build_typeid): Check the address of the argument
	rather than looking for an INDIRECT_REF.

diff --git a/gcc/cp/rtti.c b/gcc/cp/rtti.c
index e83d666..b3c6687 100644
--- a/gcc/cp/rtti.c
+++ b/gcc/cp/rtti.c
@@ -326,18 +326,16 @@ build_typeid (tree exp, tsubst_flags_t complain)
 
   /* FIXME when integrating with c_fully_fold, mark
  resolves_to_fixed_type_p case as a non-constant expression.  */
-  if (INDIRECT_REF_P (exp)
-   TYPE_PTR_P (TREE_TYPE (TREE_OPERAND (exp, 0)))
-   TYPE_POLYMORPHIC_P (TREE_TYPE (exp))
+  if (TYPE_POLYMORPHIC_P (TREE_TYPE (exp))
! resolves_to_fixed_type_p (exp, nonnull)
! nonnull)
 {
   /* So we need to look into the vtable of the type of exp.
- This is an lvalue use of expr then.  */
-  exp = mark_lvalue_use (exp);
+ Make sure it isn't a null lvalue.  */
+  exp = cp_build_addr_expr (exp, complain);
   exp = stabilize_reference (exp);
-  cond = cp_convert (boolean_type_node, TREE_OPERAND (exp, 0),
-			 complain);
+  cond = cp_convert (boolean_type_node, exp, complain);
+  exp = cp_build_indirect_ref (exp, RO_NULL, complain);
 }
 
   exp = get_tinfo_decl_dynamic (exp, complain);
diff --git a/gcc/testsuite/g++.dg/rtti/typeid10.C b/gcc/testsuite/g++.dg/rtti/typeid10.C
new file mode 100644
index 000..47b45b1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/rtti/typeid10.C
@@ -0,0 +1,36 @@
+// PR c++/25466
+// { dg-do run }
+
+#include typeinfo
+
+const std::type_info *a;
+
+template class T
+bool is_polymorphic() {
+   bool result(false);
+   const std::type_info a1 = typeid( (result=true), *(T*)0);
+   a = a1;
+   return result;
+}
+
+struct non_polymorphic {};
+struct polymorphic { virtual ~polymorphic() {} };
+
+
+int main() {
+  if (is_polymorphicint()) __builtin_abort();
+  if (is_polymorphicnon_polymorphic()) __builtin_abort();
+  try
+{
+  is_polymorphicpolymorphic();
+  __builtin_abort(); // should have thrown bad_typeid
+}
+  catch (std::bad_typeid)
+{
+  // OK
+}
+  catch (...)
+{
+  __builtin_abort();
+}
+}

[Patch, fortran] PR 40958 Compress module files with zlib

2013-04-09 Thread Janne Blomqvist

Hi,

the attached patch reduces the size of module files on disk by
compressing them with zlib and storing them in the gzip format (RFC
1952). I chose zlib because it's a) ubiquitous and b) there's already
a copy of zlib in the GCC source tree, so this doesn't introduce any
further build
dependencies. Since the mod files with the patch are in the gzip
format, one can use tools like zcat, zless, zgrep, zdiff etc. to
inspect the uncompressed contents easily.

As the gzip file format already contains a CRC32 checksum, the patch
gets rid of the current MD5 checksumming, and instead compares the
CRC32 to determine whether a mod file should be replaced or not. As
the gz* functions can also handle uncompressed files, and because I
replaced the line with the MD5 sum with an empty line, there was no
need to bump the module version number.

Based on some testing with CP2K_2009-05-01.f90 (a single file
containing all of CP2K), the compile time with -fsyntax-only increased
from about 46 to 48 seconds, and the total size of the module files
were reduced from 124 MB to 13 MB.

(IMHO the increase in compile time is modest enough that it's not
worth doing the caching of uncompressed module files that I was
previously thinking about, especially considering that large projects
are invariably(?!) split over several source files, thereby reducing
the effectiveness of such a caching scheme.)

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2013-04-09  Janne Blomqvist  j...@gcc.gnu.org

PR fortran/40958
* scanner.h: New file.
* Make-lang.in: Dependencies on scanner.h.
* scanner.c (gfc_directorylist): Move to scanner.h.
* module.c: Don't include md5.h, include scanner.h and zlib.h.
(MOD_VERSION): Add comment about backwards compatibility.
(module_fp): Change type to gzFile.
(ctx): Remove.
(gzopen_included_file_1): New function.
(gzopen_included_file): New function.
(gzopen_intrinsic_module): New function.
(write_char): Use gzputc.
(read_crc32_from_module_file): New function.
(read_md5_from_module_file): Remove.
(gfc_dump_module): Use gz* functions instead of stdio, check gzip
crc32 instead of md5.
(read_module_to_tmpbuf): Use gz* functions instead of stdio.
(gfc_use_module): Use gz* functions.


testsuite:

2013-04-09  Janne Blomqvist  j...@gcc.gnu.org

PR fortran/40958
* lib/gcc-dg.exp (scan-module): Uncompress module file before
scanning.
* gfortran.dg/module_md5_1.f90: Remove.


--
Janne Blomqvist


modzlib.diff
Description: Binary data

Re: RFC: add some static probes to libstdc++

2013-04-09 Thread Jonathan Wakely

On 9 April 2013 18:47, Tom Tromey wrote:
 Jonathan == Jonathan Wakely jwakely@gmail.com writes:

 Marc I thought you were going to suggest enhancing the configure test
 Marc so it fails on old systemtap (detects it as absent).

 Jonathan Ah yes, that's a much better idea!

 Here's a patch to do that.

 I tested it on x86-64 Fedora 18.  I tested the failing path by using
 an old sdt.h.  This required a number of iterations to ensure that the
 test failed for the right reasons.

Ah yes, tricky.

 Ok?  I think it should go on the 4.8 branch as well.

OK for trunk and 4.8, thanks.

Re: [i386] Replace builtins with vector extensions

2013-04-09 Thread Marc Glisse


On Tue, 9 Apr 2013, Marc Glisse wrote:


On Tue, 9 Apr 2013, Richard Biener wrote:


I seem to remember discussion in the PR(s) that the intrinsics should
(and do for other compilers) expand to the desired instructions even when
the corresponding instruction set is disabled.


emmintrin.h starts with:
#ifndef __SSE2__
# error SSE2 instruction set not enabled


Oh, re-reading your post, it looks like you mean we should change the 
current behavior, not just avoid regressions...


My opinion on the intrinsics is that they are the portable way to use 
vectors on x86, but they are not equivalent to asm (which people should 
use if they don't want the compiler looking at their code). Knowingly 
generating SSE code with -mno-sse is not very appealing.


However, the arguments in:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56298
make sense. I guess I'll forget about this patch.

--
Marc Glisse

Re: [google] Avoid warning when unused attribute applied to C++ member variables (issue8580044)

2013-04-09 Thread Xinliang David Li

Ok -- there are legitimate use cases for this.

David

On Tue, Apr 9, 2013 at 11:01 AM, Teresa Johnson tejohn...@google.com wrote:
 This patch is pending on trunk, but I would like to get this into google
 branches now as it is causing spurious warnings.

 Google ref b/8496800.

 This patch allows the unused attribute to be used without warning
 on C++ class members, which are of type FIELD_DECL. This is for
 compatibility with clang, which allows the attribute to be specified on
 class members and struct fields. It looks like more work would need to
 be done to implement the actual unused variable detection and warning
 on FIELD_DECLs, but this change will at least avoid the warning on the
 code that uses the unused attribute in these cases. The documentation at
 http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html also doesn't
 seem to preclude its use on C++ member variables.

 Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for google branches?

 2013-04-09  Teresa Johnson  tejohn...@google.com

 * c-family/c-common.c (handle_unused_attribute):

 Index: c-family/c-common.c
 ===
 --- c-family/c-common.c (revision 197640)
 +++ c-family/c-common.c (working copy)
 @@ -6320,6 +6320,7 @@ handle_unused_attribute (tree *node, tree name, tr

if (TREE_CODE (decl) == PARM_DECL
   || TREE_CODE (decl) == VAR_DECL
 + || TREE_CODE (decl) == FIELD_DECL
   || TREE_CODE (decl) == FUNCTION_DECL
   || TREE_CODE (decl) == LABEL_DECL
   || TREE_CODE (decl) == TYPE_DECL)

 --
 This patch is available for review at http://codereview.appspot.com/8580044

[google gcc-4_7] offline profile tool (patchset 3) (issue8508048)

2013-04-09 Thread Rong Xu

Hi,

Patch set 3 for offline profile merge tool. 
(1) Rename to profile_tool for future function expansion. 
(2) Install to bin/ directory for easy user access.

Thanks,

-Rong

2013-04-09  Rong Xu  x...@google.com

* gcc/Makefile.in: Install profile_tool to bin directory.
* contrib/profile_tool: A set of offline tools to
analyze/manipulate gcov profiles. Current version only supports 
profile merge.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 197613)
+++ gcc/Makefile.in (working copy)
@@ -752,6 +752,7 @@ GCC_INSTALL_NAME := $(shell echo gcc|sed '$(progra
 GCC_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo gcc|sed 
'$(program_transform_name)')
 CPP_INSTALL_NAME := $(shell echo cpp|sed '$(program_transform_name)')
 GCOV_INSTALL_NAME := $(shell echo gcov|sed '$(program_transform_name)')
+PROFILE_TOOL_INSTALL_NAME := $(shell echo profile_tool|sed 
'$(program_transform_name)')
 
 # Setup the testing framework, if you have one
 EXPECT = `if [ -f $${rootme}/../expect/expect ] ; then \
@@ -4691,6 +4692,13 @@ install-common: native lang.install-common install
rm -f $(DESTDIR)$(bindir)/$(GCOV_INSTALL_NAME)$(exeext); \
$(INSTALL_PROGRAM) gcov$(exeext) 
$(DESTDIR)$(bindir)/$(GCOV_INSTALL_NAME)$(exeext); \
fi
+# Install profile_tool if it is available.
+   -if [ -f $(srcdir)/../contrib/profile_tool ]; \
+   then \
+   rm -f $(DESTDIR)$(bindir)/$(PROFILE_TOOL_INSTALL_NAME)$(exeext); \
+   $(INSTALL_PROGRAM) $(srcdir)/../contrib/profile_tool \
+   $(DESTDIR)$(bindir)/$(PROFILE_TOOL_INSTALL_NAME)$(exeext); \
+   fi
 
 # Install the driver program as $(target_noncanonical)-gcc,
 # $(target_noncanonical)-gcc-$(version)
Index: contrib/profile_tool
===
--- contrib/profile_tool(revision 0)
+++ contrib/profile_tool(revision 0)
@@ -0,0 +1,1320 @@
+#!/usr/bin/python2.7
+#
+#Copyright (C) 2013
+#Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# http://www.gnu.org/licenses/.
+#
+
+
+Merge two or more gcda profile.
+
+
+__author__ = 'Seongbae Park, Rong Xu'
+__author_email__ = 'sp...@google.com, x...@google.com'
+
+import array
+from optparse import OptionGroup
+from optparse import OptionParser
+import os
+import struct
+import zipfile
+
+new_histogram = None
+
+
+class Error(Exception):
+  Exception class for profile module.
+
+
+def ReadAllAndClose(path):
+  Return the entire byte content of the specified file.
+
+  Args:
+path: The path to the file to be opened and read.
+
+  Returns:
+The byte sequence of the content of the file.
+  
+  data_file = open(path, 'rb')
+  data = data_file.read()
+  data_file.close()
+  return data
+
+
+def MergeCounters(objs, index, multipliers):
+  Accumulate the counter at index from all counters objs.
+  val = 0
+  for j in xrange(len(objs)):
+val += multipliers[j] * objs[j].counters[index]
+  return val
+
+
+class DataObject(object):
+  Base class for various datum in GCDA/GCNO file.
+
+  def __init__(self, tag):
+self.tag = tag
+
+
+class Function(DataObject):
+  Function and its counters.
+
+  Attributes:
+length: Length of the data on the disk
+ident: Ident field
+line_checksum: Checksum of the line number
+cfg_checksum: Checksum of the control flow graph
+counters: All counters associated with the function
+file: The name of the file the function is defined in. Optional.
+line: The line number the function is defined at. Optional.
+
+  Function object contains other counter objects and block/arc/line objects.
+  
+
+  def __init__(self, reader, tag, n_words):
+Read function record information from a gcda/gcno file.
+
+Args:
+  reader: gcda/gcno file.
+  tag: funtion tag.
+  n_words: length of function record in unit of 4-byte.
+
+DataObject.__init__(self, tag)
+self.length = n_words
+self.counters = []
+
+if reader:
+  pos = reader.pos
+  self.ident = reader.ReadWord()
+  self.line_checksum = reader.ReadWord()
+  self.cfg_checksum = reader.ReadWord()
+
+  # Function name string is in gcno files, but not
+  # in gcda files. Here we make string reading optional.
+  if (reader.pos - pos)  n_words:
+

Re: [google gcc-4_7] offline profile merge tool (issue 8508048)

2013-04-09 Thread xur



https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py
File contrib/profile_merge.py (right):

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode53
contrib/profile_merge.py:53: data_file = open(path, 'rb')
On 2013/04/09 17:32:11, martint wrote:

How about simpler version:



with open(file, 'rb') as f:
   data = f.read()


Done.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode59
contrib/profile_merge.py:59: def MergeCounters(objs, index,
multipliers):
changed the name to ReturnMergedCounters

On 2013/04/09 17:32:11, martint wrote:

Perhaps use function name that shows that it returns the value.


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode78
contrib/profile_merge.py:78: length: Length of the data on the disk
All the bullets sub-to Attributes are not capital.

On 2013/04/09 17:32:11, martint wrote:

Minor style. Make sure format is coherent (ends with . or not, starts

with

capital letter or not).


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode119
contrib/profile_merge.py:119: self.ident = 0
Probably not. 'reader' here is ProfileDataFile. We should have a valid
gcda/gcno file for this program. But it does not hurt to have a empty
reader. I'll leave as it's.

On 2013/04/09 17:32:11, martint wrote:

It is ever useful to have a function object with no reader?


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode138
contrib/profile_merge.py:138: return self.ArcCounters().counters[0]
that will not happen. all functions have at least one counter.

On 2013/04/09 17:32:11, martint wrote:

This will throw an exception if no counters exists. Is that expected?


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode209
contrib/profile_merge.py:209: class Lines(DataObject):
It's for reading in gcno file. We are not using this functionality for
merge.

On 2013/04/09 17:32:11, martint wrote:

Where is this used?


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1153
contrib/profile_merge.py:1153: os.chdir(direc)
On 2013/04/09 17:32:11, martint wrote:

How about using absolute paths to avoid the chdir logic?


Done.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1199
contrib/profile_merge.py:1199: outfile_path = output_path + '/' + f
On 2013/04/09 17:32:11, martint wrote:

os.path.join() instead of +, here and other places where the pattern

is used.

Done.

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1213
contrib/profile_merge.py:1213: def Merge(self, archives, multipliers):
The caller in this program always has self in archives set.

On 2013/04/09 17:32:11, martint wrote:

The code seems to handle both the case when 'self' is inside archives

and when

it is not. Does both happen when running the code?


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1233
contrib/profile_merge.py:1233: for i, a in enumerate(archives):
i is the i-th profile archive.

On 2013/04/09 17:32:11, martint wrote:

It's hard to see what i is in this case. I would prefer to have

information

about arguments in the documentation at the top of the function, but

more

descriptive variable names and comments would also help.


https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1313
contrib/profile_merge.py:1313: input_profiles[0].Merge(input_profiles,
profile_multipliers)
That will use more memory. We already use too much memory.
And it only modifies in-memory object (the profile files are not
changed).

On 2013/04/09 17:32:11, martint wrote:

Did you conside creating a new object instead of modifying the

existing?

https://codereview.appspot.com/8508048/diff/4001/contrib/profile_merge.py#newcode1317
contrib/profile_merge.py:1317: input_profiles[0].Write(output_profile)
Refer to previous comment. Only in-memory objects are overwritten.

On 2013/04/09 17:32:11, martint wrote:

Warn before overwriting?


https://codereview.appspot.com/8508048/

[Debug, Fortran] RFC patch for DW_TAG_namelist (PR fortran/37132)

2013-04-09 Thread Tobias Burnus


Dear all,

attached is a first attempt to implement DW_TAG_namelist support in GCC. 
(That's a DWARF2 features.)


It nicely works for:
module m
 real :: cc
end module m
subroutine sub()
  use m
  implicit none
  integer :: aa, bb
  namelist /nml/ aa, bb, cc
  aa = 5
end subroutine sub
which gives:
 24f: Abbrev Number: 3 (DW_TAG_namelist)
50   DW_AT_name: nml
54   DW_AT_sibling : 0x68
 358: Abbrev Number: 4 (DW_TAG_namelist_item)
59   DW_AT_namelist_items: 0x68
 35d: Abbrev Number: 4 (DW_TAG_namelist_item)
5e   DW_AT_namelist_items: 0x72
 362: Abbrev Number: 4 (DW_TAG_namelist_item)
63   DW_AT_namelist_items: 0xab
 268: Abbrev Number: 5 (DW_TAG_variable)
69   DW_AT_name: aa
etc.


However, it fails for:
 subroutine nml_test(x1)
integer :: x1
namelist /nml2/ x1
x1 = 5
 end subroutine nml_test

i.e. where x1 is a procedure argument. The problem is that 
lookup_decl_die cannot find it and force_decl_die doesn't like to handle it.


Any idea?

Tobias

PS: Except for gfortran.dg/namelist_14.f90, gfortran.dg/namelist_69.f90 
and gfortran.dg/namelist_70.f90 it regtested successfully for 
check-gfortran. (I didn't see a need for an all-language bootstrap + 
regtesting as long as there are still Fortran regressions.)
2013-04-09  Tobias Burnus  bur...@net-b.de

	PR fortran/37132
	* dbxout.c (dbx_debug_hooks, xcoff_debug_hooks): Add noop
	for namelist_decl.
	* sdbout.c (sdb_debug_hooks): Ditto.
	* vmsdbgout.c (vmsdbg_debug_hooks): Ditto.
	* debug.c (do_nothing_debug_hooks): Ditto.
	(debug_nothing_charstar_tree_vectree): New function.
	* debug.h (gcc_debug_hooks): Add namelist_decl.
	(debug_nothing_charstar_tree_vectree): New prototype.
	* dwarf2out.c (dwarf2out_namelist_decl): New function.
	(dwarf2_debug_hooks): Use it.

2013-04-09  Tobias Burnus  bur...@net-b.de

	PR fortran/37132
	* trans-decl.c (generate_namelist_decl): New function.
	(generate_local_vars): Call it.

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 4d9fc4e..9e5388c 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -367,6 +367,7 @@ const struct gcc_debug_hooks dbx_debug_hooks =
   dbxout_global_decl,		 /* global_decl */
   dbxout_type_decl,			 /* type_decl */
   debug_nothing_tree_tree_tree_bool,	 /* imported_module_or_decl */
+  debug_nothing_charstar_tree_vectree,	 /* namelist_decl */
   debug_nothing_tree,		 /* deferred_inline_function */
   debug_nothing_tree,		 /* outlining_inline_function */
   debug_nothing_rtx,		 /* label */
@@ -403,6 +404,7 @@ const struct gcc_debug_hooks xcoff_debug_hooks =
   dbxout_global_decl,		 /* global_decl */
   dbxout_type_decl,			 /* type_decl */
   debug_nothing_tree_tree_tree_bool,	 /* imported_module_or_decl */
+  debug_nothing_charstar_tree_vectree,	 /* namelist_decl */
   debug_nothing_tree,		 /* deferred_inline_function */
   debug_nothing_tree,		 /* outlining_inline_function */
   debug_nothing_rtx,		 /* label */
diff --git a/gcc/debug.c b/gcc/debug.c
index 7ed50b4..02106a5 100644
--- a/gcc/debug.c
+++ b/gcc/debug.c
@@ -46,6 +46,7 @@ const struct gcc_debug_hooks do_nothing_debug_hooks =
   debug_nothing_tree,		 /* global_decl */
   debug_nothing_tree_int,		 /* type_decl */
   debug_nothing_tree_tree_tree_bool,	 /* imported_module_or_decl */
+  debug_nothing_charstar_tree_vectree,	 /* namelist_decl */
   debug_nothing_tree,		 /* deferred_inline_function */
   debug_nothing_tree,		 /* outlining_inline_function */
   debug_nothing_rtx,		 /* label */
@@ -84,6 +85,11 @@ debug_nothing_tree_tree_tree_bool (tree t1 ATTRIBUTE_UNUSED,
 {
 }
 
+void
+debug_nothing_charstar_tree_vectree (const char *, tree, vectree *)
+{
+}
+
 bool
 debug_true_const_tree (const_tree block ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/debug.h b/gcc/debug.h
index 886de17..b429016 100644
--- a/gcc/debug.h
+++ b/gcc/debug.h
@@ -18,6 +18,8 @@
 #ifndef GCC_DEBUG_H
 #define GCC_DEBUG_H
 
+#include vec.h
+
 /* This structure contains hooks for the debug information output
functions, accessed through the global instance debug_hooks set in
toplev.c according to command line options.  */
@@ -108,6 +110,9 @@ struct gcc_debug_hooks
   void (* imported_module_or_decl) (tree decl, tree name,
 tree context, bool child);
 
+  /* Debug information for namelists.  */
+  void (* namelist_decl) (const char *, tree, vectree *);
+
   /* DECL is an inline function, whose body is present, but which is
  not being output at this point.  */
   void (* deferred_inline_function) (tree decl);
@@ -158,6 +163,8 @@ extern void debug_nothing_int_int (unsigned int, unsigned int);
 extern void debug_nothing_tree (tree);
 extern void debug_nothing_tree_tree (tree, tree);
 extern void debug_nothing_tree_int (tree, int);
+extern void debug_nothing_charstar_tree_vectree (const char *, tree,
+		 vectree *);
 extern void

Drop more of the old alias handling code

2013-04-09 Thread Jan Hubicka

Hi,
this patch removes handling of alias pairs in ipa.c.  This code is no longer 
needed
now when we represent all aliases explicitly in the symbol table.

Bootstrapped/regtested x86_64-linux.

Honza

* ipa.c (cgraph_externally_visible_p, varpool_externally_visible_p): 
Drop
aliased parameter.
(function_and_variable_visibility): Do not handle alias pairs.
* cgraph.c (varpool_externally_visible_p): Update prototype.
* varpool.c (varpool_add_new_variable): Update.
Index: ipa.c
===
--- ipa.c   (revision 197243)
+++ ipa.c   (working copy)
@@ -573,7 +573,7 @@ cgraph_comdat_can_be_unshared_p (struct
 
 static bool
 cgraph_externally_visible_p (struct cgraph_node *node,
-bool whole_program, bool aliased)
+bool whole_program)
 {
   if (!node-local.finalized)
 return false;
@@ -582,11 +582,6 @@ cgraph_externally_visible_p (struct cgra
  || DECL_EXTERNAL (node-symbol.decl)))
 return false;
 
-  /* Do not even try to be smart about aliased nodes.  Until we properly
- represent everything by same body alias, these are just evil.  */
-  if (aliased)
-return true;
-
   /* Do not try to localize built-in functions yet.  One of problems is that we
  end up mangling their asm for WHOPR that makes it impossible to call them
  using the implicit built-in declarations anymore.  Similarly this enables
@@ -638,7 +633,7 @@ cgraph_externally_visible_p (struct cgra
 /* Return true when variable VNODE should be considered externally visible.  */
 
 bool
-varpool_externally_visible_p (struct varpool_node *vnode, bool aliased)
+varpool_externally_visible_p (struct varpool_node *vnode)
 {
   /* Do not touch weakrefs; while they are not externally visible,
  dropping their DECL_EXTERNAL flags confuse most
@@ -652,11 +647,6 @@ varpool_externally_visible_p (struct var
   if (!DECL_COMDAT (vnode-symbol.decl)  !TREE_PUBLIC (vnode-symbol.decl))
 return false;
 
-  /* Do not even try to be smart about aliased nodes.  Until we properly
- represent everything by same body alias, these are just evil.  */
-  if (aliased)
-return true;
-
   /* If linker counts on us, we must preserve the function.  */
   if (symtab_used_from_object_file_p ((symtab_node) vnode))
 return true;
@@ -733,42 +723,9 @@ function_and_variable_visibility (bool w
 {
   struct cgraph_node *node;
   struct varpool_node *vnode;
-  struct pointer_set_t *aliased_nodes = pointer_set_create ();
-  struct pointer_set_t *aliased_vnodes = pointer_set_create ();
-  unsigned i;
-  alias_pair *p;
 
-  /* Discover aliased nodes.  */
-  FOR_EACH_VEC_SAFE_ELT (alias_pairs, i, p)
-{
-  if (dump_file)
-  fprintf (dump_file, Alias %s-%s,
-  IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (p-decl)),
-  IDENTIFIER_POINTER (p-target));
-   
-  if ((node = cgraph_node_for_asm (p-target)) != NULL
-   !DECL_EXTERNAL (node-symbol.decl))
-   {
- if (!node-analyzed)
-   continue;
- cgraph_mark_force_output_node (node);
- pointer_set_insert (aliased_nodes, node);
- if (dump_file)
-   fprintf (dump_file,   node %s/%i,
-cgraph_node_name (node), node-uid);
-   }
-  else if ((vnode = varpool_node_for_asm (p-target)) != NULL
-   !DECL_EXTERNAL (vnode-symbol.decl))
-   {
- vnode-symbol.force_output = 1;
- pointer_set_insert (aliased_vnodes, vnode);
- if (dump_file)
-   fprintf (dump_file,   varpool node %s,
-varpool_node_name (vnode));
-   }
-  if (dump_file)
-   fprintf (dump_file, \n);
-}
+  /* All aliases should be procssed at this point.  */
+  gcc_checking_assert (!alias_pairs || !alias_pairs-length());
 
   FOR_EACH_FUNCTION (node)
 {
@@ -817,9 +774,7 @@ function_and_variable_visibility (bool w
   !DECL_COMDAT (node-symbol.decl))
  || TREE_PUBLIC (node-symbol.decl)
  || DECL_EXTERNAL (node-symbol.decl));
-  if (cgraph_externally_visible_p (node, whole_program,
-  pointer_set_contains (aliased_nodes,
-node)))
+  if (cgraph_externally_visible_p (node, whole_program))
 {
  gcc_assert (!node-global.inlined_to);
  node-symbol.externally_visible = true;
@@ -898,9 +853,7 @@ function_and_variable_visibility (bool w
 {
   if (!vnode-finalized)
 continue;
-  if (varpool_externally_visible_p
-   (vnode, 
-pointer_set_contains (aliased_vnodes, vnode)))
+  if (varpool_externally_visible_p (vnode))
vnode-symbol.externally_visible = true;
   else
 vnode-symbol.externally_visible = false;
@@ -913,8 +866,6 @@ function_and_variable_visibility (bool w

Re: [PATCH SH] Error: unaligned opcodes detected in executable segment

2013-04-09 Thread Kaz Kojima

Christian Bruel christian.br...@st.com wrote:
 This patch fixes label alignments after a ADDR_DIFF_VEC with byte
 offsets. The bug occurs with building the libgcc for a sh-elf target.
 
 See a reduced case here. The funny thing is that the assembly error
 given in Subject appears only on a debug section when compiled with -g,
 thus the emitted code without -g:
 
  .align 2
 .L4:
 .byte   .L3-.L5
 .byte   .L10-.L5
 .byte   .L7-.L5
 .byte   .L8-.L5
 .byte   .L9-.L5
 .L3:
  mov.l   .L13,r1
 
 assembles silently, although unaligned...
 
 OK for trunk ? The failure occurs with the libgcc, so tested by allowing
 the sh-elf build to complete.

OK.

Regards,
kaz

69 matches

Mail list logo