[ping] Fix wrong code with boolean negation

2013-02-01 Thread Eric Botcazou
It's a regression (albeit an old one):
  http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01044.html

Thanks in advance.

-- 
Eric Botcazou


[PATCH][RFC] Fix PR56113 more

2013-02-01 Thread Richard Biener

This reduces compile-time of the testcase in PR56113 (with n = 4)
from 575s to 353s.  It does so by replacing the quadratic algorithm
used to impose an order on visiting dominator sons during a domwalk.

Steven raises the issue that there exist domwalk users that modify
the CFG during the walk and thus the new scheme does not work
(at least optimally, as the current quadratic scheme does).  As
we are using a fixed-size sbitmap to track visited blocks existing
domwalks cannot add new blocks to the CFG so the worst thing that
can happen is that the order of dominator sons is no longer
optimal (I suppose with the right CFG manipulations even the
domwalk itself does not work - so I'd be hesitant to try to support
such domwalk users) - back to the state before any ordering
was imposed on the dom children visits (see rev 159100).

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?
(it's a regression from 4.5)

Thanks,
Richard.

2013-02-01  Richard Biener  rguent...@suse.de

PR middle-end/56113
* domwalk.c (bb_postorder): New global static.
(cmp_bb_postorder): New function.
(walk_dominator_tree): Replace scheme imposing an order for
visiting dominator sons by one sorting them at the time they
are pushed on the stack.

Index: gcc/domwalk.c
===
*** gcc/domwalk.c   (revision 195615)
--- gcc/domwalk.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 128,133 
--- 128,148 
  which is currently an abstraction over walking tree statements.  Thus
  the dominator walker is currently only useful for trees.  */
  
+ static int *bb_postorder;
+ 
+ static int
+ cmp_bb_postorder (const void *a, const void *b)
+ {
+   basic_block bb1 = *(basic_block *) const_cast <void *> (a);
+   basic_block bb2 = *(basic_block *) const_cast <void *> (b);
+   if (bb1->index == bb2->index)
+ return 0;
+   /* Place higher completion number first (pop off lower number first).  */
+   if (bb_postorder[bb1->index] > bb_postorder[bb2->index])
+ return -1;
+   return 1;
+ }
+ 
  /* Recursively walk the dominator tree.
  
 WALK_DATA contains a set of callbacks to perform pass-specific
*** walk_dominator_tree (struct dom_walk_dat
*** 143,151 
basic_block dest;
basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks * 2);
int sp = 0;
!   sbitmap visited = sbitmap_alloc (last_basic_block + 1);
!   bitmap_clear (visited);
!   bitmap_set_bit (visited, ENTRY_BLOCK_PTR->index);
  
while (true)
  {
--- 158,174 
basic_block dest;
basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks * 2);
int sp = 0;
!   int *postorder, postorder_num;
! 
!   if (walk_data->dom_direction == CDI_DOMINATORS)
! {
!   postorder = XNEWVEC (int, n_basic_blocks);
!   postorder_num = inverted_post_order_compute (postorder);
!   bb_postorder = XNEWVEC (int, last_basic_block);
!   for (int i = 0; i < postorder_num; ++i)
!   bb_postorder[postorder[i]] = i;
!   free (postorder);
! }
  
while (true)
  {
*** walk_dominator_tree (struct dom_walk_dat
*** 186,201 
  if (walk_data->before_dom_children)
(*walk_data->before_dom_children) (walk_data, bb);
  
- bitmap_set_bit (visited, bb->index);
- 
  /* Mark the current BB to be popped out of the recursion stack
 once children are processed.  */
  worklist[sp++] = bb;
  worklist[sp++] = NULL;
  
  for (dest = first_dom_son (walk_data->dom_direction, bb);
   dest; dest = next_dom_son (walk_data->dom_direction, dest))
worklist[sp++] = dest;
}
/* NULL is used to mark pop operations in the recursion stack.  */
while (sp > 0 && !worklist[sp - 1])
--- 209,233 
  if (walk_data->before_dom_children)
(*walk_data->before_dom_children) (walk_data, bb);
  
  /* Mark the current BB to be popped out of the recursion stack
 once children are processed.  */
  worklist[sp++] = bb;
  worklist[sp++] = NULL;
  
+ int saved_sp = sp;
  for (dest = first_dom_son (walk_data->dom_direction, bb);
   dest; dest = next_dom_son (walk_data->dom_direction, dest))
worklist[sp++] = dest;
+ if (walk_data->dom_direction == CDI_DOMINATORS)
+   switch (sp - saved_sp)
+ {
+ case 0:
+ case 1:
+   break;
+ default:
+   qsort (&worklist[saved_sp], sp - saved_sp,
+  sizeof (basic_block), cmp_bb_postorder);
+ }
}
/* NULL is used to mark pop operations in the recursion stack.  */
while (sp > 0 && !worklist[sp - 1])
*** walk_dominator_tree (struct dom_walk_dat
*** 217,260 
}
}
if (sp)
!   {
! int spp;
!

Re: [patch] Fix wrong code with boolean negation

2013-02-01 Thread Richard Biener
On Mon, Jan 21, 2013 at 9:41 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Hi,

 this is a regression present in the Ada compiler since 4.5: the issue had been
 latent for ages, but an unrelated streamlining of the IR made it appear.

 When make_range_step is invoked on:

   (integer)!b < 0

 where b is a boolean, it returns always true instead of always false.

 The sequence is as follows:

   (integer)!b < 0   is_true_if   not in [0:0]
   (integer)!b   is_true_if   not in [0;+inf[
   !b    is_true_if   not in [0;255]
   b is_true_if   in [0;255]

 The wrong step is the last one: when TRUTH_NOT_EXPR is seen in make_range_step
 the "in" value is unconditionally toggled.  Of course that doesn't work in the
 general case, just if the range is the boolean range.  As a matter of fact,
 this is explained just below for the comparison operators:

   /* We can only do something if the range is testing for zero
  and if the second operand is an integer constant.  Note that
  saying something is "in" the range we make is done by
  complementing IN_P since it will set in the initial case of
  being not equal to zero; "out" is leaving it alone.  */

 so the fix is to use the zero range condition in the TRUTH_NOT_EXPR case.

 Tested on x86_64-suse-linux, OK for mainline?  And for branch(es)?

Ok everywhere.

Thanks,
Richard.


 2013-01-21  Eric Botcazou  ebotca...@adacore.com

 * fold-const.c (make_range_step) <TRUTH_NOT_EXPR>: Bail out if the
 range isn't testing for zero.


 2013-01-21  Eric Botcazou  ebotca...@adacore.com

 * gnat.dg/opt26.adb: New test.


 --
 Eric Botcazou


Re: [PATCH][RFC] Fix PR56113 more

2013-02-01 Thread Jakub Jelinek
On Fri, Feb 01, 2013 at 10:00:00AM +0100, Richard Biener wrote:
 
 This reduces compile-time of the testcase in PR56113 (with n = 4)
 from 575s to 353s.  It does so by reducing the quadratic algorithm
 to impose an order on visiting dominator sons during a domwalk.
 
 Steven raises the issue that there exist domwalk users that modify
 the CFG during the walk and thus the new scheme does not work
 (at least optimally, as the current quadratic scheme does).  As
 we are using a fixed-size sbitmap to track visited blocks existing
 domwalks cannot add new blocks to the CFG so the worst thing that
 can happen is that the order of dominator sons is no longer
 optimal (I suppose with the right CFG manipulations even the
 domwalk itself does not work - so I'd be hesitant to try to support
 such domwalk users) - back to the state before any ordering
 was imposed on the dom children visits (see rev 159100).

I think it would be desirable to first analyze the failures Steven saw, if
any.  As you said, asan doesn't use domwalk, so it is a mystery to me.

Jakub


[PATCH] More PR56113 PTA speedups

2013-02-01 Thread Richard Biener

This reduces the work done for single predecessor nodes for
assigning pointer equivalence classes in label_visit.  For
the PR56113 testcase with n = 4 this reduces PTA time from

 tree PTA: 119.59 (34%) usr

to

 tree PTA:  51.62 (18%) usr

(the percentages are with the domwalk fix applied).

Easy optimization with a surprisingly big effect.  On the way
I also reduce the work done for non-direct nodes.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2013-02-01  Richard Biener  rguent...@suse.de

* tree-ssa-structalias.c (label_visit): Reduce work for
single-predecessor nodes.

Index: gcc/tree-ssa-structalias.c
===
*** gcc/tree-ssa-structalias.c  (revision 195641)
--- gcc/tree-ssa-structalias.c  (working copy)
*** condense_visit (constraint_graph_t graph
*** 2107,2120 
  static void
  label_visit (constraint_graph_t graph, struct scc_info *si, unsigned int n)
  {
!   unsigned int i;
bitmap_iterator bi;
-   bitmap_set_bit (si->visited, n);
  
!   if (!graph->points_to[n])
! graph->points_to[n] = BITMAP_ALLOC (predbitmap_obstack);
  
/* Label and union our incoming edges's points to sets.  */
  EXECUTE_IF_IN_NONNULL_BITMAP (graph->preds[n], 0, i, bi)
  {
unsigned int w = si->node_mapping[i];
--- 2108,2120 
  static void
  label_visit (constraint_graph_t graph, struct scc_info *si, unsigned int n)
  {
!   unsigned int i, first_pred;
bitmap_iterator bi;
  
!   bitmap_set_bit (si->visited, n);
  
/* Label and union our incoming edges's points to sets.  */
+   first_pred = -1U;
  EXECUTE_IF_IN_NONNULL_BITMAP (graph->preds[n], 0, i, bi)
  {
unsigned int w = si->node_mapping[i];
*** label_visit (constraint_graph_t graph, s
*** 2126,2136 
continue;
  
if (graph->points_to[w])
!   bitmap_ior_into(graph->points_to[n], graph->points_to[w]);
  }
!   /* Indirect nodes get fresh variables.  */
if (!bitmap_bit_p (graph->direct_nodes, n))
! bitmap_set_bit (graph->points_to[n], FIRST_REF_NODE + n);
  
if (!bitmap_empty_p (graph->points_to[n]))
  {
--- 2126,2170 
continue;
  
if (graph->points_to[w])
!   {
! if (first_pred == -1U)
!   first_pred = w;
! else if (!graph->points_to[n])
!   {
! graph->points_to[n] = BITMAP_ALLOC (predbitmap_obstack);
! bitmap_ior (graph->points_to[n],
! graph->points_to[first_pred], graph->points_to[w]);
!   }
! else
!   bitmap_ior_into (graph->points_to[n], graph->points_to[w]);
!   }
  }
! 
!   /* Indirect nodes get fresh variables and a new pointer equiv class.  */
if (!bitmap_bit_p (graph->direct_nodes, n))
! {
!   if (!graph->points_to[n])
!   {
! graph->points_to[n] = BITMAP_ALLOC (predbitmap_obstack);
! if (first_pred != -1U)
!   bitmap_copy (graph->points_to[n], graph->points_to[first_pred]);
!   }
!   bitmap_set_bit (graph->points_to[n], FIRST_REF_NODE + n);
!   graph->pointer_label[n] = pointer_equiv_class++;
!   return;
! }
! 
!   /* If there was only a single non-empty predecessor the pointer equiv
!  class is the same.  */
!   if (!graph->points_to[n])
! {
!   if (first_pred != -1U)
!   {
! graph->pointer_label[n] = graph->pointer_label[first_pred];
! graph->points_to[n] = graph->points_to[first_pred];
!   }
!   return;
! }
  
if (!bitmap_empty_p (graph->points_to[n]))
  {



Re: [PATCH][RFC] Fix PR56113 more

2013-02-01 Thread Richard Biener
On Fri, 1 Feb 2013, Jakub Jelinek wrote:

 On Fri, Feb 01, 2013 at 10:00:00AM +0100, Richard Biener wrote:
  
  This reduces compile-time of the testcase in PR56113 (with n = 4)
  from 575s to 353s.  It does so by reducing the quadratic algorithm
  to impose an order on visiting dominator sons during a domwalk.
  
  Steven raises the issue that there exist domwalk users that modify
  the CFG during the walk and thus the new scheme does not work
  (at least optimally, as the current quadratic scheme does).  As
  we are using a fixed-size sbitmap to track visited blocks existing
  domwalks cannot add new blocks to the CFG so the worst thing that
  can happen is that the order of dominator sons is no longer
  optimal (I suppose with the right CFG manipulations even the
  domwalk itself does not work - so I'd be hesitant to try to support
  such domwalk users) - back to the state before any ordering
  was imposed on the dom children visits (see rev 159100).
 
 I think it would be desirable to first analyze the failures Steven saw, if
 any.  As you said, asan doesn't use domwalk, so it is a mystery to me.

Yeah.  Now, fortunately domwalk.h is only directly included and thus
the set of optimizers using it are

compare-elim.c:#include "domwalk.h"
domwalk.c:#include "domwalk.h"
fwprop.c:#include "domwalk.h"
gimple-ssa-strength-reduction.c:#include "domwalk.h"
graphite-sese-to-poly.c:#include "domwalk.h"
tree-into-ssa.c:#include "domwalk.h"
tree-ssa-dom.c:#include "domwalk.h"
tree-ssa-dse.c:#include "domwalk.h"
tree-ssa-loop-im.c:#include "domwalk.h"
tree-ssa-math-opts.c:   If we did this using domwalk.c, an efficient 
implementation would have
tree-ssa-phiopt.c:#include "domwalk.h"
tree-ssa-pre.c:#include "domwalk.h"
tree-ssa-pre.c:/* Local state for the eliminate domwalk.  */
tree-ssa-pre.c:   eliminate domwalk.  */
tree-ssa-pre.c:/* At the current point of the eliminate domwalk make OP 
available.  */
tree-ssa-pre.c:/* Perform elimination for the basic-block B during the 
domwalk.  */
tree-ssa-strlen.c:#include "domwalk.h"
tree-ssa-uncprop.c:#include "domwalk.h"

I don't see any target specific ones that do not have coverage
with x86_64 multilib testing (maybe compare-elim.c?  though that
doesn't really require a domwalk as it is only using the
before_dom_children hook).  That said, arbitrary CFG manipulations
during domwalk certainly will not preserve domwalk properties
of a domwalk.

Steven - can you reproduce your failures (and on which target?)

Thanks,
Richard.


[PATCH, AArch64] AND operation should use CC_NZ mode

2013-02-01 Thread Ian Bolton
The mode for AND should really be CC_NZ, so I fixed that up, and also the TST
patterns that (erroneously) expected it to be CC mode.

It has been tested on linux and bare-metal.

OK to commit to trunk (as bug fix)?

Thanks.
Ian



2013-02-01  Ian Bolton  ian.bol...@arm.com

* config/aarch64/aarch64.c (aarch64_select_cc_mode): Return correct
CC mode for AND.
* config/aarch64/aarch64.md (*and<mode>3nr_compare0): Fixed to use
CC_NZ.
(*and_<SHIFT:optab><mode>3nr_compare0): Likewise.


-

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 03b1361..2b09669 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3076,7 +3076,7 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
   if ((GET_MODE (x) == SImode || GET_MODE (x) == DImode)
       && y == const0_rtx
       && (code == EQ || code == NE || code == LT || code == GE)
-      && (GET_CODE (x) == PLUS || GET_CODE (x) == MINUS))
+      && (GET_CODE (x) == PLUS || GET_CODE (x) == MINUS
+	  || GET_CODE (x) == AND))
  return CC_NZmode;
 
   /* A compare with a shifted operand.  Because of canonicalization,
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 36267c9..c4c152f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2470,8 +2470,8 @@
 )
 
 (define_insn "*and<mode>3nr_compare0"
-  [(set (reg:CC CC_REGNUM)
-   (compare:CC
+  [(set (reg:CC_NZ CC_REGNUM)
+   (compare:CC_NZ
 (and:GPI (match_operand:GPI 0 "register_operand" "%r,r")
  (match_operand:GPI 1 "aarch64_logical_operand"
"r,<lconst>"))
 (const_int 0)))]
@@ -2481,8 +2481,8 @@
(set_attr "mode" "<MODE>")])
 
 (define_insn "*and_<SHIFT:optab><mode>3nr_compare0"
-  [(set (reg:CC CC_REGNUM)
-   (compare:CC
+  [(set (reg:CC_NZ CC_REGNUM)
+   (compare:CC_NZ
 (and:GPI (SHIFT:GPI
   (match_operand:GPI 0 "register_operand" "r")
   (match_operand:QI 1 "aarch64_shift_imm_<mode>" "n"))





Re: [PATCH, AArch64] AND operation should use CC_NZ mode

2013-02-01 Thread Marcus Shawcroft

On 01/02/13 11:05, Ian Bolton wrote:

The mode for AND should really be CC_NZ, so I fixed that up and in the TST
patterns that (erroneously) expected it to be CC mode.

It has been tested on linux and bare-metal.

OK to commit to trunk (as bug fix)?

Thanks.
Ian



2013-02-01  Ian Bolton  ian.bol...@arm.com

* config/aarch64/aarch64.c (aarch64_select_cc_mode): Return correct
CC mode for AND.
* config/aarch64/aarch64.md (*and<mode>3nr_compare0): Fixed to use
CC_NZ.
(*and_<SHIFT:optab><mode>3nr_compare0): Likewise.



OK and backport to ARM/aarch64-4.7-branch please.

Thanks
/Marcus




[PATCH,x86] Workaround for 55970

2013-02-01 Thread Yuri Rumyantsev
Hi All,

This is a simple fix aimed at helping users port to x86 platforms
applications that rely on a particular order of function argument
evaluation.  To preserve the direct order of argument evaluation they
need to add the option '-mno-push-args' when compiling, which looks
like a reasonable price for code that does not conform to the C/C++
Standard.  I checked that adding this option does not affect
performance on Core i7 and Atom platforms.  Note also that
'-mpush-args' is on by default on all x86 platforms, so changing this
macro will not affect the vast majority of gcc users.

Is it OK for trunk?

2013-02-01  Yuri Rumyantsev  ysrum...@gmail.com

* config/i386/i386.h: Change macro PUSH_ARGS_REVERSED.


patch
Description: Binary data


[PATCH] Fix PR56168

2013-02-01 Thread Richard Biener

So - back to PR55848 - this testcase shows that we still handle
builtins vs. non-builtins in a wrong way.  If at compile-time
we chose to use a non-builtin variant we have to preserve that
(-fno-builtin) - easy to do at WPA stage by adjusting symbol
merging.  Now, at LTRANS stage somebody clever decided to
skip symtab merging - but forgot to disable decl fixup.  In this
case decl fixup with un-merged builtin and non-builtin symtab
entry chooses the builtin if it happens to be first in the list
of asm aliases.

The following patch fixes both issues, preserving runtime
behavior of the weird testcase and avoiding the folding to
cbrt as in the original report.

LTO bootstrap and testing running on x86_64-unknown-linux-gnu
(in stage3 already).  Honza, does this look sane?

Thanks,
Richard.

2013-02-01  Richard Guenther  rguent...@suse.de

PR lto/56168
* lto-symtab.c (lto_symtab_merge_decls_1): Make non-builtin
node prevail as last resort.
(lto_symtab_merge_decls): Remove guard on LTRANS here.
(lto_symtab_prevailing_decl): Builtins are their own prevailing
decl.

lto/
* lto.c (read_cgraph_and_symbols): Do not call lto_symtab_merge_decls
or lto_fixup_decls at LTRANS time.

* gcc.dg/lto/pr56168_0.c: New testcase.
* gcc.dg/lto/pr56168_1.c: Likewise.

Index: gcc/lto-symtab.c
===
*** gcc/lto-symtab.c(revision 195641)
--- gcc/lto-symtab.c(working copy)
*** lto_symtab_merge_decls_1 (symtab_node fi
*** 439,450 
 && COMPLETE_TYPE_P (TREE_TYPE (e->symbol.decl)))
  prevailing = e;
}
!   /* For variables prefer the builtin if one is available.  */
else if (TREE_CODE (prevailing->symbol.decl) == FUNCTION_DECL)
{
  for (e = first; e; e = e->symbol.next_sharing_asm_name)
if (TREE_CODE (e->symbol.decl) == FUNCTION_DECL
!    && DECL_BUILT_IN (e->symbol.decl))
  {
prevailing = e;
break;
--- 439,450 
 && COMPLETE_TYPE_P (TREE_TYPE (e->symbol.decl)))
  prevailing = e;
}
!   /* For variables prefer the non-builtin if one is available.  */
else if (TREE_CODE (prevailing->symbol.decl) == FUNCTION_DECL)
{
  for (e = first; e; e = e->symbol.next_sharing_asm_name)
if (TREE_CODE (e->symbol.decl) == FUNCTION_DECL
!    && !DECL_BUILT_IN (e->symbol.decl))
  {
prevailing = e;
break;
*** lto_symtab_merge_decls (void)
*** 507,518 
  {
symtab_node node;
  
-   /* In ltrans mode we read merged cgraph, we do not really need to care
-  about resolving symbols again, we only need to replace duplicated 
declarations
-  read from the callgraph and from function sections.  */
-   if (flag_ltrans)
- return;
- 
/* Populate assembler name hash.   */
symtab_initialize_asm_name_hash ();
  
--- 507,512 
*** lto_symtab_prevailing_decl (tree decl)
*** 598,603 
--- 592,602 
if (TREE_CODE (decl) == FUNCTION_DECL && DECL_ABSTRACT (decl))
  return decl;
  
+   /* Likewise builtins are their own prevailing decl.  This preserves
+  non-builtin vs. builtin uses from compile-time.  */
+   if (TREE_CODE (decl) == FUNCTION_DECL && DECL_BUILT_IN (decl))
+ return decl;
+ 
/* Ensure DECL_ASSEMBLER_NAME will not set assembler name.  */
gcc_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
  
Index: gcc/lto/lto.c
===
*** gcc/lto/lto.c   (revision 195641)
--- gcc/lto/lto.c   (working copy)
*** read_cgraph_and_symbols (unsigned nfiles
*** 3033,3048 
  fprintf (stderr, "Merging declarations\n");
  
timevar_push (TV_IPA_LTO_DECL_MERGE);
!   /* Merge global decls.  */
!   lto_symtab_merge_decls ();
  
!   /* If there were errors during symbol merging bail out, we have no
!  good way to recover here.  */
!   if (seen_error ())
! fatal_error ("errors during merging of translation units");
  
!   /* Fixup all decls.  */
!   lto_fixup_decls (all_file_decl_data);
htab_delete (tree_with_vars);
tree_with_vars = NULL;
ggc_collect ();
--- 3033,3054 
  fprintf (stderr, "Merging declarations\n");
  
timevar_push (TV_IPA_LTO_DECL_MERGE);
!   /* Merge global decls.  In ltrans mode we read merged cgraph, we do not
!  need to care about resolving symbols again, we only need to replace
!  duplicated declarations read from the callgraph and from function
!  sections.  */
!   if (!flag_ltrans)
! {
!   lto_symtab_merge_decls ();
  
!   /* If there were errors during symbol merging bail out, we have no
!good way to recover here.  */
!   if (seen_error ())
!   fatal_error ("errors during merging of translation units");
  
!   /* Fixup all decls.  */

Re: [Patch, AArch64, AArch64-4.7] Backport Optimize cmp in some cases patch

2013-02-01 Thread Marcus Shawcroft

On 27/01/13 08:46, Venkataramanan Kumar wrote:

Hi Maintainers,

The attached patch backports the gcc trunk patch
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00143.html to
ARM/aarch64-4.7-branch branch.

ChangeLog.aarch64

2013-01-27 Venkataramanan Kumar  venkataramanan.ku...@linaro.org

 Backport from mainline.
 2013-01-04  Andrew Pinski  apin...@cavium.com

 * config/aarch64/aarch64.c (aarch64_fixed_condition_code_regs):
 New function.
 (TARGET_FIXED_CONDITION_CODE_REGS): Define

Path is attached. Please let me know if I can change -1 to
INVALID_REGNUM and commit.

Built gcc and tested the gcc testsuites for the aarch64-none-elf
target with ARMv8 Foundation model. No new regressions.

Ok to for the ARM/aarch64-4.7-branch ?



This is fine. Thank you. Please commit.

/Marcus



[PATCH,committed] Define ASM_OUTPUT_ALIGNED_LOCAL for AIX

2013-02-01 Thread David Edelsohn
AIX 6.1 added an alignment argument to the .lcomm pseudo-op.  This
fixes many of the remaining Altivec failures on AIX where GCC was
generating a zero vector in BSS, but the block was not appropriately
aligned.

I also took the opportunity to change ASM_OUTPUT_ALIGNED_COMMON use of
exact_log2 to floor_log2.

* config/rs6000/xcoff.h (ASM_OUTPUT_ALIGNED_COMMON): Use floor_log2.
(ASM_OUTPUT_ALIGNED_LOCAL): New.

Bootstrapped on powerpc-ibm-aix7.1.0.0.

Thanks, David

Index: xcoff.h
===
--- xcoff.h (revision 195639)
+++ xcoff.h (working copy)
@@ -283,7 +283,7 @@
RS6000_OUTPUT_BASENAME ((FILE), (NAME));\
if ((ALIGN) > 32)   \
 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%u\n", (SIZE), \
- exact_log2 ((ALIGN) / BITS_PER_UNIT)); \
+ floor_log2 ((ALIGN) / BITS_PER_UNIT)); \
else if ((SIZE) > 4)\
  fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",3\n", (SIZE)); \
else\
@@ -292,12 +292,30 @@

 /* This says how to output an assembler line
to define a local common symbol.
-   Alignment cannot be specified, but we can try to maintain
+   The assembler in AIX 6.1 and later supports an alignment argument.
+   For earlier releases of AIX, we try to maintain
alignment after preceding TOC section if it was aligned
for 64-bit mode.  */

 #define LOCAL_COMMON_ASM_OP "\t.lcomm "

+#if TARGET_AIX_VERSION >= 61
+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)  \
+  do { fputs (LOCAL_COMMON_ASM_OP, (FILE));\
+   RS6000_OUTPUT_BASENAME ((FILE), (NAME));\
+   if ((ALIGN) > 32)   \
+fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s,%u\n",\
+ (SIZE), xcoff_bss_section_name,   \
+ floor_log2 ((ALIGN) / BITS_PER_UNIT));\
+   else if ((SIZE) > 4)\
+fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s,3\n", \
+ (SIZE), xcoff_bss_section_name);  \
+   else\
+fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s\n",   \
+ (SIZE), xcoff_bss_section_name);  \
+ } while (0)
+#endif
+
 #define ASM_OUTPUT_LOCAL(FILE, NAME, SIZE, ROUNDED)\
   do { fputs (LOCAL_COMMON_ASM_OP, (FILE));\
RS6000_OUTPUT_BASENAME ((FILE), (NAME));\


[committed] Backports from trunk to 4.7 branch

2013-02-01 Thread Jakub Jelinek
Hi!

I've committed following backports from trunk to 4.7 branch, after
bootstrapping/regtesting it on x86_64-linux and i686-linux.

Jakub
2013-02-01  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2012-11-13  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/54127
* cfgrtl.c (force_nonfallthru_and_redirect): When redirecting
asm goto labels from BB_HEAD (e-dest) to target bb, decrement
LABEL_NUSES of BB_HEAD (e-dest) and increment LABEL_NUSES of
BB_HEAD (target) appropriately and adjust JUMP_LABEL and/or
REG_LABEL_TARGET and REG_LABEL_OPERAND.

* gcc.dg/torture/pr54127.c: New test.

--- gcc/cfgrtl.c(revision 193469)
+++ gcc/cfgrtl.c(revision 193470)
@@ -1424,14 +1424,46 @@ force_nonfallthru_and_redirect (edge e,
       && (note = extract_asm_operands (PATTERN (BB_END (e->src)))))
 {
   int i, n = ASM_OPERANDS_LABEL_LENGTH (note);
+  bool adjust_jump_target = false;
 
   for (i = 0; i < n; ++i)
{
  if (XEXP (ASM_OPERANDS_LABEL (note, i), 0) == BB_HEAD (e->dest))
-   XEXP (ASM_OPERANDS_LABEL (note, i), 0) = block_label (target);
+   {
+ LABEL_NUSES (XEXP (ASM_OPERANDS_LABEL (note, i), 0))--;
+ XEXP (ASM_OPERANDS_LABEL (note, i), 0) = block_label (target);
+ LABEL_NUSES (XEXP (ASM_OPERANDS_LABEL (note, i), 0))++;
+ adjust_jump_target = true;
+   }
  if (XEXP (ASM_OPERANDS_LABEL (note, i), 0) == BB_HEAD (target))
asm_goto_edge = true;
}
+  if (adjust_jump_target)
+   {
+ rtx insn = BB_END (e-src), note;
+ rtx old_label = BB_HEAD (e->dest);
+ rtx new_label = BB_HEAD (target);
+
+ if (JUMP_LABEL (insn) == old_label)
+   {
+ JUMP_LABEL (insn) = new_label;
+ note = find_reg_note (insn, REG_LABEL_TARGET, new_label);
+ if (note)
+   remove_note (insn, note);
+   }
+ else
+   {
+ note = find_reg_note (insn, REG_LABEL_TARGET, old_label);
+ if (note)
+   remove_note (insn, note);
+ if (JUMP_LABEL (insn) != new_label
+     && !find_reg_note (insn, REG_LABEL_TARGET, new_label))
+   add_reg_note (insn, REG_LABEL_TARGET, new_label);
+   }
+ while ((note = find_reg_note (insn, REG_LABEL_OPERAND, old_label))
+!= NULL_RTX)
+   XEXP (note, 0) = new_label;
+   }
 }
 
   if (EDGE_COUNT (e->src->succs) >= 2 || abnormal_edge_flags || asm_goto_edge)
--- gcc/testsuite/gcc.dg/torture/pr54127.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr54127.c  (revision 193470)
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/54127 */
+/* { dg-do compile } */
+
+extern void foo (void) __attribute__ ((__noreturn__));
+
+void
+bar (int x)
+{
+  if (x < 0)
+foo ();
+  if (x == 0)
+return;
+  __asm goto ("# %l[lab] %l[lab2]" : : : : lab, lab2);
+lab:;
+lab2:;
+}
2013-02-01  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2012-11-17  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/55236
* fold-const.c (make_range_step) <case NEGATE_EXPR>: For -fwrapv
and signed ARG0_TYPE, force low and high to be non-NULL.

* gcc.dg/pr55236.c: New test.

--- gcc/fold-const.c(revision 193590)
+++ gcc/fold-const.c(revision 193591)
@@ -3880,6 +3880,17 @@ make_range_step (location_t loc, enum tr
   return arg0;
 
 case NEGATE_EXPR:
+  /* If flag_wrapv and ARG0_TYPE is signed, make sure
+low and high are non-NULL, then normalize will DTRT.  */
+  if (!TYPE_UNSIGNED (arg0_type)
+      && !TYPE_OVERFLOW_UNDEFINED (arg0_type))
+   {
+ if (low == NULL_TREE)
+   low = TYPE_MIN_VALUE (arg0_type);
+ if (high == NULL_TREE)
+   high = TYPE_MAX_VALUE (arg0_type);
+   }
+
    /* (-x) IN [a,b] -> x in [-b, -a]  */
   n_low = range_binop (MINUS_EXPR, exp_type,
   build_int_cst (exp_type, 0),
--- gcc/testsuite/gcc.dg/pr55236.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr55236.c  (revision 193591)
@@ -0,0 +1,31 @@
+/* PR tree-optimization/55236 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fwrapv" } */
+
+extern void abort ();
+
+__attribute__((noinline, noclone)) void
+foo (int i)
+{
+  if (i > 0)
+abort ();
+  i = -i;
+  if (i < 0)
+return;
+  abort ();
+}
+
+__attribute__((noinline, noclone)) void
+bar (int i)
+{
+  if (i > 0 || (-i) >= 0)
+abort ();
+}
+
+int
+main ()
+{
+  foo (-__INT_MAX__ - 1);
+  bar (-__INT_MAX__ - 1);
+  return 0;
+}
2013-02-01  Jakub Jelinek  ja...@redhat.com

Backported from mainline
2012-11-20  Jakub Jelinek  ja...@redhat.com

PR middle-end/55094
* builtins.c (expand_builtin_trap): Add REG_ARGS_SIZE note
on the trap insn for !ACCUMULATE_OUTGOING_ARGS.
* cfgcleanup.c 

Re: [PATCH,x86] Workaround for 55970

2013-02-01 Thread Ian Lance Taylor
On Fri, Feb 1, 2013 at 5:10 AM, Yuri Rumyantsev ysrum...@gmail.com wrote:

 This is simple fix that is aimed to help users in porting their
 applications to x86 platforms which rely on an order of function
 argument evaluation. To preserve direct order of argument evaluation
 they need to be added additional option '-mno-push-args' to compile
 that looks reasonable price for non-C/C++ Standard conformance.
 I checked that adding this option does not affect on performance on
 Corei7 and Atom platforms. Note also that option -push-args is
 passed by default on all x86 platforms and it means that changing this
 macros will not likely affect on almost all gcc users.

If your goal is to preserve the order in which function arguments are
evaluated, this patch is not going to be reliable.  It only affects
the conversion from GIMPLE to RTL.  The GIMPLE optimizers will have
already had plenty of opportunity to change the function argument
evaluation order.

I don't think we should change the compiler to generate less efficient
code in order to help non-standard-conforming programs when the change
won't work reliably anyhow.

Ian


[RFC,PATCH] __cxa_atexit support for AIX (v2)

2013-02-01 Thread David Edelsohn
Richard Stallman has given permission to include code derived from GNU
C Library in libgcc for AIX using the GCC Runtime Exception license.

The updated patch is appended.  The GNU C Library code (cxa_atexit.c,
cxa_finalize.c, exit.h) is modified, so I am not exactly certain if my
reference to the GNU C Library origin is correct.

This has been bootstrapped on powerpc-ibm-aix7.1.0.0 using --enable-cxa_atexit.

Any comments, especially about the header for the files derived from
GNU C Library?

Thanks, David

libgcc/
* config.host (powerpc-ibm-aix[56789]): Add t-aix-cxa to tmake_file.
Add crtcxa to extra_parts.
* config/rs6000/exit.h: New file.
* config/rs6000/cxa_atexit.c: New file.
* config/rs6000/cxa_finalize.c: New file.
* config/rs6000/crtcxa.c: New file.
* config/rs6000/t-aix-cxa: New file.
* config/rs6000/libgcc-aix-cxa.ver: New file.

gcc/
* config/rs6000/aix61.h (STARTFILE_SPEC): Add crtcxa.

Index: libgcc/config.host
===
--- libgcc/config.host  (revision 195639)
+++ libgcc/config.host  (working copy)
@@ -899,7 +899,8 @@
;;
 rs6000-ibm-aix[56789].* | powerpc-ibm-aix[56789].*)
	md_unwind_header=rs6000/aix-unwind.h
-	tmake_file="t-fdpbit rs6000/t-ppc64-fp rs6000/t-slibgcc-aix rs6000/t-ibm-ldouble"
+	tmake_file="t-fdpbit rs6000/t-ppc64-fp rs6000/t-slibgcc-aix rs6000/t-ibm-ldouble rs6000/t-aix-cxa"
+	extra_parts="crtcxa.o crtcxa_s.o"
	;;
 rl78-*-elf)
	tmake_file="$tm_file t-fdpbit rl78/t-rl78"
Index: libgcc/config/rs6000/exit.h
===
--- libgcc/config/rs6000/exit.h (revision 0)
+++ libgcc/config/rs6000/exit.h (revision 0)
@@ -0,0 +1,92 @@
+/* Copyright (C) 1991-2013 Free Software Foundation, Inc.
+
+Derived from exit.h in GNU C Library.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef _EXIT_H
+#define _EXIT_H 1
+
+#define attribute_hidden
+#define INTDEF(name)
+
+#include <stdbool.h>
+#include <stdint.h>
+
+enum
+{
+  ef_free, /* `ef_free' MUST be zero!  */
+  ef_us,
+  ef_on,
+  ef_at,
+  ef_cxa
+};
+
+struct exit_function
+  {
+/* `flavor' should be of type of the `enum' above but since we need
+   this element in an atomic operation we have to use `long int'.  */
+long int flavor;
+union
+  {
+   void (*at) (void);
+   struct
+ {
+   void (*fn) (int status, void *arg);
+   void *arg;
+ } on;
+   struct
+ {
+   void (*fn) (void *arg, int status);
+   void *arg;
+   void *dso_handle;
+ } cxa;
+  } func;
+  };
+struct exit_function_list
+  {
+struct exit_function_list *next;
+size_t idx;
+struct exit_function fns[32];
+  };
+extern struct exit_function_list *__exit_funcs attribute_hidden;
+extern struct exit_function_list *__quick_exit_funcs attribute_hidden;
+
+extern struct exit_function *__new_exitfn (struct exit_function_list **listp);
+extern uint64_t __new_exitfn_called attribute_hidden;
+
+extern void __run_exit_handlers (int status, struct exit_function_list **listp,
+bool run_list_atexit)
+  attribute_hidden __attribute__ ((__noreturn__));
+
+extern int __internal_atexit (void (*func) (void *), void *arg, void *d,
+ struct exit_function_list **listp)
+  attribute_hidden;
+extern int __cxa_at_quick_exit (void (*func) (void *), void *d);
+
+extern int __cxa_atexit (void (*func) (void *), void *arg, void *d);
+extern int __cxa_atexit_internal (void (*func) (void *), void *arg, void *d)
+ attribute_hidden;
+
+extern void __cxa_finalize (void *d);
+
+#endif /* exit.h  */
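As a standalone illustration of the machinery this header declares (a minimal sketch, not the patch's code: the `mini_` names and the single fixed-size array are invented for clarity; glibc chains additional `exit_function_list` blocks when one fills up), registration appends to `fns[]` and finalization walks the array backwards, so handlers run in reverse order of registration:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FNS 32

/* Simplified analogue of struct exit_function / exit_function_list.  */
struct mini_exit_fn { void (*fn) (void *arg); void *arg; };
struct mini_exit_list { size_t idx; struct mini_exit_fn fns[MAX_FNS]; };

/* Analogue of __cxa_atexit registration: append to the list.  */
static int
mini_register (struct mini_exit_list *l, void (*fn) (void *), void *arg)
{
  if (l->idx >= MAX_FNS)
    return -1;                 /* list full; glibc would chain a new block */
  l->fns[l->idx].fn = fn;
  l->fns[l->idx].arg = arg;
  l->idx++;
  return 0;
}

/* Analogue of __cxa_finalize / __run_exit_handlers: walk backwards,
   so handlers fire LIFO.  */
static void
mini_finalize (struct mini_exit_list *l)
{
  while (l->idx > 0)
    {
      struct mini_exit_fn *f = &l->fns[--l->idx];
      f->fn (f->arg);
    }
}

/* Record the order in which handlers fire, for demonstration.  */
static int seen[2];
static size_t nseen;

static void
record (void *arg)
{
  seen[nseen++] = *(int *) arg;
}
```

Registering two handlers and finalizing shows the LIFO order the real `__run_exit_handlers` also guarantees.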
Index: libgcc/config/rs6000/cxa_atexit.c
===
--- libgcc/config/rs6000/cxa_atexit.c   (revision 0)
+++ libgcc/config/rs6000/cxa_atexit.c   (revision 0)
@@ -0,0 +1,130 @@
+/* Copyright (C) 1999-2013 Free Software Foundation, Inc.
+
+Derived from cxa_atexit.c in GNU C Library.
+
+This file 

Re: [PATCH] If possible, include range of profile hunk before prologue in .debug_loc ranges (PR debug/54793)

2013-02-01 Thread Richard Henderson

On 01/31/2013 02:02 AM, Jakub Jelinek wrote:

2013-01-31  Jakub Jelinek  ja...@redhat.com

PR debug/54793
* final.c (need_profile_function): New variable.
(final_start_function): Drop ATTRIBUTE_UNUSED from first argument.
If first of NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG
is only preceded by NOTE_INSN_VAR_LOCATION or NOTE_INSN_DELETED
notes, targetm.asm_out.function_prologue doesn't emit anything,
HAVE_prologue and profiler should be emitted before prologue,
set need_profile_function instead of emitting it.
(final_scan_insn): If need_profile_function, emit
profile_function on the first NOTE_INSN_BASIC_BLOCK or
NOTE_INSN_FUNCTION_BEG note.



Ok.


r~


Re: [PATCH] Vtable pointer verification, C++ front end changes (patch 1 of 3)

2013-02-01 Thread Jason Merrill

On 01/31/2013 07:24 PM, Caroline Tice wrote:

On Wed, Jan 30, 2013 at 9:26 AM, Jason Merrill ja...@redhat.com wrote:

@@ -17954,6 +17954,10 @@ mark_class_instantiated (tree t, int ext
+  if (flag_vtable_verify)
+vtv_save_class_info (t);


Why do you need this here as well as in finish_struct_1?


If we don't have this in both places, then we miss getting vtable
pointers for instantiated templates.


Why?  instantiated templates also go through finish_struct_1.  And we 
only hit this function for explicit instantiations, not implicit.



+  base_id = DECL_ASSEMBLER_NAME (TREE_CHAIN (base_class));


I think you want TYPE_LINKAGE_IDENTIFIER here.


I don't know the difference between DECL_ASSEMBLER_NAME and
TYPE_LINKAGE_IDENTIFIER.  We are just trying to get the mangled name
for the class.


Ah, I guess you don't want TYPE_LINKAGE_IDENTIFIER, as that's the simple 
name rather than the mangled one.  But for the external name you always 
want to look at TYPE_NAME, not TREE_CHAIN (which corresponds to 
TYPE_STUB_DECL); in the case of an anonymous class that gets a name for 
linkage purposes from a typedef, the latter will have the original 
placeholder name, while the former will have the name used in mangling.



I don't understand what the qualifier business is trying to accomplish,
especially since you never use type_decl_type.  You do this in several
places, but it should never be necessary; classes don't have any qualifiers.


We used to not have the qualifier business, assuming that classes
did not have any type qualifiers.  This turned out not to be a true
assumption.  Occasionally we were getting a case where a class had a
const qualifier attached to it *sometimes*.


Why?  You are getting a qualified variant of the class somehow.  Where 
is it coming from?



Here you're doing two hash table lookups when one would be enough.


As written the insert function doesn't return anything to let you know
whether the item was already there or not, which we need to know (we
use the results here to avoid generating redundant calls to
__VLTRegisterPair.  I suppose we could modify the insert function to
return a boolean indicating if the item was already in the hashtable,
and then we could get by with just one call here...


Yep, that's what I was thinking.


For that matter, you don't need the array, either; you can just use TYPE_UID
for a bitmap key and use htab_traverse to iterate over all elements.


I don't understand how this would work.  I think we need the vec, at
least, to have direct access based on TYPE_UID (which is also the vec
index).


TYPE_UID is already a property of the type, different from the class_uid 
in your patch.


But yes, I guess you do need some way to get from your index back to the 
type, so never mind.



+guess_num_vtable_pointers (struct vtv_graph_node *class_node)


I would think it would be better to pass the unrounded count to the library,
and let the library decide how to adjust that number for allocation.


If there is any computation we can do at compile-time rather than
run-time, we would rather do it at compile time.


I guess that makes sense.


+  var_name = ACONCAT ((_ZN4_VTVI, IDENTIFIER_POINTER (base_id),
+   E12__vtable_mapE, NULL));



$ c++filt _ZN4_VTVISt13bad_exceptionE12__vtable_mapE
_VTV<std::bad_exception>::__vtable_map


Interesting.  Does this _VTV template appear anywhere else?

Even if we stay with this approach to producing the name, I'd like it to 
happen in a (new) function in mangle.c.



+reset_type_qualifiers (unsigned int new_quals, tree type_node)


This function is not safe and should be removed; as mentioned above, it
shouldn't be needed anyway.


As I explained above, we originally didn't have it and then found we
really needed it.   If you know of a safer or better way to accomplish
the same thing we would be happy to hear about it.


TYPE_MAIN_VARIANT will give you an unqualified variant of any qualified 
type.


Jason



[PATCH][ARM][2/2] Load-acquire, store-release atomics in AArch32 ARMv8

2013-02-01 Thread Kyrylo Tkachov
Hi all,
This patch adds the tests for the ARMv8 AArch32 implementation of atomics.
It refactors some aarch64 tests and reuses them.

Ok for trunk or for the next stage 1 (together with part 1 at
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01441.html)?

Thanks,
Kyrill


gcc/testsuite/ChangeLog

2013-01-25  Kyrylo Tkachov  kyrylo.tkachov at arm.com

* gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Move test
body
from here...
* gcc.target/aarch64/atomic-comp-swap-release-acquire.x: ... to
here.
* gcc.target/aarch64/atomic-op-acq_rel.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-acq_rel.x: ... to here.
* gcc.target/aarch64/atomic-op-acquire.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-acquire.x: ... to here.
* gcc.target/aarch64/atomic-op-char.c: Move test body from here...
* gcc.target/aarch64/atomic-op-char.x: ... to here.
* gcc.target/aarch64/atomic-op-consume.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-consume.x: ... to here.
* gcc.target/aarch64/atomic-op-int.c: Move test body from here...
* gcc.target/aarch64/atomic-op-int.x: ... to here.
* gcc.target/aarch64/atomic-op-relaxed.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-relaxed.x: ... to here.
* gcc.target/aarch64/atomic-op-release.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-release.x: ... to here.
* gcc.target/aarch64/atomic-op-seq_cst.c: Move test body from
here...
* gcc.target/aarch64/atomic-op-seq_cst.x: ... to here.
* gcc.target/aarch64/atomic-op-short.c: Move test body from here...
* gcc.target/aarch64/atomic-op-short.x: ... to here.
* gcc.target/arm/atomic-comp-swap-release-acquire.c: New test.
* gcc.target/arm/atomic-op-acq_rel.c: Likewise.
* gcc.target/arm/atomic-op-acquire.c: Likewise.
* gcc.target/arm/atomic-op-char.c: Likewise.
* gcc.target/arm/atomic-op-consume.c: Likewise.
* gcc.target/arm/atomic-op-int.c: Likewise.
* gcc.target/arm/atomic-op-relaxed.c: Likewise.
* gcc.target/arm/atomic-op-release.c: Likewise.
* gcc.target/arm/atomic-op-seq_cst.c: Likewise.
* gcc.target/arm/atomic-op-short.c: Likewise.
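For readers unfamiliar with the builtins these refactored tests exercise, here is a hedged sketch of the compare-and-swap pattern from the shared test bodies (the `try_swap` wrapper is invented for illustration): a strong `__atomic_compare_exchange_n` with RELEASE ordering on success and ACQUIRE ordering on failure, exactly the combination the `ldaxr`/`stlxr` scans check for.

```c
#include <assert.h>

static int v = 0;

/* Strong CAS with distinct success/failure memory orders.
   On success: store `desired' into v with release semantics.
   On failure: reload the current value with acquire semantics
   into `expected' (discarded here, since it is a local copy).  */
static int
try_swap (int expected, int desired)
{
  return __atomic_compare_exchange_n (&v, &expected, desired,
                                      0 /* strong */, __ATOMIC_RELEASE,
                                      __ATOMIC_ACQUIRE);
}
```

On AArch64 this is where the load-acquire/store-release exclusive pair (`ldaxr`/`stlxr`) comes from; the ARMv8 AArch32 patch generates the analogous `ldaex`/`stlex` forms.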





RE: [PATCH][ARM][2/2] Load-acquire, store-release atomics in AArch32 ARMv8

2013-02-01 Thread Kyrylo Tkachov
Ummm... forgot the patch, sorry!

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Kyrylo Tkachov
 Sent: 01 February 2013 17:37
 To: gcc-patches@gcc.gnu.org
 Cc: Ramana Radhakrishnan; Richard Earnshaw; Marcus Shawcroft
 Subject: [PATCH][ARM][2/2] Load-acquire, store-release atomics in
 AArch32 ARMv8
 
 Hi all,
 This patch adds the tests for the ARMv8 AArch32 implementation of
 atomics.
 It refactors some aarch64 tests and reuses them.
 
 Ok for trunk or for the next stage 1 (together with part 1 at
 http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01441.html)?
 
 Thanks,
 Kyrill
 
 
 gcc/testsuite/ChangeLog
 
 2013-01-25  Kyrylo Tkachov  kyrylo.tkachov at arm.com
 
   * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Move
 test
 body
   from here...
   * gcc.target/aarch64/atomic-comp-swap-release-acquire.x: ... to
 here.
   * gcc.target/aarch64/atomic-op-acq_rel.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-acq_rel.x: ... to here.
   * gcc.target/aarch64/atomic-op-acquire.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-acquire.x: ... to here.
   * gcc.target/aarch64/atomic-op-char.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-char.x: ... to here.
   * gcc.target/aarch64/atomic-op-consume.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-consume.x: ... to here.
   * gcc.target/aarch64/atomic-op-int.c: Move test body from here...
   * gcc.target/aarch64/atomic-op-int.x: ... to here.
   * gcc.target/aarch64/atomic-op-relaxed.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-relaxed.x: ... to here.
   * gcc.target/aarch64/atomic-op-release.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-release.x: ... to here.
   * gcc.target/aarch64/atomic-op-seq_cst.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-seq_cst.x: ... to here.
   * gcc.target/aarch64/atomic-op-short.c: Move test body from
 here...
   * gcc.target/aarch64/atomic-op-short.x: ... to here.
   * gcc.target/arm/atomic-comp-swap-release-acquire.c: New test.
   * gcc.target/arm/atomic-op-acq_rel.c: Likewise.
   * gcc.target/arm/atomic-op-acquire.c: Likewise.
   * gcc.target/arm/atomic-op-char.c: Likewise.
   * gcc.target/arm/atomic-op-consume.c: Likewise.
   * gcc.target/arm/atomic-op-int.c: Likewise.
   * gcc.target/arm/atomic-op-relaxed.c: Likewise.
   * gcc.target/arm/atomic-op-release.c: Likewise.
   * gcc.target/arm/atomic-op-seq_cst.c: Likewise.
   * gcc.target/arm/atomic-op-short.c: Likewise.
 
 
 
diff --git 
a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c 
b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
index 1492e25..9785bca 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
@@ -1,41 +1,7 @@
 /* { dg-do compile } */
/* { dg-options "-O2" } */
 
-#define STRONG 0
-#define WEAK 1
-int v = 0;
-
-int
-atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b)
-{
-  return __atomic_compare_exchange (&v, &a, &b,
-   STRONG, __ATOMIC_RELEASE,
-   __ATOMIC_ACQUIRE);
-}
-
-int
-atomic_compare_exchange_WEAK_RELEASE_ACQUIRE (int a, int b)
-{
-  return __atomic_compare_exchange (&v, &a, &b,
-   WEAK, __ATOMIC_RELEASE,
-   __ATOMIC_ACQUIRE);
-}
-
-int
-atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b)
-{
-  return __atomic_compare_exchange_n (&v, &a, b,
- STRONG, __ATOMIC_RELEASE,
- __ATOMIC_ACQUIRE);
-}
-
-int
-atomic_compare_exchange_n_WEAK_RELEASE_ACQUIRE (int a, int b)
-{
-  return __atomic_compare_exchange_n (&v, &a, b,
- WEAK, __ATOMIC_RELEASE,
- __ATOMIC_ACQUIRE);
-}
+#include "atomic-comp-swap-release-acquire.x"
 
/* { dg-final { scan-assembler-times "ldaxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 4 } } */
/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 4 } } */
diff --git 
a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x 
b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x
new file mode 100644
index 000..4403afd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x
@@ -0,0 +1,36 @@
+
+#define STRONG 0
+#define WEAK 1
+int v = 0;
+
+int
+atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b)
+{
+  return __atomic_compare_exchange (&v, &a, &b,
+   STRONG, __ATOMIC_RELEASE,
+   __ATOMIC_ACQUIRE);
+}
+
+int

[PATCH 1/6] [AArch64-4.7] Fix warning - Initialise generic_tunings.

2013-02-01 Thread James Greenhalgh

Hi,

This patch moves the various tuning parameter data structures
further up config/aarch64/aarch64.c and then uses them to
initialise the generic_tunings variable. This mirrors their
position on trunk.

This fixes the warning:

config/aarch64/aarch64.c:129:33: warning: uninitialised const ‘generic_tunings’ 
is invalid in C++ [-Wc++-compat]

Regression tested on aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c (generic_tunings): Initialise.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 40f438d..59124eb 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -125,8 +125,70 @@ unsigned long aarch64_isa_flags = 0;
 /* Mask to specify which instruction scheduling options should be used.  */
 unsigned long aarch64_tune_flags = 0;
 
-/* Tuning models.  */
-static const struct tune_params generic_tunings;
+/* Tuning parameters.  */
+
+#if HAVE_DESIGNATED_INITIALIZERS
+#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
+#else
+#define NAMED_PARAM(NAME, VAL) (VAL)
+#endif
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+static const struct cpu_rtx_cost_table generic_rtx_cost_table =
+{
+  NAMED_PARAM (memory_load, COSTS_N_INSNS (1)),
+  NAMED_PARAM (memory_store, COSTS_N_INSNS (0)),
+  NAMED_PARAM (register_shift, COSTS_N_INSNS (1)),
+  NAMED_PARAM (int_divide, COSTS_N_INSNS (6)),
+  NAMED_PARAM (float_divide, COSTS_N_INSNS (2)),
+  NAMED_PARAM (double_divide, COSTS_N_INSNS (6)),
+  NAMED_PARAM (int_multiply, COSTS_N_INSNS (1)),
+  NAMED_PARAM (int_multiply_extend, COSTS_N_INSNS (1)),
+  NAMED_PARAM (int_multiply_add, COSTS_N_INSNS (1)),
+  NAMED_PARAM (int_multiply_extend_add, COSTS_N_INSNS (1)),
+  NAMED_PARAM (float_multiply, COSTS_N_INSNS (0)),
+  NAMED_PARAM (double_multiply, COSTS_N_INSNS (1))
+};
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+static const struct cpu_addrcost_table generic_addrcost_table =
+{
+  NAMED_PARAM (pre_modify, 0),
+  NAMED_PARAM (post_modify, 0),
+  NAMED_PARAM (register_offset, 0),
+  NAMED_PARAM (register_extend, 0),
+  NAMED_PARAM (imm_offset, 0)
+};
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+static const struct cpu_regmove_cost generic_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  NAMED_PARAM (GP2FP, 2),
+  NAMED_PARAM (FP2GP, 2),
+  /* We currently do not provide direct support for TFmode Q->Q move.
+ Therefore we need to raise the cost above 2 in order to have
+ reload handle the situation.  */
+  NAMED_PARAM (FP2FP, 4)
+};
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+
+static const struct tune_params generic_tunings =
+{
+  generic_rtx_cost_table,
+  generic_addrcost_table,
+  generic_regmove_cost,
+  NAMED_PARAM (memmov_cost, 4)
+};
 
 /* A processor implementing AArch64.  */
 struct processor
@@ -4504,71 +4566,6 @@ aarch64_memory_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
 
 static void initialize_aarch64_code_model (void);
 
-/* Tuning parameters.  */
-
-#if HAVE_DESIGNATED_INITIALIZERS
-#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
-#else
-#define NAMED_PARAM(NAME, VAL) (VAL)
-#endif
-
-#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
-__extension__
-#endif
-static const struct cpu_rtx_cost_table generic_rtx_cost_table =
-{
-  NAMED_PARAM (memory_load, COSTS_N_INSNS (1)),
-  NAMED_PARAM (memory_store, COSTS_N_INSNS (0)),
-  NAMED_PARAM (register_shift, COSTS_N_INSNS (1)),
-  NAMED_PARAM (int_divide, COSTS_N_INSNS (6)),
-  NAMED_PARAM (float_divide, COSTS_N_INSNS (2)),
-  NAMED_PARAM (double_divide, COSTS_N_INSNS (6)),
-  NAMED_PARAM (int_multiply, COSTS_N_INSNS (1)),
-  NAMED_PARAM (int_multiply_extend, COSTS_N_INSNS (1)),
-  NAMED_PARAM (int_multiply_add, COSTS_N_INSNS (1)),
-  NAMED_PARAM (int_multiply_extend_add, COSTS_N_INSNS (1)),
-  NAMED_PARAM (float_multiply, COSTS_N_INSNS (0)),
-  NAMED_PARAM (double_multiply, COSTS_N_INSNS (1))
-};
-
-#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
-__extension__
-#endif
-static const struct cpu_addrcost_table generic_addrcost_table =
-{
-  NAMED_PARAM (pre_modify, 0),
-  NAMED_PARAM (post_modify, 0),
-  NAMED_PARAM (register_offset, 0),
-  NAMED_PARAM (register_extend, 0),
-  NAMED_PARAM (imm_offset, 0)
-};
-
-#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
-__extension__
-#endif
-static const struct cpu_regmove_cost generic_regmove_cost =
-{
-  NAMED_PARAM (GP2GP, 1),
-  NAMED_PARAM (GP2FP, 2),
-  NAMED_PARAM (FP2GP, 2),
-  /* We currently do not provide direct support for TFmode Q->Q move.
- Therefore we need to raise the cost above 2 in order to have
- reload handle the situation.  */
-  NAMED_PARAM (FP2FP, 4)
-};
-
-#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
-__extension__
-#endif
-static const struct tune_params generic_tunings =
-{
-  
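As a standalone illustration of the NAMED_PARAM idiom the patch above relies on (a minimal sketch, not GCC code; the `cost_table` struct and its values are invented): when the compiler supports C99 designated initializers the fields are initialized by name, otherwise the macro degrades to positional initialization, in which case the initializer order must match the struct declaration exactly.

```c
#include <assert.h>

/* Stand-in for GCC's configure-detected macro; set to 0 to exercise
   the positional fallback.  */
#define HAVE_DESIGNATED_INITIALIZERS 1

#if HAVE_DESIGNATED_INITIALIZERS
#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
#else
#define NAMED_PARAM(NAME, VAL) (VAL)
#endif

struct cost_table
{
  int load;
  int store;
};

/* With designated initializers this reads .load = 1, .store = 2;
   without them it is simply { 1, 2 }, relying on field order.  */
static const struct cost_table costs =
{
  NAMED_PARAM (load, 1),
  NAMED_PARAM (store, 2)
};
```

This is why the patch keeps the fields of each `NAMED_PARAM` initializer in declaration order: the fallback path silently depends on it.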

[Patch 0/6][AArch64-4.7] Fix warnings.

2013-02-01 Thread James Greenhalgh
Hi,

This patch series fixes a number of warnings in the AArch64 port on
the aarch64-4.7-branch.

The warnings fixed are:

---
[AArch64-4.7] Fix warning - Initialise generic_tunings.

config/aarch64/aarch64.c:129:33: warning: uninitialised const 
‘generic_tunings’ is invalid in C++ [-Wc++-compat]
---
[AArch64-4.7] Fix warning - aarch64_add_constant mixed code and declarations.

config/aarch64/aarch64.c: In function ‘aarch64_add_constant’:
config/aarch64/aarch64.c:2249:4: warning: ISO C90 forbids mixed 
declarations and code [-pedantic]
---
[AArch64-4.7] Fix warning - aarch64_legitimize_reload_address passes
  the wrong type to push_reload.

config/aarch64/aarch64.c: In function ‘aarch64_legitimize_reload_address’:
config/aarch64/aarch64.c:3641:6: warning: enum conversion when passing 
argument 11 of ‘push_reload’ is invalid in C++ [-Wc++-compat]
---
[AArch64-4.7] Fix warning - aarch64_trampoline_init passes the wrong
  type to emit_library_call.

config/aarch64/aarch64.c: In function ‘aarch64_trampoline_init’:
config/aarch64/aarch64.c:3893:8: warning: enum conversion when passing 
argument 2 of ‘emit_library_call’ is invalid in C++ [-Wc++-compat]
---
[AArch64-4.7] Fix warning - Mixed code and declarations in
  aarch64_simd_const_bounds.

config/aarch64/aarch64.c: In function ‘aarch64_simd_const_bounds’:
config/aarch64/aarch64.c:6412:3: warning: ISO C90 forbids mixed 
declarations and code [-pedantic]
---
[AArch64-4.7] Backport: Fix warning in aarch64.md

config/aarch64/aarch64.md:840: warning: source missing a mode?
---

The patch series as a whole has been regression tested against
aarch64-none-elf with no regressions.

Are these patches OK to commit to aarch64-4.7-branch?

Thanks,
James Greenhalgh

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c (generic_tunings): Initialise.

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_add_constant): Move declaration of 'shift' above code.

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_legitimize_reload_address): Cast 'type' before
passing to push_reload.

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_trampoline_init): Pass 'LCT_NORMAL' rather than '0'
to emit_library_call.

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_simd_const_bounds): Move declaration of 'lane' above code.

2013-02-01  James Greenhalgh  james.greenha...@arm.com

Backport from mainline.
2012-12-18  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.md (insv_imm<mode>): Add modes
for source operands.

[PATCH 2/6] [AArch64-4.7] Fix warning - aarch64_add_constant mixed code and declarations.

2013-02-01 Thread James Greenhalgh

Hi,

In config/aarch64/aarch64.c::aarch64_add_constant `shift' was
declared after we started writing code. C90 doesn't like this,
so split the declaration and the assignment.

This fixes the warning:

config/aarch64/aarch64.c: In function ‘aarch64_add_constant’:
config/aarch64/aarch64.c:2249:4: warning: ISO C90 forbids mixed declarations 
and code [-pedantic]

Regression tested on aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_add_constant): Move declaration of 'shift' above code.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 59124eb..62d0a12 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2307,8 +2307,9 @@ aarch64_add_constant (int regnum, int scratchreg, HOST_WIDE_INT delta)
 {
  if (mdelta >= 4096)
 	{
+	  rtx shift;
 	  emit_insn (gen_rtx_SET (Pmode, scratch_rtx, GEN_INT (mdelta / 4096)));
-	  rtx shift = gen_rtx_ASHIFT (Pmode, scratch_rtx, GEN_INT (12));
+	  shift = gen_rtx_ASHIFT (Pmode, scratch_rtx, GEN_INT (12));
	  if (delta < 0)
 	emit_insn (gen_rtx_SET (Pmode, this_rtx,
 gen_rtx_MINUS (Pmode, this_rtx, shift)));

[PATCH 3/6] [AArch64-4.7] Fix warning - aarch64_legitimize_reload_address passes the wrong type to push_reload.

2013-02-01 Thread James Greenhalgh

Hi,

push_reload takes an `enum reload_type' as its final argument.

On trunk we just cast the int we have to the correct type,
so we do that here to mirror trunk and correct the warning.
We can't fix this by changing the type of the argument we take
as we would then need to forward declare the enum when giving
the prototype, which is illegal.

This fixes the warning:

config/aarch64/aarch64.c: In function ‘aarch64_legitimize_reload_address’:
config/aarch64/aarch64.c:3641:6: warning: enum conversion when passing argument 
11 of ‘push_reload’ is invalid in C++ [-Wc++-compat]

Regression tested aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_legitimize_reload_address): Cast 'type' before
passing to push_reload.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 62d0a12..fef2983 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3701,7 +3701,7 @@ aarch64_legitimize_reload_address (rtx *x_p,
   x = copy_rtx (x);
   push_reload (orig_rtx, NULL_RTX, x_p, NULL,
 		   BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
-		   opnum, type);
+		   opnum, (enum reload_type) type);
   return x;
 }
 
@@ -3714,7 +3714,7 @@ aarch64_legitimize_reload_address (rtx *x_p,
 {
   push_reload (XEXP (x, 0), NULL_RTX, XEXP (x, 0), NULL,
 		   BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
-		   opnum, type);
+		   opnum, (enum reload_type) type);
   return x;
 }
 
@@ -3778,7 +3778,7 @@ aarch64_legitimize_reload_address (rtx *x_p,
 
   push_reload (XEXP (x, 0), NULL_RTX, XEXP (x, 0), NULL,
 		   BASE_REG_CLASS, Pmode, VOIDmode, 0, 0,
-		   opnum, type);
+		   opnum, (enum reload_type) type);
   return x;
 }
 

Re: [ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem

2013-02-01 Thread Ramana Radhakrishnan





Here is a new version of my patch, with the cleanup you requested.

2012-12-18  Christophe Lyon  christophe.l...@linaro.org

 gcc/
 * config/arm/arm-protos.h (tune_params): Add
 prefer_neon_for_64bits field.
 * config/arm/arm.c (prefer_neon_for_64bits): New variable.
 (arm_slowmul_tune): Default prefer_neon_for_64bits to false.
 (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto.
 (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto.
 (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto.
 (arm_cortex_a9_tune, arm_fa726te_tune): Ditto.
 (arm_option_override): Handle -mneon-for-64bits new option.
 * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro.
 (prefer_neon_for_64bits): Declare new variable.
 * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to
 avoid_neon_for_64bits and neon_for_64bits. Remove onlya8 and
 nota8.
 (arch_enabled): Handle new arch types. Remove support for onlya8
 and nota8.
 (one_cmpldi2): Use new arch names.
 * config/arm/arm.opt (mneon-for-64bits): Add option.
 * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon)
 (anddi3_neon, xordi3_neon, ashldi3_neon, shiftdi3_neon): Use
 neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead
 of onlya8.
 * doc/invoke.texi (-mneon-for-64bits): Document.

 gcc/testsuite/
 * gcc.target/arm/neon-for-64bits-1.c: New tests.
 * gcc.target/arm/neon-for-64bits-2.c: Likewise.





Ok for 4.9 stage1 now.

regards
Ramana



[PATCH 5/6] [AArch64-4.7] Fix warning - Mixed code and declarations in aarch64_simd_const_bounds.

2013-02-01 Thread James Greenhalgh

Hi,

aarch64_simd_const_bounds declares `lane' after an assert. This
patch moves the declaration above the assert.

This patch fixes the warning:

config/aarch64/aarch64.c: In function ‘aarch64_simd_const_bounds’:
config/aarch64/aarch64.c:6412:3: warning: ISO C90 forbids mixed declarations 
and code [-pedantic]

Regression tested on aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_simd_const_bounds): Move declaration of 'lane' above code.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 434ccd7..a3c482b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6406,8 +6406,9 @@ aarch64_simd_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
 void
 aarch64_simd_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
 {
+  HOST_WIDE_INT lane;
   gcc_assert (GET_CODE (operand) == CONST_INT);
-  HOST_WIDE_INT lane = INTVAL (operand);
+  lane = INTVAL (operand);
 
  if (lane < low || lane >= high)
    error ("constant out of range");

[PATCH 4/6] [AArch64-4.7] Fix warning - aarch64_trampoline_init passes the wrong type to emit_library_call.

2013-02-01 Thread James Greenhalgh

Hi,

emit_library_call takes an `enum library_type` as its second argument.
Currently aarch64-4.7-branch passes it an int 0.

This patch fixes this, mirroring trunk, by passing LCT_NORMAL instead.

This patch fixes the warning:

config/aarch64/aarch64.c: In function ‘aarch64_trampoline_init’:
config/aarch64/aarch64.c:3893:8: warning: enum conversion when passing argument 
2 of ‘emit_library_call’ is invalid in C++ [-Wc++-compat]

Regression tested on aarch64-none-elf with no regressions.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.c
(aarch64_trampoline_init): Pass 'LCT_NORMAL' rather than '0'
to emit_library_call.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fef2983..434ccd7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3952,7 +3952,7 @@ aarch64_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
  gen_clear_cache().  */
   a_tramp = XEXP (m_tramp, 0);
  emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
-		 0, VOIDmode, 2, a_tramp, Pmode,
+		 LCT_NORMAL, VOIDmode, 2, a_tramp, Pmode,
 		 plus_constant (a_tramp, TRAMPOLINE_SIZE), Pmode);
 }
 

[PATCH 6/6] [AArch64-4.7] Backport: Fix warning in aarch64.md

2013-02-01 Thread James Greenhalgh

Hi,

This patch is a backport of one approved here:
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01135.html

The patch fixes the warning:

config/aarch64/aarch64.md:840: warning: source missing a mode?

Regression tested with no regressions on aarch64-none-elf.

OK for aarch64-4.7-branch?

Thanks,
James

---
gcc/

2013-02-01  James Greenhalgh  james.greenha...@arm.com

Backport from mainline.
2012-12-18  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.md (insv_imm<mode>): Add modes
for source operands.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6f51469..9bb95e0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -840,8 +840,8 @@
 (define_insn "insv_imm<mode>"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
 			  (const_int 16)
-			  (match_operand 1 "const_int_operand" "n"))
-	(match_operand 2 "const_int_operand" "n"))]
+			  (match_operand:GPI 1 "const_int_operand" "n"))
+	(match_operand:GPI 2 "const_int_operand" "n"))]
   "INTVAL (operands[1]) < GET_MODE_BITSIZE (<MODE>mode)
    && INTVAL (operands[1]) % 16 == 0
    && INTVAL (operands[2]) <= 0xffff"

Re: Fix for PR55561 race condition in libgomp

2013-02-01 Thread Dmitry Vyukov
LGTM

On Thu, Jan 31, 2013 at 8:54 PM, VandeVondele  Joost
joost.vandevond...@mat.ethz.ch wrote:
 The updated changelog entry is below, but somebody with write access should 
 do the commit, please.

 2013-01-31  Dmitry Vyukov  dvyu...@gcc.gnu.org
   Joost VandeVondele joost.vandevond...@mat.ethz.ch

 PR libgomp/55561
 * config/linux/wait.h (do_spin): Use atomic load for addr.
 * config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic
 for intptr and ptrlock.
 * config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load
 for ptrlock.


[PATCH, RFC] GCC 4.9, powerpc, allow TImode in VSX registers

2013-02-01 Thread Michael Meissner
When I did the initial power7 port, I punted on allowing TImode in the VSX
registers because I couldn't get it to work.  I am now revisiting it, and these
patches are my current effort, and I was wondering if people had comments on
them.  In terms of performance, there are two benchmarks in the Spec 2006 suite
that have minor regressions (perlbench and gamess), and 3 that have minor
improvements (hmmer, h264ref, and gromacs), so overall it looks like a wash.  I
do want to look the regressions, and see if there is something simple to
tweak.

Some issues I ran into include:

I needed to set CANNOT_CHANGE_MODE so that TImode won't overlap with smaller
data types, due to the scalar portion of the register being in the upper
64-bits of the VSX register.

I limited the available address formats for TImode to be REG+REG needed for VSX
instructions.

I discovered that setjmp/longjmp and exception handling needed to create TImode
values with the STACK_SAVEAREA_MODE macro.  However, the implementation of this
needs REG+OFFSET addressing.  So, I added a new type PTImode, which is only
used for STACK_SAVEAREA_MODE, and PTImode is limited to the GPRs.

If I enable logical operations in TImode mode (and, xor, etc.), the compiler
will convert DImode logical operations to TImode for 32-bit programs.  In the
future, I think I will tune this and/or provide insn splitters for DImode
logical operations.  For now, I just disallow logical operations on TImode if
32-bit.

I added a debug switch (-mvsx-timode) to disable putting TImode into VSX
registers.
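
The CANNOT_CHANGE_MODE point above can be illustrated with a plain C++ analogy (an assumption-laden sketch, not rs6000 back-end code): in a 16-byte VSX register the 64-bit scalar lives in the *upper* half, so a subreg that assumes a smaller value overlaps the low 64 bits of a TImode value would read the wrong half.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model of a 16-byte VSX register.
struct vsx_reg { uint8_t bytes[16]; };

// Scalars occupy the upper 64 bits of the register.
static vsx_reg store_scalar (uint64_t x)
{ vsx_reg r{}; std::memcpy (r.bytes + 8, &x, 8); return r; }

// What a naive subreg would read: the low 64 bits.
static uint64_t low_half (const vsx_reg &r)
{ uint64_t v; std::memcpy (&v, r.bytes, 8); return v; }

// Where the scalar actually is: the high 64 bits.
static uint64_t high_half (const vsx_reg &r)
{ uint64_t v; std::memcpy (&v, r.bytes + 8, 8); return v; }
```

Because `low_half` and `high_half` disagree for a stored scalar, mode changes between TImode and smaller modes in the same register have to be rejected.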

2013-01-31  Michael Meissner  meiss...@linux.vnet.ibm.com

* config/rs6000/vector.md (mul<mode>3): Use the combined macro
VECTOR_UNIT_ALTIVEC_OR_VSX_P instead of separate calls to
VECTOR_UNIT_ALTIVEC_P and VECTOR_UNIT_VSX_P.
(vcond<mode><mode>): Likewise.
(vcondu<mode><mode>): Likewise.
(vector_gtu<mode>): Likewise.
(vector_gte<mode>): Likewise.
(xor<mode>3): Don't allow logical operations on TImode in 32-bit
to prevent the compiler from converting DImode operations to
TImode.
(ior<mode>3): Likewise.
(and<mode>3): Likewise.
(one_cmpl<mode>2): Likewise.
(nor<mode>3): Likewise.
(andc<mode>3): Likewise.

* config/rs6000/constraints.md (wt constraint): New constraint
that returns VSX_REGS if TImode is allowed in VSX registers.

* config/rs6000/predicates.md (easy_fp_constant): 0.0f is an easy
constant under VSX.

* config/rs6000/rs6000-modes.def (PTImode): Define, PTImode is
similar to TImode, but it is restricted to being in the GPRs.

* config/rs6000/rs6000.opt (-mvsx-timode): New switch to allow
TImode to occupy a single VSX register.

* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Default to
-mvsx-timode for power7/power8.
(power7 cpu): Likewise.
(power8 cpu): Likewise.

* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Make
sure that TFmode/TDmode take up two registers if they are ever
allowed in the upper VSX registers.
(rs6000_hard_regno_mode_ok): If -mvsx-timode, allow TImode in VSX
registers.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_debug_reg_global): Add debugging for PTImode and wt
constraint.  Print if LRA is turned on.
(rs6000_option_override_internal): Give an error if -mvsx-timode
and VSX is not enabled.
(invalid_e500_subreg): Handle PTImode, restricting it to GPRs.  If
-mvsx-timode, restrict TImode to reg+reg addressing, and PTImode
to reg+offset addressing.  Use PTImode when checking offset
addresses for validity.
(reg_offset_addressing_ok_p): Likewise.
(rs6000_legitimate_offset_address_p): Likewise.
(rs6000_legitimize_address): Likewise.
(rs6000_legitimize_reload_address): Likewise.
(rs6000_legitimate_address_p): Likewise.
(rs6000_eliminate_indexed_memrefs): Likewise.
(rs6000_emit_move): Likewise.
(rs6000_secondary_reload): Likewise.
(rs6000_secondary_reload_inner): Handle PTImode.  Allow 64-bit
reloads to fpr registers to continue to use reg+offset addressing,
but 64-bit reloads to altivec registers need reg+reg addressing.
Drop test for PRE_MODIFY, since VSX loads/stores no longer support
it.  Treat LO_SUM like a PLUS operation.
(rs6000_secondary_reload_class): If type is 64-bit, prefer to use
FLOAT_REGS instead of VSX_REGS to allow use of reg+offset
addressing.
(rs6000_cannot_change_mode_class): Do not allow TImode in VSX
registers to share a register with a smaller sized type, since VSX
puts scalars in the upper 64-bits.
(print_operand): Add support for PTImode.
(rs6000_register_move_cost): Use VECTOR_MEM_VSX_P instead of
VECTOR_UNIT_VSX_P to catch types that 

Fwd: Re: Export _Prime_rehash_policy symbols

2013-02-01 Thread François Dumont

Test successful so attached patch applied.

2013-02-01  François Dumont fdum...@gcc.gnu.org

* include/bits/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt)
(_Prime_rehash_policy::_M_need_rehash): Move definition...
* src/c++11/hashtable_c++0x.cc: ... here.
* src/shared/hashtable-aux.cc: Remove c++config.h include.
* config/abi/gnu.ver (GLIBCXX_3.4.18): Export _Prime_rehash_policy
symbols.

François


On 01/30/2013 11:12 AM, Paolo Carlini wrote:
... before committing, please double check that we aren't breaking
--enable-symvers=gnu-versioned-namespace; it wouldn't be the first time
that we do that and notice only much later.  At minimum, build with it
and run the testsuite.


Paolo.





Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h	(revision 195557)
+++ include/bits/hashtable_policy.h	(working copy)
@@ -369,7 +369,8 @@
 
 // Return a bucket count appropriate for n elements
 std::size_t
-_M_bkt_for_elements(std::size_t __n) const;
+_M_bkt_for_elements(std::size_t __n) const
+{ return __builtin_ceil(__n / (long double)_M_max_load_factor); }
 
 // __n_bkt is current bucket count, __n_elt is current element count,
 // and __n_ins is number of elements to be inserted.  Do we need to
@@ -397,77 +398,6 @@
 mutable std::size_t  _M_next_resize;
   };
 
-  extern const unsigned long __prime_list[];
-
-  // XXX This is a hack.  There's no good reason for any of
-  // _Prime_rehash_policy's member functions to be inline.
-
-  // Return a prime no smaller than n.
-  inline std::size_t
-  _Prime_rehash_policy::
-  _M_next_bkt(std::size_t __n) const
-  {
-// Optimize lookups involving the first elements of __prime_list.
-// (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
-
-    if (__n <= 11)
-  {
-	_M_next_resize
-	  = __builtin_ceil(__fast_bkt[__n]
-			   * (long double)_M_max_load_factor);
-	return __fast_bkt[__n];
-  }
-
-const unsigned long* __next_bkt
-  = std::lower_bound(__prime_list + 5, __prime_list + _S_n_primes,
-			 __n);
-_M_next_resize
-  = __builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
-return *__next_bkt;
-  }
-
-  // Return the smallest integer p such that alpha p >= n, where alpha
-  // is the load factor.
-  inline std::size_t
-  _Prime_rehash_policy::
-  _M_bkt_for_elements(std::size_t __n) const
-  { return __builtin_ceil(__n / (long double)_M_max_load_factor); }
-
-  // Finds the smallest prime p such that alpha p > __n_elt + __n_ins.
-  // If p > __n_bkt, return make_pair(true, p); otherwise return
-  // make_pair(false, 0).  In principle this isn't very different from
-  // _M_bkt_for_elements.
-
-  // The only tricky part is that we're caching the element count at
-  // which we need to rehash, so we don't have to do a floating-point
-  // multiply for every insertion.
-
-  inline std::pair<bool, std::size_t>
-  _Prime_rehash_policy::
-  _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
-		 std::size_t __n_ins) const
-  {
-    if (__n_elt + __n_ins >= _M_next_resize)
-  {
-	long double __min_bkts = (__n_elt + __n_ins)
-				 / (long double)_M_max_load_factor;
-	if (__min_bkts >= __n_bkt)
-	  return std::make_pair(true,
-	    _M_next_bkt(std::max<std::size_t>(__builtin_floor(__min_bkts) + 1,
-	  __n_bkt * _S_growth_factor)));
-	else
-	  {
-	_M_next_resize
-	  = __builtin_floor(__n_bkt * (long double)_M_max_load_factor);
-	return std::make_pair(false, 0);
-	  }
-  }
-else
-  return std::make_pair(false, 0);
-  }
-
   // Base classes for std::_Hashtable.  We define these base classes
   // because in some cases we want to do different things depending on
   // the value of a policy class.  In some cases the policy class
Index: src/shared/hashtable-aux.cc
===
--- src/shared/hashtable-aux.cc	(revision 195557)
+++ src/shared/hashtable-aux.cc	(working copy)
@@ -1,6 +1,6 @@
 // std::__detail and std::tr1::__detail definitions -*- C++ -*-
 
-// Copyright (C) 2007, 2009, 2011 Free Software Foundation, Inc.
+// Copyright (C) 2007-2013 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -22,8 +22,6 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
  // <http://www.gnu.org/licenses/>.
 
-#include <bits/c++config.h>
-
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
Index: src/c++11/hashtable_c++0x.cc
===
--- src/c++11/hashtable_c++0x.cc	(revision 195557)
+++ src/c++11/hashtable_c++0x.cc	(working copy)
@@ -1,6 +1,6 @@
 // std::__detail definitions -*- C++ -*-
 
-// Copyright (C) 2007, 2008, 2009, 
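
The `_M_need_rehash` logic the patch moves out of line can be sketched standalone. This is a simplified model with assumed names (the real policy additionally rounds the result up to a prime and applies a growth factor); it shows the caching trick from the comment above: remember the element count at which the next resize is needed, so most insertions avoid a floating-point multiply.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>

struct rehash_policy
{
  long double max_load = 1.0L;            // load factor alpha
  mutable std::size_t next_resize = 0;    // cached rehash threshold

  // Smallest bucket count p such that alpha * p >= n.
  std::size_t bkt_for_elements (std::size_t n) const
  { return (std::size_t) std::ceil (n / max_load); }

  // Do n_ins insertions into n_bkt buckets holding n_elt elements
  // require a rehash?  Returns {true, new_bucket_count} or {false, 0}.
  std::pair<bool, std::size_t>
  need_rehash (std::size_t n_bkt, std::size_t n_elt, std::size_t n_ins) const
  {
    if (n_elt + n_ins < next_resize)
      return {false, 0};                  // fast path: cached threshold
    long double min_bkts = (n_elt + n_ins) / max_load;
    if (min_bkts >= n_bkt)
      return {true, (std::size_t) std::floor (min_bkts) + 1};
    next_resize = (std::size_t) std::floor (n_bkt * max_load);
    return {false, 0};
  }
};
```

Only when the cached threshold is crossed does the policy fall back to the floating-point computation, and a negative answer refreshes the cache.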

[lra] merged with trunk

2013-02-01 Thread Vladimir Makarov

The branch was merged with trunk @ 195676.

The branch was successfully bootstrapped on x86/x86-64.

Committed as rev. 195679.


Re: patch to fix PR56144

2013-02-01 Thread Vladimir Makarov

On 13-01-31 6:36 PM, Steven Bosscher wrote:

On Wed, Jan 30, 2013 at 6:24 PM, Vladimir Makarov wrote:

The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56144

The patch was successfully bootstrapped and tested on x86/x86-64.

Hello Vlad,

Can you please put this patch on the lra-branch too, so that the
auto-testers can pick it up?


I've merged the branch with trunk instead.



[Google 4.7 Split Dwarf] Use .debug_str for some strings. (issue7241067)

2013-02-01 Thread Sterling Augustine
The enclosed patch for Google 4.7 is an optimization for debug strings
under -gsplit-dwarf.  Currently under -gsplit-dwarf, all strings with
DW_FORM_strp end up in the .debug_str.dwo section, which forces any
string not destined for the .dwo to use DW_FORM_string, preventing
duplicate removal.

With this patch, gcc creates a normal .debug_str section even under
-gsplit-dwarf, and puts any DW_FORM_strp string destined for the .o
file into that section.
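
The effect of keeping two independent string tables can be sketched in plain C++ (names are illustrative assumptions, not dwarf2out.c's): each table interns its own strings, so both the main-object section and the .dwo section get duplicate removal.

```cpp
#include <string>
#include <unordered_map>

// One intern table per output section, mirroring the split between
// the .debug_str table and the skeleton (main-object) table.
struct string_table
{
  std::unordered_map<std::string, unsigned> index;

  // Return a stable slot for S, reusing it on duplicates.
  unsigned intern (const std::string &s)
  {
    auto it = index.find (s);
    if (it != index.end ())
      return it->second;                  // duplicate: reuse the slot
    unsigned id = (unsigned) index.size ();
    index.emplace (s, id);
    return id;
  }
};
```

Strings such as the comp_dir, which must appear in both outputs, are simply interned into both tables, limiting the special-casing to the attributes that need it.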

Tested with full bootstrap and the gdb test suite.

OK for Google 4.7?

When stage 1 opens again, I expect I will port it there as well.

Sterling

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 195673)
+++ gcc/dwarf2out.c (working copy)
@@ -166,6 +166,7 @@
 static GTY(()) section *debug_pubnames_section;
 static GTY(()) section *debug_pubtypes_section;
 static GTY(()) section *debug_str_section;
+static GTY(()) section *debug_str_dwo_section;
 static GTY(()) section *debug_str_offsets_section;
 static GTY(()) section *debug_ranges_section;
 static GTY(()) section *debug_frame_section;
@@ -217,6 +218,28 @@
 
 static GTY ((param_is (struct indirect_string_node))) htab_t debug_str_hash;
 
+/* With split_debug_info, both the comp_dir and dwo_name go in the
+   main object file, rather than the dwo, similar to the force_direct
+   parameter elsewhere but with additional complications:
+
+   1) The string is needed in both the main object file and the dwo.
+   That is, the comp_dir and dwo_name will appear in both places.
+
+   2) Strings can use three forms: DW_FORM_string, DW_FORM_strp or
+   DW_FORM_GNU_str_index.
+
+   3) GCC chooses the form to use late, depending on the size and
+   reference count.
+
+   Rather than forcing all the debug string handling functions and
+   callers to deal with these complications, simply use a separate,
+   special-cased string table for any attribute that should go in the
+   main object file.  This limits the complexity to just the places
+   that need it.  */
+
+static GTY ((param_is (struct indirect_string_node)))
+  htab_t skeleton_debug_str_hash;
+
 static GTY(()) int dw2_string_counter;
 
 /* True if the compilation unit places functions in more than one section.  */
@@ -3593,6 +3616,8 @@
 static void schedule_generic_params_dies_gen (tree t);
 static void gen_scheduled_generic_parms_dies (void);
 
+static const char *comp_dir_string (void);
+
 /* enum for tracking thread-local variables whose address is really an offset
relative to the TLS pointer, which will need link-time relocation, but will
not need relocation by the DWARF consumer.  */
@@ -3710,11 +3735,11 @@
   (!dwarf_split_debug_info  \
? (DEBUG_NORM_STR_OFFSETS_SECTION) : (DEBUG_DWO_STR_OFFSETS_SECTION))
 #endif
-#define DEBUG_DWO_STR_SECTION   .debug_str.dwo
-#define DEBUG_NORM_STR_SECTION  .debug_str
+#ifndef DEBUG_STR_DWO_SECTION
+#define DEBUG_STR_DWO_SECTION   .debug_str.dwo
+#endif
 #ifndef DEBUG_STR_SECTION
-#define DEBUG_STR_SECTION   \
-  (!dwarf_split_debug_info ? (DEBUG_NORM_STR_SECTION) : (DEBUG_DWO_STR_SECTION))
+#define DEBUG_STR_SECTION  .debug_str
 #endif
 #ifndef DEBUG_RANGES_SECTION
 #define DEBUG_RANGES_SECTION   .debug_ranges
@@ -3726,17 +3751,18 @@
 #endif
 
 /* Section flags for .debug_macinfo/.debug_macro section.  */
-#define DEBUG_MACRO_SECTION_FLAGS \
+#define DEBUG_MACRO_SECTION_FLAGS   \
   (dwarf_split_debug_info ? SECTION_DEBUG | SECTION_EXCLUDE : SECTION_DEBUG)
 
 /* Section flags for .debug_str section.  */
-#define DEBUG_STR_SECTION_FLAGS \
-  (dwarf_split_debug_info \
-   ? SECTION_DEBUG | SECTION_EXCLUDE \
-   : (HAVE_GAS_SHF_MERGE && flag_merge_debug_strings \
-  ? SECTION_DEBUG | SECTION_MERGE | SECTION_STRINGS | 1\
-  : SECTION_DEBUG))
+#define DEBUG_STR_SECTION_FLAGS \
+  (HAVE_GAS_SHF_MERGE && flag_merge_debug_strings   \
+   ? SECTION_DEBUG | SECTION_MERGE | SECTION_STRINGS | 1\
+   : SECTION_DEBUG)
 
+/* Section flags for .debug_str.dwo section.  */
+#define DEBUG_STR_DWO_SECTION_FLAGS (SECTION_DEBUG | SECTION_EXCLUDE)
+
 /* Labels we insert at beginning sections we can reference instead of
the section names themselves.  */
 
@@ -4658,19 +4684,15 @@
 (const char *)x2) == 0;
 }
 
-/* Add STR to the indirect string hash table.  */
+/* Add STR to the given string hash table.  */
 
 static struct indirect_string_node *
-find_AT_string (const char *str)
+find_AT_string_in_table (const char *str, htab_t table)
 {
   struct indirect_string_node *node;
   void **slot;
 
-  if (! debug_str_hash)
-    debug_str_hash = htab_create_ggc (10, debug_str_do_hash,
-				      debug_str_eq, NULL);
-
-  slot = htab_find_slot_with_hash (debug_str_hash, str,
+  slot = htab_find_slot_with_hash (table, str,