Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Richard Henderson r...@redhat.com wrote:

 On 06/26/2012 01:54 PM, Alexandre Oliva wrote:
 +  track_stack_pointer (dst, src1, src2);

 Why does this function return a value then?

During testing, I used an assert on the return value to catch cases that
couldn't be handled.  The comments before that function say:

+   ??? The return value, that was useful during testing, ended up
+   unused, but this single-use static function will be inlined, and
+   then the return value computation will be optimized out, so I'm
+   leaving it in.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote:

 On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
 Why?  We don't demand a working plugin.  Indeed, we disable the use of
 the plugin if we find a linker that doesn't support it.  We just don't
 account for the possibility of finding a linker that supports plugins,
 but that doesn't support the one we'll build later.

 If this is the preferred solution, then having configure check the
 64-bitness of ld and turning off the plugin altogether on mismatches
 sounds like a reasonable course of action to me.

I'd be very surprised if I asked for an i686 native build to package and
install elsewhere, and didn't get a plugin just because the build-time
linker wouldn't have been able to run the plugin.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Wed, Jun 27, 2012 at 02:37:08PM -0700, Richard Henderson wrote:
 
 I was sitting on this patch until I got around to fixing up Jakub's
 existing vector divmod code to use it.  But seeing as how he's adding
 more uses, I think it's better to get it in earlier.
 
 Tested via a patch sent under separate cover that changes
 __builtin_alpha_umulh to immediately fold to MULT_HIGHPART_EXPR.

Thanks.  Here is an incremental patch on top of my patch from yesterday
which expands some of the vector divisions/modulos using MULT_HIGHPART_EXPR
instead of VEC_WIDEN_MULT_*_EXPR + VEC_PERM_EXPR if backend supports that.
Improves code generated for ushort or short / or % on i?86 (slightly
complicated by the fact that unfortunately even -mavx2 doesn't support
vector by vector shifts for V{8,16}HImode (nor V{16,32}QImode), XOP does
though).

Ok for trunk?

I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and
vectorizing it as either of the sequences next.
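
For readers unfamiliar with the trick, the unsigned case (pre-shift,
high-part multiply, post-shift) can be sketched in scalar C as below.
This only illustrates what a single vector lane computes; the helper
names mulhi_u16 and div3_u16, the divisor 3 and the constant 0xAAAB are
assumptions chosen for the example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* High 16 bits of the 32-bit product: the scalar analogue of "h*".  */
static uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* x / 3 for any 16-bit x, using pre_shift = 0, ml = 0xAAAB and
   post_shift = 1, where ml is ceil(2^17 / 3).  */
static uint16_t div3_u16(uint16_t x)
{
    uint16_t t1 = x;                      /* t1 = oprnd0 >> pre_shift */
    uint16_t t2 = mulhi_u16(t1, 0xAAAB);  /* t2 = t1 h* ml */
    return (uint16_t)(t2 >> 1);           /* q = t2 >> post_shift */
}
```

With a native high-part multiply, the widening multiplies and the
permutation that picks the upper halves collapse into the single
mulhi_u16 step.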

2012-06-28  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/53645
* tree-vect-generic.c (expand_vector_divmod): Use MULT_HIGHPART_EXPR
instead of VEC_WIDEN_MULT_{HI,LO}_EXPR followed by VEC_PERM_EXPR
if possible.

* gcc.c-torture/execute/pr53645-2.c: New test.

--- gcc/tree-vect-generic.c.jj  2012-06-28 08:32:50.0 +0200
+++ gcc/tree-vect-generic.c 2012-06-28 09:10:51.436748834 +0200
@@ -455,7 +455,7 @@ expand_vector_divmod (gimple_stmt_iterat
   unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
   optab op;
   tree *vec;
-  unsigned char *sel;
+  unsigned char *sel = NULL;
   tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
 
   if (prec > HOST_BITS_PER_WIDE_INT)
@@ -744,26 +744,34 @@ expand_vector_divmod (gimple_stmt_iterat
   if (mode == -2 || BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
 return NULL_TREE;
 
-  op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default);
-  if (op == NULL
-  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
-return NULL_TREE;
-  op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default);
-  if (op == NULL
-  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
-return NULL_TREE;
-  sel = XALLOCAVEC (unsigned char, nunits);
-  for (i = 0; i < nunits; i++)
-sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 0 : 1);
-  if (!can_vec_perm_p (TYPE_MODE (type), false, sel))
-return NULL_TREE;
-  wider_type
-= build_vector_type (build_nonstandard_integer_type (prec * 2, unsignedp),
-nunits / 2);
-  if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT
-  || GET_MODE_BITSIZE (TYPE_MODE (wider_type))
-!= GET_MODE_BITSIZE (TYPE_MODE (type)))
-return NULL_TREE;
+  op = optab_for_tree_code (MULT_HIGHPART_EXPR, type, optab_default);
+  if (op != NULL
+      && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+wider_type = NULL_TREE;
+  else
+{
+  op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default);
+  if (op == NULL
+ || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+   return NULL_TREE;
+  op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default);
+  if (op == NULL
+ || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+   return NULL_TREE;
+  sel = XALLOCAVEC (unsigned char, nunits);
+  for (i = 0; i < nunits; i++)
+   sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 0 : 1);
+  if (!can_vec_perm_p (TYPE_MODE (type), false, sel))
+   return NULL_TREE;
+  wider_type
+   = build_vector_type (build_nonstandard_integer_type (prec * 2,
+unsignedp),
+nunits / 2);
+  if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT
+ || GET_MODE_BITSIZE (TYPE_MODE (wider_type))
+!= GET_MODE_BITSIZE (TYPE_MODE (type)))
+   return NULL_TREE;
+}
 
   cur_op = op0;
 
@@ -772,7 +780,7 @@ expand_vector_divmod (gimple_stmt_iterat
 case 0:
   gcc_assert (unsignedp);
   /* t1 = oprnd0 >> pre_shift;
-t2 = (type) (t1 w* ml >> prec);
+t2 = t1 h* ml;
 q = t2 >> post_shift;  */
   cur_op = add_rshift (gsi, type, cur_op, pre_shifts);
   if (cur_op == NULL_TREE)
@@ -801,30 +809,37 @@ expand_vector_divmod (gimple_stmt_iterat
   for (i = 0; i < nunits; i++)
 vec[i] = build_int_cst (TREE_TYPE (type), mulc[i]);
   mulcst = build_vector (type, vec);
-  for (i = 0; i < nunits; i++)
-vec[i] = build_int_cst (TREE_TYPE (type), sel[i]);
-  perm_mask = build_vector (type, vec);
-  mhi = gimplify_build2 (gsi, VEC_WIDEN_MULT_HI_EXPR, wider_type,
-cur_op, mulcst);
-  mhi = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, mhi);
-  mlo = gimplify_build2 (gsi, VEC_WIDEN_MULT_LO_EXPR, wider_type,
-cur_op, mulcst);
-  mlo = gimplify_build1 (gsi, 

Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
 On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote:
 
  On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
  Why?  We don't demand a working plugin.  Indeed, we disable the use of
  the plugin if we find a linker that doesn't support it.  We just don't
  account for the possibility of finding a linker that supports plugins,
  but that doesn't support the one we'll build later.
 
  If this is the preferred solution, then having configure check the
  64-bitness of ld and turning off the plugin altogether on mismatches
  sounds like a reasonable course of action to me.
 
 I'd be very surprised if I asked for an i686 native build to package and
 install elsewhere, and didn't get a plugin just because the build-time
 linker wouldn't have been able to run the plugin.

Not disable plugin support altogether, but disable assuming the linker
supports the plugin.  If user uses explicit -f{,no-}use-linker-plugin,
it is his problem to care that the linker has support.  But the problem
is that when build-time ld is new enough gcc assumes it has to support
the plugin.  And that is not the case.

Jakub


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Christophe Lyon christophe.l...@st.com wrote:

 I looked at the patch in there, and I'm afraid I don't understand how it
 achieves the ChangeLog-suggested purpose of ensuring -O2 makes it to
 C*FLAGS_FOR_TARGET, when all it appears to do is to prepend -g.  Can you
 please clarify?

 With more context, the current code fragment is:
   CFLAGS_FOR_TARGET=$CFLAGS
   case " $CFLAGS " in
     *" -O2 "*) ;;
     *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
   esac
   case " $CFLAGS " in
     *" -g "* | *" -g3 "*) ;;
     *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
   esac

 where pre-pending -g discards -O2 if it was pre-pended just above.
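
The effect described above can be reproduced outside of configure with
a small C model of the two case statements. This is only a sketch: the
function names has_flag, prepend and flags_for_target are hypothetical,
and the "fixed" switch selects prepending to the value built so far
rather than to the original flags.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mimics the shell test:  case " $flags " in *" $flag "*)  */
static int has_flag(const char *flags, const char *flag)
{
    char padded[256], needle[64];
    snprintf(padded, sizeof padded, " %s ", flags);
    snprintf(needle, sizeof needle, " %s ", flag);
    return strstr(padded, needle) != NULL;
}

/* out = "flag rest"; rest may alias out, so build in a temporary.  */
static void prepend(char *out, size_t n, const char *flag, const char *rest)
{
    char tmp[256];
    snprintf(tmp, sizeof tmp, "%s %s", flag, rest);
    snprintf(out, n, "%s", tmp);
}

/* fixed == 0 models the current code, fixed != 0 the proposed change.  */
static void flags_for_target(const char *cflags, int fixed,
                             char *out, size_t n)
{
    snprintf(out, n, "%s", cflags);
    if (!has_flag(cflags, "-O2"))
        prepend(out, n, "-O2", fixed ? out : cflags);
    if (!has_flag(cflags, "-g") && !has_flag(cflags, "-g3"))
        prepend(out, n, "-g", fixed ? out : cflags);
}
```

For an input of "-Wall" the current logic yields "-g -Wall", losing
-O2, while the changed logic yields "-g -O2 -Wall".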

I see, thanks for clarifying.

I suggest changing both occurrences of $CFLAGS within the case
statements, then; the more uniform logic is more appealing to me.

Patch approved with these changes.

Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[Patch, Fortran] Handle C_F_POINTER with a noncontiguous SHAPE=

2012-06-28 Thread Tobias Burnus
This patch generates inline code for C_F_POINTER with an array argument. 
One reason is that GCC didn't handle SHAPE= arguments which were 
noncontiguous.


However, the real motivation is the fortran-dev branch with the new 
array-descriptor: C_F_POINTER needs then to set the stride multiplier, 
but as it doesn't know the size of a single element, one had either to 
pass the value or handle it partially in the front end. Hence, doing it 
all in the front-end was simpler. The C_F_Pointer issue is the main 
cause for failing test cases on the branch, though several other issues 
remain.
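
To make the descriptor bookkeeping concrete, here is a sketch in C of
what setting up the pointer amounts to. Everything in it is an
assumption made for illustration: the struct layout is a simplified
stand-in and not gfortran's real descriptor, strides are counted in
elements rather than bytes, and set_pointer and index3 are hypothetical
names. The shape_step parameter models a noncontiguous SHAPE= argument.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 7

/* Hypothetical, simplified array descriptor; not gfortran's layout.  */
struct dim { ptrdiff_t lbound, ubound, stride; };
struct descriptor {
    void *data;
    ptrdiff_t offset;
    int rank;
    struct dim dim[MAX_RANK];
};

/* Point DESC at DATA with extents read from SHAPE.  SHAPE_STEP is the
   distance between consecutive shape elements, so an argument such as
   shp(::2) corresponds to SHAPE_STEP == 2.  */
static void set_pointer(struct descriptor *desc, void *data,
                        const int *shape, ptrdiff_t shape_step, int rank)
{
    ptrdiff_t stride = 1;

    assert(rank >= 1 && rank <= MAX_RANK);
    desc->data = data;
    desc->rank = rank;
    desc->offset = 0;
    for (int i = 0; i < rank; i++) {
        desc->dim[i].lbound = 1;                 /* Fortran default */
        desc->dim[i].ubound = shape[i * shape_step];
        desc->dim[i].stride = stride;
        desc->offset -= stride;                  /* lbound * stride */
        stride *= shape[i * shape_step];
    }
}

/* Linear element index of A(i, j, k) for a rank-3 descriptor.  */
static ptrdiff_t index3(const struct descriptor *d,
                        ptrdiff_t i, ptrdiff_t j, ptrdiff_t k)
{
    return d->offset + i * d->dim[0].stride + j * d->dim[1].stride
           + k * d->dim[2].stride;
}
```

The offset is the negated sum of lbound times stride over all
dimensions, so that A(1, 1, 1) maps to the first element of the data.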


Built and regtested on x86-64-linux.
OK for the trunk?

* * *

If you wonder why I had some problems before: 
http://gcc.gnu.org/ml/fortran/2012-04/msg00115.html


The reason is that I called pushlevel() twice for body:

+  gfc_start_block (&body);
+  gfc_start_scalarized_body (&loop, &body);


I removed the first one - and now it works. (Well, there were also some 
other issues in the patch, which are now fixed.)


Tobias

PS: After committal, I will update the patch for the branch; let's see 
how many failures will remain on the branch.


PPS: The offset handling in gfortran is really complicated. I wonder 
whether we have to (or at least should) change it for the new array 
descriptor.
2012-06-27  Tobias Burnus  bur...@net-b.de

	* trans-expr.c (conv_isocbinding_procedure): Generate c_f_pointer code
	inline.

2012-06-27  Tobias Burnus  bur...@net-b.de


	* gfortran.dg/c_f_pointer_shape_tests_5.f90: New.
	* gfortran.dg/c_f_pointer_tests_3.f90: Update
	scan-tree-dump-times pattern.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 7d1a6d4..9ebde9d 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -3307,14 +3351,17 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym,
   
   return 1;
 }
-  else if ((sym-intmod_sym_id == ISOCBINDING_F_POINTER
-	 arg-next-expr-rank == 0)
+  else if (sym-intmod_sym_id == ISOCBINDING_F_POINTER
 	   || sym-intmod_sym_id == ISOCBINDING_F_PROCPOINTER)
 {
-  /* Convert c_f_pointer if fptr is a scalar
-	 and convert c_f_procpointer.  */
+  /* Convert c_f_pointer and c_f_procpointer.  */
   gfc_se cptrse;
   gfc_se fptrse;
+  gfc_se shapese;
+  gfc_ss *ss, *shape_ss;
+  tree desc, dim, tmp, stride, offset;
+  stmtblock_t body, block, ifblock;
+  gfc_loopinfo loop;
 
   gfc_init_se (cptrse, NULL);
   gfc_conv_expr (cptrse, arg-expr);
@@ -3322,25 +3369,113 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym,
   gfc_add_block_to_block (se-post, cptrse.post);
 
   gfc_init_se (fptrse, NULL);
-  if (sym-intmod_sym_id == ISOCBINDING_F_POINTER
-	  || gfc_is_proc_ptr_comp (arg-next-expr, NULL))
-	fptrse.want_pointer = 1;
+  if (arg-next-expr-rank == 0)
+	{
+	  if (sym-intmod_sym_id == ISOCBINDING_F_POINTER
+	  || gfc_is_proc_ptr_comp (arg-next-expr, NULL))
+	fptrse.want_pointer = 1;
+
+	  gfc_conv_expr (fptrse, arg-next-expr);
+	  gfc_add_block_to_block (se-pre, fptrse.pre);
+	  gfc_add_block_to_block (se-post, fptrse.post);
+	  if (arg-next-expr-symtree-n.sym-attr.proc_pointer
+	   arg-next-expr-symtree-n.sym-attr.dummy)
+	fptrse.expr = build_fold_indirect_ref_loc (input_location,
+		   fptrse.expr);
+ 	  se-expr = fold_build2_loc (input_location, MODIFY_EXPR,
+  TREE_TYPE (fptrse.expr),
+  fptrse.expr,
+  fold_convert (TREE_TYPE (fptrse.expr),
+		cptrse.expr));
+	  return 1;
+	}
 
-  gfc_conv_expr (fptrse, arg-next-expr);
-  gfc_add_block_to_block (se-pre, fptrse.pre);
-  gfc_add_block_to_block (se-post, fptrse.post);
-  
-  if (arg-next-expr-symtree-n.sym-attr.proc_pointer
-	   arg-next-expr-symtree-n.sym-attr.dummy)
-	fptrse.expr = build_fold_indirect_ref_loc (input_location,
-		   fptrse.expr);
-  
-  se-expr = fold_build2_loc (input_location, MODIFY_EXPR,
-  TREE_TYPE (fptrse.expr),
-  fptrse.expr,
-  fold_convert (TREE_TYPE (fptrse.expr),
-		cptrse.expr));
+  gfc_start_block (block);
+
+  /* Get the descriptor of the Fortran pointer.  */
+  ss = gfc_walk_expr (arg-next-expr);
+  gcc_assert (ss != gfc_ss_terminator);
+  fptrse.descriptor_only = 1;
+  gfc_conv_expr_descriptor (fptrse, arg-next-expr, ss);
+  gfc_add_block_to_block (block, fptrse.pre);
+  desc = fptrse.expr;
+
+  /* Set data value, dtype, and offset.  */
+  tmp = GFC_TYPE_ARRAY_DATAPTR_TYPE (TREE_TYPE (desc));
+  gfc_conv_descriptor_data_set (block, desc,
+fold_convert (tmp, cptrse.expr));
+  gfc_add_modify (block, gfc_conv_descriptor_dtype (desc),
+		  gfc_get_dtype (TREE_TYPE (desc)));
+
+  /* Start scalarization of the bounds, using the shape argument.  */
+
+  shape_ss = gfc_walk_expr (arg-next-next-expr);
+  gcc_assert (shape_ss != gfc_ss_terminator);
+  gfc_init_se (shapese, NULL);
+
+  gfc_init_loopinfo 

Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Andreas Schwab
libgomp.texi is still using gpl.texi, although libgomp has been
relicensed to GPLv3 in 2009.  OK?

(This is the last use of gpl.texi in the gcc sources.  Perhaps it should
be removed and gpl_v3.texi renamed back to gpl.texi?)

Andreas.

* libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 29c078b..f8996f4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -7,7 +7,7 @@
 
 
 @copying
-Copyright @copyright{} 2006, 2007, 2008, 2010, 2011 Free Software Foundation, Inc.
+Copyright @copyright{} 2006, 2007, 2008, 2010, 2011, 2012 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
@@ -1737,7 +1737,7 @@ Bugs in the GNU OpenMP implementation should be reported via
 @c GNU General Public License
 @c -
 
-@include gpl.texi
+@include gpl_v3.texi
 
 
 
-- 
1.7.11.1


-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 10:18:49AM +0200, Andreas Schwab wrote:
 libgomp.texi is still using gpl.texi, although libgomp has been
 relicensed to GPLv3 in 2009.  OK?

Yes.

 
   * libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

Jakub


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Matthew Gretton-Dann

On 27/06/12 21:35, Andrew Pinski wrote:

On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
matthew.gretton-d...@arm.com wrote:

All,

This patch enables the dump-noaddr test to work in out-of-build-tree
testing.

[snip]


I created a much simpler patch which I have been meaning to submit.
I attached it for reference.


Thanks,
Andrew Pinski

ChangeLog:
* testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
an absolute dump base instead of a relative one.

Index: gcc.c-torture/unsorted/dump-noaddr.x
===
--- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
+++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
  foreach option $option_list {
file delete -force dump1
file mkdir dump1
-   c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+   c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
file delete -force dump2
file mkdir dump2
-   c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+   c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
foreach dump1 [lsort [glob -nocomplain dump1/*]] {
regsub dump1/ $dump1 dump2/ dump2
set dumptail gcc.c-torture/unsorted/[file tail $dump1]


What I don't like about this approach is that dump1 and dump2 are created in 
the current working directory.  With out of build-tree testing this may not 
(I believe) be the same as $tmpdir (where temporaries are normally created). 
 Also the current directory may already contain directories/files called 
dump1 or dump2 which will get destroyed by running the testsuite.


Hence why my approach used tmpdir.

Does this reasoning make sense?

I've not committed my version yet in case I am missing something in my 
reasoning above with regards to the relationship between the current working 
directory and $tmpdir.


Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd




RE: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Zhenqiang Chen
 diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73
 100644
 --- a/gcc/loop-init.c
 +++ b/gcc/loop-init.c
 @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
  static bool
  gate_rtl_move_loop_invariants (void)
  {
 +  /* In general, invariant motion can not reduce code size. But it
 + will
 +     change the liverange of the invariant, which increases the
 + register
 +     pressure and might lead to more spilling.  */
 +  if (optimize_function_for_size_p (cfun))
 +    return false;
 +

Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Update it according to the comments.

Thanks!
-Zhenqiang

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f8405dd..b0e84a7 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1931,7 +1931,8 @@ move_loop_invariants (void)
   curr_loop = loop;
   /* move_single_loop_invariants for very large loops
 is time consuming and might need a lot of memory.  */
-  if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+  if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
+      && ! optimize_loop_nest_for_size_p (loop))
move_single_loop_invariants (loop);
 }

ChangeLog:
2012-06-28  Zhenqiang Chen zhenqiang.c...@arm.com

* loop-invariant.c (move_loop_invariants): Skip
move_single_loop_invariants when optimizing loop for size






RE: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Zhenqiang Chen
-Original Message-
From: Steven Bosscher [mailto:stevenb@gmail.com]
Sent: June 27, 2012 16:54
To: Zhenqiang Chen
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Disable loop2_invariant for -Os

On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen
zhenqiang.c...@arm.com wrote:
 Hi,

 In general, invariant motion itself can not reduce code size. But it
 will change the liverange of the invariant, which might lead to more
spilling.

This may be true for ARM but it's not true in general. Sometimes
loop-invariant

Benchmark tests show it also benefits MIPS, PPC and X86 for code size.

address arithmetic, that is not exposed in GIMPLE, is profitable to hoist
out of
the loop. See e.g. PR41026 (for which I still have a patch in the queue).

If this goes in anyway, please mention PR39837 in your ChangeLog entry.

It can not handle the case.

Thanks!
-Zhenqiang





Re: [PATCH] Move Graphite from using PPL over to ISL

2012-06-28 Thread Tobias Grosser

On 06/27/2012 05:06 PM, Richard Guenther wrote:


This merges from the graphite branch the move of PPL to ISL,
and completes it where it was lacking - thanks to Micha.
It leaves unmerged the addition of a pluto-like ISL optimizer
as well as a bugfix for stride  1 which did not come with
a testcase.

With this patch (on top of the one requiring ClooG 0.17.0)
we will require ISL 0.10 for enabling Graphite.

I've bootstrapped and built various combinations with in-tree
and out-of-tree cloog and ISL, so I'm pretty confident that
this works.

With out-of-tree ClooG and ISL a slightly older patch on top of its
prerequisite passed bootstrap and testing on x86_64-unknown-linux-gnu.

Currently re-bootstrapping and testing on x86_64-unknown-linux-gnu.

Ok for trunk?


Hi Richard, hi Micha,

thanks a lot for pushing this forward. Especially the fast 
implementation of the interchange heuristic was impressive!
I am fine with the general goal and think the patch is close to get in, 
but I would like to give feedback on the interchange heuristic. I will 
try to review it today or tomorrow.


Thanks again!!

Tobias


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Carrot Wei
Hi Ramana

Thanks for the review, please see my inlined comments.

On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
ramana.radhakrish...@linaro.org wrote:

 On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:
  Hi
 
  In RTL, subtracting a constant c is expressed as adding the value -c, so it
  is also processed by adddi3, and I extended it to handle subtraction of a
  64-bit constant. I created an insn pattern arm_subdi3_immediate specifically
  to represent subtraction of a 64-bit constant while keeping the add RTL
  expression.
 

 Sorry about the time it has taken to review this patch -Thanks for
 tackling this but I'm not convinced that this patch is correct and
 definitely can be more efficient.

 The valid 64-bit constants would, in my opinion, be the following,
 obtained by dividing the 64-bit constant into two 32-bit halves
 (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower) which boils down
 to the valid combination of

  adds lo : adc hi - both positive constants.
  adds lo ; sbc hi  - lower positive, upper negative
I assume you mean sbc -hi or sbc abs(hi), similar for following instructions


  subs lo ; sbc hi - lower negative, upper negative
  subs lo ; adc hi  - lower negative, upper positive

My first version did the similar thing, but in some cases subs and
adds may generate different carry flag. Assume the low word is 0 and
high word is negative, your method will generate

adds r0, r0, 0
sbc   r1, r1, abs(hi)

My method generates

subs r0, r0, 0
sbc   r1, r1, abs(hi)

ARM's definition of subs is

(result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);

So the subs instruction sets the carry flag while adds clears it, and the
two sequences end up with different results in r1.
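
The carry difference can be checked with a small C model of the
pseudocode quoted above. It is a sketch only: the overflow output is
left out, and the names add_with_carry, adds, subs and sbc are
stand-ins for the instructions and model nothing beyond the result and
the carry flag.

```c
#include <assert.h>
#include <stdint.h>

struct alu { uint32_t result; unsigned carry; };

/* AddWithCarry from the ARM pseudocode, without the overflow output.  */
static struct alu add_with_carry(uint32_t x, uint32_t y, unsigned carry_in)
{
    uint64_t sum = (uint64_t)x + y + carry_in;
    struct alu r = { (uint32_t)sum, (unsigned)(sum >> 32) };
    return r;
}

/* adds rd, rn, #imm */
static struct alu adds(uint32_t rn, uint32_t imm)
{
    return add_with_carry(rn, imm, 0);
}

/* subs rd, rn, #imm */
static struct alu subs(uint32_t rn, uint32_t imm)
{
    return add_with_carry(rn, ~imm, 1);
}

/* sbc rd, rn, #imm with the current carry flag c */
static struct alu sbc(uint32_t rn, uint32_t imm, unsigned c)
{
    return add_with_carry(rn, ~imm, c);
}
```

With a low word of 0, adds leaves the carry at 0 and subs leaves it at
1, so a following sbc of the same immediate differs by one between the
two sequences.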


 Therefore I'd do the following -

 * Don't make *arm_adddi3 a named pattern - we don't need that.
 * Change the *addsi3_carryin_optab pattern to be something like this :

 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -1001,12 +1001,14 @@
  )

  (define_insn *addsi3_carryin_optab
 -  [(set (match_operand:SI 0 s_register_operand =r)
 -       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r)
 -                         (match_operand:SI 2 arm_rhs_operand rI))
 +  [(set (match_operand:SI 0 s_register_operand =r,r)
 +       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r,r
 +                         (match_operand:SI 2 arm_not_operand rI,K

Do you mean arm_add_operand?

                 (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0]
   TARGET_32BIT
 -  adc%?\\t%0, %1, %2
 +  @
 +  adc%?\\t%0, %1, %2
 +  sbc%?\\t%0, %1, %#n2
   [(set_attr conds use)]
  )

 * I'd like a new const_ok_for_dimode_op function that dealt with each
 of these operations, thus your plus operation with a DImode constant
 would just be a check similar to what I've said above.

Good idea, it will make the interface cleaner. I will do it later.

 * You then don't need the new subdi3_immediate pattern and the split
 can happen after reload. Adjust predicates and constraints
 accordingly, delete it. Also please use CONST_INT_P instead of

Even if I delete subdi3_immediate pattern, we still need the
predicates and constraints to represent the negative di numbers in
other patterns.

thanks
Carrot


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Christophe Lyon

On 28.06.2012 09:32, Alexandre Oliva wrote:

I suggest changing both occurrences of $CFLAGS within the case
statements, then; the more uniform logic is more appealing to me.

Patch approved with these changes.

Thanks,


Thanks; here is an updated version taking your comment into account.

Can you commit it for me (I don't have write access).

Thanks.

Christophe.

2012-06-28  Christophe Lyon christophe.l...@st.com

* configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure
they contain -O2.
* configure: Regenerate.

diff --git a/configure b/configure
index 083f2ce..1ab12db 100755
--- a/configure
+++ b/configure
@@ -6690,11 +6690,11 @@ if test "x$CFLAGS_FOR_TARGET" = x; then
   CFLAGS_FOR_TARGET=$CFLAGS
   case " $CFLAGS " in
     *" -O2 "*) ;;
-    *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
+    *) CFLAGS_FOR_TARGET="-O2 $CFLAGS_FOR_TARGET" ;;
   esac
   case " $CFLAGS " in
     *" -g "* | *" -g3 "*) ;;
-    *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
+    *) CFLAGS_FOR_TARGET="-g $CFLAGS_FOR_TARGET" ;;
   esac
 fi
 
@@ -6703,11 +6703,11 @@ if test "x$CXXFLAGS_FOR_TARGET" = x; then
   CXXFLAGS_FOR_TARGET=$CXXFLAGS
   case " $CXXFLAGS " in
     *" -O2 "*) ;;
-    *) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS" ;;
+    *) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS_FOR_TARGET" ;;
   esac
   case " $CXXFLAGS " in
     *" -g "* | *" -g3 "*) ;;
-    *) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS" ;;
+    *) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS_FOR_TARGET" ;;
   esac
 fi
 
diff --git a/configure.ac b/configure.ac
index 378e9f5..82dbe4c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2145,11 +2145,11 @@ if test "x$CFLAGS_FOR_TARGET" = x; then
   CFLAGS_FOR_TARGET=$CFLAGS
   case " $CFLAGS " in
     *" -O2 "*) ;;
-    *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
+    *) CFLAGS_FOR_TARGET="-O2 $CFLAGS_FOR_TARGET" ;;
   esac
   case " $CFLAGS " in
     *" -g "* | *" -g3 "*) ;;
-    *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
+    *) CFLAGS_FOR_TARGET="-g $CFLAGS_FOR_TARGET" ;;
   esac
 fi
 AC_SUBST(CFLAGS_FOR_TARGET)
@@ -2158,11 +2158,11 @@ if test "x$CXXFLAGS_FOR_TARGET" = x; then
   CXXFLAGS_FOR_TARGET=$CXXFLAGS
   case " $CXXFLAGS " in
     *" -O2 "*) ;;
-    *) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS" ;;
+    *) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS_FOR_TARGET" ;;
   esac
   case " $CXXFLAGS " in
     *" -g "* | *" -g3 "*) ;;
-    *) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS" ;;
+    *) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS_FOR_TARGET" ;;
   esac
 fi
 AC_SUBST(CXXFLAGS_FOR_TARGET)


Re: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
 diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73
 100644
 --- a/gcc/loop-init.c
 +++ b/gcc/loop-init.c
 @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
  static bool
  gate_rtl_move_loop_invariants (void)
  {
 +  /* In general, invariant motion can not reduce code size. But it
 + will
 +     change the liverange of the invariant, which increases the
 + register
 +     pressure and might lead to more spilling.  */
 +  if (optimize_function_for_size_p (cfun))
 +    return false;
 +

Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

 Update it according to the comments.

 Thanks!
 -Zhenqiang

 diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
 index f8405dd..b0e84a7 100644
 --- a/gcc/loop-invariant.c
 +++ b/gcc/loop-invariant.c
 @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops
         is time consuming and might need a lot of memory.  */
 -      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
 +      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
 +          && ! optimize_loop_nest_for_size_p (loop))
        move_single_loop_invariants (loop);

Wait - move_single_loop_invariants itself already uses
optimize_loop_for_speed_p.
And looking down it seems to have support for tracking spill cost (eventually
only with -fira-loop-pressure) - please work out why this support is not working
for you.

Richard.

     }

 ChangeLog:
 2012-06-28  Zhenqiang Chen zhenqiang.c...@arm.com

        * loop-invariant.c (move_loop_invariants): Skip
        move_single_loop_invariants when optimizing loop for size






Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Ramana Radhakrishnan
On 28 June 2012 10:03, Carrot Wei car...@google.com wrote:
 Hi Ramana

 Thanks for the review, please see my inlined comments.

 On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:

 On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:
  Hi
 
  In RTL, subtracting a constant c is expressed as adding the value -c, so it
  is also processed by adddi3, and I extended it to handle subtraction of a
  64-bit constant. I created an insn pattern arm_subdi3_immediate specifically
  to represent subtraction of a 64-bit constant while keeping the add RTL
  expression.
 

 Sorry about the time it has taken to review this patch -Thanks for
 tackling this but I'm not convinced that this patch is correct and
 definitely can be more efficient.

 The valid 64-bit constants would, in my opinion, be the following,
 obtained by dividing the 64-bit constant into two 32-bit halves
 (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower) which boils down
 to the valid combination of

  adds lo : adc hi - both positive constants.
  adds lo ; sbc hi  - lower positive, upper negative

 I assume you mean sbc -hi or sbc abs(hi), similar for following 
 instructions

hi = ~upper32

lower = lower 32 bits of the constant
hi =  ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) )

For e.g.

unsigned long long foo4 (unsigned long long x)
{
 return x - 0x25ULL;
}

should be
subs r0, r0, #37
sbc   r1, r1, #0

Notice that it's #0 and not 1 . :)
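
A C sketch of the same pair of instructions, assuming the usual
AddWithCarry semantics (subs adds the complemented immediate with
carry-in 1, sbc adds the complemented immediate with the current
carry). The function names are hypothetical. Seen as an add of the
64-bit constant -0x25, upper32 is 0xFFFFFFFF and its bitwise
complement is 0, which is the #0 operand of the sbc.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit AddWithCarry step; *carry is read as carry-in and
   overwritten with carry-out.  */
static uint32_t add_with_carry(uint32_t x, uint32_t y, unsigned *carry)
{
    uint64_t sum = (uint64_t)x + y + *carry;
    *carry = (unsigned)(sum >> 32);
    return (uint32_t)sum;
}

/* x - c computed as "subs lo, lo, #c_lo ; sbc hi, hi, #c_hi".  */
static uint64_t sub64_by_halves(uint64_t x, uint64_t c)
{
    unsigned carry = 1;                       /* subs: carry-in is 1 */
    uint32_t lo = add_with_carry((uint32_t)x, ~(uint32_t)c, &carry);
    uint32_t hi = add_with_carry((uint32_t)(x >> 32),
                                 ~(uint32_t)(c >> 32), &carry);
    return ((uint64_t)hi << 32) | lo;
}
```

For c = 0x25 the low step subtracts 37 and the high step subtracts 0
plus any borrow from the low word.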





  subs lo ; sbc hi - lower negative, upper negative
  subs lo ; adc hi  - lower negative, upper positive

 My first version did the similar thing, but in some cases subs and
 adds may generate different carry flag. Assume the low word is 0 and
 high word is negative, your method will generate

 adds r0, r0, 0
 sbc   r1, r1, abs(hi)

No it will generate

adds r0, r0, #0
sbc   r1, r1, ~hi

and not abs (hi)




 My method generates

 subs r0, r0, 0
 sbc   r1, r1, abs(hi)

 ARM's definition of subs is

 (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);

 So the subs instruction sets the carry flag while adds clears it, and the
 two sequences end up with different results in r1.


 Therefore I'd do the following -

 * Don't make *arm_adddi3 a named pattern - we don't need that.
 * Change the *addsi3_carryin_optab pattern to be something like this :

 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -1001,12 +1001,14 @@
  )

  (define_insn *addsi3_carryin_optab
 -  [(set (match_operand:SI 0 s_register_operand =r)
 -       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r)
 -                         (match_operand:SI 2 arm_rhs_operand rI))
 +  [(set (match_operand:SI 0 s_register_operand =r,r)
 +       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r,r
 +                         (match_operand:SI 2 arm_not_operand rI,K

 Do you mean arm_add_operand?

No I mean arm_not_operand and it was a deliberate choice as explained above.


                 (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0]
   TARGET_32BIT
 -  adc%?\\t%0, %1, %2
 +  @
 +  adc%?\\t%0, %1, %2
 +  sbc%?\\t%0, %1, %#n2
   [(set_attr conds use)]
  )

 * I'd like a new const_ok_for_dimode_op function that dealt with each
 of these operations, thus your plus operation with a DImode constant
 would just be a check similar to what I've said above.

 Good idea, it will make the interface cleaner. I will do it later.

I think it should help with a clean interface for all the operations
you plan to add.


 * You then don't need the new subdi3_immediate pattern and the split
 can happen after reload. Adjust predicates and constraints
 accordingly, delete it. Also please use CONST_INT_P instead of

 Even if I delete subdi3_immediate pattern, we still need the
 predicates and constraints to represent the negative di numbers in
 other patterns.

I agree you need the predicate - I suspect you can get away with a
single constraint for all valid add immediate DImode operands
especially if you are splitting it later to the constituent forms.



regards,
Ramana



 thanks
 Carrot


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-28 Thread Iain Buclaw
On 27 June 2012 19:17, Mike Stump mikest...@comcast.net wrote:
 On Jun 27, 2012, at 7:45 AM, Iain Buclaw wrote:
 I do have a question though, what is available for the transition of
 development from git to svn?  Other than a lot of reading and getting
 used to the various switches and commands on my part.

 Why transition?  Quite a few people around here use git on a day to day basis 
 and just push and pull to/from svn as they see fit.  gcc has a read-only git 
 repo you can track and pull from.  For pushing into svn, you can use git to 
 do that as well (dcommit).  You'll want to read up on work flows on the 
 net... as dcommit and merges require a little extra caution that isn't 
 obvious.


I did not know of this, thanks. I'll be sure to look it up.


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Richard Earnshaw
On 28/06/12 10:03, Carrot Wei wrote:
 Hi Ramana
 
 Thanks for the review, please see my inlined comments.
 
 On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:

 On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:
 Hi

 In an RTL expression, subtracting a constant c is expressed as adding the
 value -c, so it is also processed by adddi3, and I extended it further to
 handle subtraction of a 64-bit constant. I created an insn pattern,
 arm_subdi3_immediate, specifically to represent subtraction of a 64-bit
 constant while continuing to keep the add RTL expression.


 Sorry about the time it has taken to review this patch -Thanks for
 tackling this but I'm not convinced that this patch is correct and
 definitely can be more efficient.

 The valid 64-bit constants would, in my opinion, be the following,
 obtained by dividing the 64-bit constant into two 32-bit halves
 (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower), which boils down
 to the valid combinations of

  adds lo : adc hi - both positive constants.
  adds lo ; sbc hi  - lower positive, upper negative
 I assume you mean sbc -hi or sbc abs(hi), similar for following 
 instructions
 

No, it's sbc ~hi -- bitwise inversion

It all falls out from the specification, where

adc == X + Y + C
and
sbc == X + ~Y + C.

Hence the need to use arm_not_operand.
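
As a rough illustration (a sketch only, not code from the patch), the two
identities can be modelled in plain C. All helper names here
(add_with_carry, adc, sbc, subs, sub_0x25, check_identities) are made up
for the example; add_with_carry is assumed to behave like the AddWithCarry
pseudo-code quoted in this thread, and sub_0x25 mirrors the foo4 example
(subs r0, r0, #37 followed by sbc r1, r1, #0).

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit AddWithCarry: returns the low 32 bits of
   x + y + carry_in and stores the carry out of bit 31 in *carry_out.  */
static uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in,
                unsigned *carry_out)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry_out = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* adc: X + Y + C.  */
static uint32_t
adc (uint32_t x, uint32_t imm, unsigned c)
{
  unsigned ignored;
  return add_with_carry (x, imm, c, &ignored);
}

/* sbc: X + ~Y + C.  */
static uint32_t
sbc (uint32_t x, uint32_t imm, unsigned c)
{
  unsigned ignored;
  return add_with_carry (x, ~imm, c, &ignored);
}

/* subs: X + ~Y + 1, also producing the carry flag.  */
static uint32_t
subs (uint32_t x, uint32_t imm, unsigned *c)
{
  return add_with_carry (x, ~imm, 1, c);
}

/* x - 0x25 modelled as "subs r0, r0, #37; sbc r1, r1, #0".  */
static uint64_t
sub_0x25 (uint64_t x)
{
  unsigned c;
  uint32_t lo = subs ((uint32_t) x, 37, &c);
  uint32_t hi = sbc ((uint32_t) (x >> 32), 0, c);
  return ((uint64_t) hi << 32) | lo;
}

/* Self-check: sbc with the inverted immediate behaves like adc.  */
static void
check_identities (void)
{
  uint32_t hi = 0xfffffff0u;
  assert (sbc (7, ~hi, 1) == adc (7, hi, 1));
  assert (sub_0x25 (0x100) == 0x100 - 0x25);
}
```

In this model the upper-word immediate is #0, not #1, because the
borrow is carried by the C flag rather than by the immediate.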

R.



Re: [patch] support for multiarch systems

2012-06-28 Thread Thomas Schwinge
Hi!

On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com wrote:
 On 25.06.2012 15:56, Joseph S. Myers wrote:
  On Mon, 25 Jun 2012, Matthias Klose wrote:
  
  Please find attached the patch updated for trunk 20120625, x86 only, 
  tested on
  x86-linux-gnu, KFreeBSD and the Hurd.

 2012-06-25  Matthias Klose  d...@ubuntu.com
 
   * doc/invoke.texi: Document -print-multiarch.
   * doc/install.texi: Document --enable-multiarch.
   * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
   * configure.ac: Add --enable-multiarch option.
   * configure.in: Regenerate.
   * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
   enable_multiarch, with_float: New macros.
   if_multiarch: New macro, define in terms of enable_multiarch.
   * genmultilib: Add new argument for the multiarch name.
   * gcc.c (multiarch_dir): Define.
   (for_each_path): Search for multiarch suffixes.
   (driver_handle_option): Handle multiarch option.
   (do_spec_1): Pass -imultiarch if defined.
   (main): Print multiarch.
   (set_multilib_dir): Separate multilib and multiarch names
   from multilib_select.
   (print_multilib_info): Ignore multiarch names in multilib_select.
   * incpath.c (add_standard_paths): Search the multiarch include dirs.
   * cppdefault.h (default_include): Document multiarch in multilib
   member.
   * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
 include directory for multiarch directories.
   * common.opt: New options --print-multiarch and -imultilib.
   * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd
   for i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu
   for i[34567]86-*-gnu*).
   * config/i386/t-kfreebsd: Add multiarch names in
   MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
   * config/i386/t-linux64: Likewise.
   * config/i386/t-linux: Define MULTIARCH_DIRNAME.
   * config/i386/t-gnu: Likewise.

As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files.
Instead of repeating: my comments from
http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E
as well as the follow-up still hold.

 Index: genmultilib
 ===
 --- genmultilib   (revision 188931)
 +++ genmultilib   (working copy)
 @@ -84,6 +84,8 @@
  # This argument can be used together with MULTILIB_EXCEPTIONS and will take
  # effect after the MULTILIB_EXCEPTIONS.
  
 +# The optional eight argument is the multiarch name.

»ninth argument«.


Grüße,
 Thomas




Re: [C++ RFC / Patch] PR 51213 (access control under SFINAE)

2012-06-28 Thread Paolo Carlini

On 06/15/2012 04:27 PM, Paolo Carlini wrote:

Hi,

as I mentioned a few days ago, I'm working on implementing this
feature, which I personally consider rather high priority, from the
library point of view too (eg, type_traits).

I have been making some progress - I'm attaching below what I have so
far in my local tree - but I also think it's time to get feedback both
about the general approach and about more specific issues with the
testsuite.

... any comments on this?

Thanks!
Paolo.



Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Eric Botcazou
 2012-06-26  Yannick Moy  m...@adacore.com

   * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
   postcondition, and issue an error in such a case.

This has introduced the following failures in the gnat.dg testsuite:

FAIL: gnat.dg/deep_old.adb (test for excess errors)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 7)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 16)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 28)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 34)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 38)
FAIL: gnat.dg/old_errors.adb  (test for warnings, line 40)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 44)
FAIL: gnat.dg/old_errors.adb (test for excess errors)

What should we do about them?

-- 
Eric Botcazou


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Arnaud Charlet
  * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
  postcondition, and issue an error in such a case.
 
 This has introduced the following failures in the gnat.dg testsuite:
 
 FAIL: gnat.dg/deep_old.adb (test for excess errors)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 7)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 16)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 28)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 34)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 38)
 FAIL: gnat.dg/old_errors.adb  (test for warnings, line 40)
 FAIL: gnat.dg/old_errors.adb  (test for errors, line 44)
 FAIL: gnat.dg/old_errors.adb (test for excess errors)
 
 What should we do about them?

Probably suppress both, since they no longer make sense (they are testing
an early implementation of 'Old, before 'Old was standardized in Ada 2012).

I'll take care of it.

Arno


Re: [PATCH] Add generic vector lowering for integer division and modulus (PR tree-optimization/53645)

2012-06-28 Thread Richard Guenther
On Wed, 27 Jun 2012, Jakub Jelinek wrote:

 Hi!
 
 This patch makes veclower2 attempt to emit integer division/modulus of
 vectors by constants using vector multiplication, shifts or masking.
 
 It is somewhat similar to the vect_recog_divmod_pattern, but it needs
 to analyze everything first, see if all divisions or modulos are doable
 using the same sequence of vector insns, and then emit vector insns
 as opposed to the scalar ones the pattern recognizer adds.
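
 To make the idea concrete, here is a scalar C sketch (not code from the
 patch) of the two kinds of rewrite described above, applied lane by lane
 to a four-element array standing in for a vector. The names NUNITS,
 div_mod_by_8, div_by_3 and check_lanes are hypothetical, and the constant
 0xAAAAAAAB with a post-shift of 1 is the usual multiplier for unsigned
 32-bit division by 3, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUNITS 4

/* Power-of-two divisor: quotient by shifting, remainder by masking.  */
static void
div_mod_by_8 (const uint32_t *in, uint32_t *quot, uint32_t *rem)
{
  for (int i = 0; i < NUNITS; i++)
    {
      quot[i] = in[i] >> 3;
      rem[i] = in[i] & 7;
    }
}

/* Other divisor: take the high half of a widening multiply by
   0xAAAAAAAB (that is, ceil (2^33 / 3)), then shift right by 1.  */
static void
div_by_3 (const uint32_t *in, uint32_t *quot)
{
  for (int i = 0; i < NUNITS; i++)
    {
      uint64_t wide = (uint64_t) in[i] * 0xAAAAAAABu;
      quot[i] = (uint32_t) (wide >> 32) >> 1;
    }
}

/* Compare both rewrites against the ordinary / and % operators.  */
static void
check_lanes (void)
{
  const uint32_t in[NUNITS] = { 0, 7, 100, 0xffffffffu };
  uint32_t q[NUNITS], r[NUNITS];

  div_mod_by_8 (in, q, r);
  for (int i = 0; i < NUNITS; i++)
    assert (q[i] == in[i] / 8 && r[i] == in[i] % 8);

  div_by_3 (in, q);
  for (int i = 0; i < NUNITS; i++)
    assert (q[i] == in[i] / 3);
}
```

 Every lane goes through the same instruction sequence in both helpers,
 which is the property the analysis has to establish before vector insns
 can be emitted.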
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

I wonder what to do for -O0 though - shouldn't we not call
expand_vector_divmod in that case?  Thus,

+ if (!optimize
  || !VECTOR_INTEGER_TYPE_P (type) || TREE_CODE (rhs2) != 
VECTOR_CST)
+   break;

?

Thanks,
Richard.

 The testcase additionally eyeballed even for -mavx2, which unlike -mavx
 has vector >> vector shifts.
 
 2012-06-27  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/53645
   * tree-vect-generic.c (add_rshift): New function.
   (expand_vector_divmod): New function.
   (expand_vector_operation): Use it for vector integer
   TRUNC_{DIV,MOD}_EXPR by VECTOR_CST.
   * tree-vect-patterns.c (vect_recog_divmod_pattern): Replace
   unused lguup variable with dummy_int.
 
   * gcc.c-torture/execute/pr53645.c: New test.
 
 --- gcc/tree-vect-generic.c.jj	2012-06-26 10:00:42.935832834 +0200
 +++ gcc/tree-vect-generic.c   2012-06-27 10:15:20.534103045 +0200
 @@ -391,6 +391,515 @@ expand_vector_comparison (gimple_stmt_it
return t;
  }
  
 +/* Helper function of expand_vector_divmod.  Gimplify a RSHIFT_EXPR in type
 +   of OP0 with shift counts in SHIFTCNTS array and return the temporary holding
 +   the result if successful, otherwise return NULL_TREE.  */
 +static tree
 +add_rshift (gimple_stmt_iterator *gsi, tree type, tree op0, int *shiftcnts)
 +{
 +  optab op;
 +  unsigned int i, nunits = TYPE_VECTOR_SUBPARTS (type);
 +  bool scalar_shift = true;
 +
 +  for (i = 1; i < nunits; i++)
 +    {
 +      if (shiftcnts[i] != shiftcnts[0])
 +        scalar_shift = false;
 +    }
 +
 +  if (scalar_shift && shiftcnts[0] == 0)
 +    return op0;
 +
 +  if (scalar_shift)
 +    {
 +      op = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
 +      if (op != NULL
 +          && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
 +        return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
 +                                build_int_cst (NULL_TREE, shiftcnts[0]));
 +    }
 +
 +  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
 +  if (op != NULL
 +      && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
 +    {
 +      tree *vec = XALLOCAVEC (tree, nunits);
 +      for (i = 0; i < nunits; i++)
 +        vec[i] = build_int_cst (TREE_TYPE (type), shiftcnts[i]);
 +      return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
 +                              build_vector (type, vec));
 +    }
 +
 +  return NULL_TREE;
 +}
 +
 +/* Try to expand integer vector division by constant using
 +   widening multiply, shifts and additions.  */
 +static tree
 +expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
 +   tree op1, enum tree_code code)
 +{
 +  bool use_pow2 = true;
 +  bool has_vector_shift = true;
 +  int mode = -1, this_mode;
 +  int pre_shift = -1, post_shift;
 +  unsigned int nunits = TYPE_VECTOR_SUBPARTS (type);
 +  int *shifts = XALLOCAVEC (int, nunits * 4);
 +  int *pre_shifts = shifts + nunits;
 +  int *post_shifts = pre_shifts + nunits;
 +  int *shift_temps = post_shifts + nunits;
 +  unsigned HOST_WIDE_INT *mulc = XALLOCAVEC (unsigned HOST_WIDE_INT, nunits);
 +  int prec = TYPE_PRECISION (TREE_TYPE (type));
 +  int dummy_int;
 +  unsigned int i, unsignedp = TYPE_UNSIGNED (TREE_TYPE (type));
 +  unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
 +  optab op;
 +  tree *vec;
 +  unsigned char *sel;
 +  tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
 +
 +  if (prec > HOST_BITS_PER_WIDE_INT)
 +    return NULL_TREE;
 +
 +  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
 +  if (op == NULL
 +      || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
 +    has_vector_shift = false;
 +
 +  /* Analysis phase.  Determine if all op1 elements are either power
 +     of two and it is possible to expand it using shifts (or for remainder
 +     using masking).  Additionally compute the multiplicative constants
 +     and pre and post shifts if the division is to be expanded using
 +     widening or high part multiplication plus shifts.  */
 +  for (i = 0; i < nunits; i++)
 +    {
 +      tree cst = VECTOR_CST_ELT (op1, i);
 +      unsigned HOST_WIDE_INT ml;
 +
 +      if (!host_integerp (cst, unsignedp) || integer_zerop (cst))
 +        return NULL_TREE;
 +      pre_shifts[i] = 0;
 +      post_shifts[i] = 0;
 +      mulc[i] = 0;
 +      if (use_pow2
 +          && (!integer_pow2p (cst) || tree_int_cst_sgn (cst) != 1))
 + 

Re: [patch] support for multiarch systems

2012-06-28 Thread Matthias Klose
On 28.06.2012 12:01, Thomas Schwinge wrote:
 Hi!
 
 On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com
 wrote:
 On 25.06.2012 15:56, Joseph S. Myers wrote:
 On Mon, 25 Jun 2012, Matthias Klose wrote:
 
 Please find attached the patch updated for trunk 20120625, x86 only,
 tested on x86-linux-gnu, KFreeBSD and the Hurd.
 
 2012-06-25  Matthias Klose  d...@ubuntu.com
 
 * doc/invoke.texi: Document -print-multiarch. * doc/install.texi:
 Document --enable-multiarch. * doc/fragments.texi: Document
 MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME. * configure.ac: Add
 --enable-multiarch option. * configure.in: Regenerate. * Makefile.in
 (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib. enable_multiarch,
 with_float: New macros. if_multiarch: New macro, define in terms of
 enable_multiarch. * genmultilib: Add new argument for the multiarch
 name. * gcc.c (multiarch_dir): Define. (for_each_path): Search for
 multiarch suffixes. (driver_handle_option): Handle multiarch option. 
 (do_spec_1): Pass -imultiarch if defined. (main): Print multiarch. 
 (set_multilib_dir): Separate multilib and multiarch names from
 multilib_select. (print_multilib_info): Ignore multiarch names in
 multilib_select. * incpath.c (add_standard_paths): Search the multiarch
 include dirs. * cppdefault.h (default_include): Document multiarch in
 multilib member. * cppdefault.c: [LOCAL_INCLUDE_DIR,
 STANDARD_INCLUDE_DIR] Add an include directory for multiarch
 directories. * common.opt: New options --print-multiarch and -imultilib. 
 * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd for
 i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu for
 i[34567]86-*-gnu*). * config/i386/t-kfreebsd: Add multiarch names in 
 MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME. * config/i386/t-linux64:
 Likewise. * config/i386/t-linux: Define MULTIARCH_DIRNAME. *
 config/i386/t-gnu: Likewise.
 
 As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files. 
 Instead of repeating: my comments from 
 http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E

 
as well as the follow-up still hold.

Like

* config/i386/t-gnu: New, define MULTIARCH_DIRNAME.

?

 Index: genmultilib
 ===
 --- genmultilib   (revision 188931)
 +++ genmultilib   (working copy)
 @@ -84,6 +84,8 @@
  # This argument can be used together with MULTILIB_EXCEPTIONS and will take
  # effect after the MULTILIB_EXCEPTIONS.
 
 +# The optional eight argument is the multiarch name.
 
 »ninth argument«.

fixed.


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Gerald Pfeifer
On Thu, 28 Jun 2012, Andreas Schwab wrote:
 libgomp.texi is still using gpl.texi, although libgomp has been
 relicensed to GPLv3 in 2009.  OK?

Looks good, thank you.

 (This is the last use of gpl.texi in the gcc sources.  Perhaps it
 should be removed and gpl_v3.texi renamed back to gpl.texi?)

If it's not used any more, yes, please go ahead and remove it.

As for renaming gpl_v3.texi to gpl.texi, I'm not sure.

Gerald


Re: [patch] support for multiarch systems

2012-06-28 Thread Thomas Schwinge
Hi!

On Thu, 28 Jun 2012 12:42:23 +0200, Matthias Klose d...@ubuntu.com wrote:
 On 28.06.2012 12:01, Thomas Schwinge wrote:
  On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com
  wrote:
  On 25.06.2012 15:56, Joseph S. Myers wrote:
  On Mon, 25 Jun 2012, Matthias Klose wrote:
  
  Please find attached the patch updated for trunk 20120625, x86 only,
  tested on x86-linux-gnu, KFreeBSD and the Hurd.
  
  2012-06-25  Matthias Klose  d...@ubuntu.com
  
  * doc/invoke.texi: Document -print-multiarch. * doc/install.texi:
  Document --enable-multiarch. * doc/fragments.texi: Document
  MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME. * configure.ac: Add
  --enable-multiarch option. * configure.in: Regenerate. * Makefile.in
  (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib. enable_multiarch,
  with_float: New macros. if_multiarch: New macro, define in terms of
  enable_multiarch. * genmultilib: Add new argument for the multiarch
  name. * gcc.c (multiarch_dir): Define. (for_each_path): Search for
  multiarch suffixes. (driver_handle_option): Handle multiarch option. 
  (do_spec_1): Pass -imultiarch if defined. (main): Print multiarch. 
  (set_multilib_dir): Separate multilib and multiarch names from
  multilib_select. (print_multilib_info): Ignore multiarch names in
  multilib_select. * incpath.c (add_standard_paths): Search the multiarch
  include dirs. * cppdefault.h (default_include): Document multiarch in
  multilib member. * cppdefault.c: [LOCAL_INCLUDE_DIR,
  STANDARD_INCLUDE_DIR] Add an include directory for multiarch
  directories. * common.opt: New options --print-multiarch and -imultilib. 
  * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd for
  i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu for
  i[34567]86-*-gnu*). * config/i386/t-kfreebsd: Add multiarch names in 
  MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME. * config/i386/t-linux64:
  Likewise. * config/i386/t-linux: Define MULTIARCH_DIRNAME. *
  config/i386/t-gnu: Likewise.
  
  As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files. 
  Instead of repeating: my comments from 
  http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E
 
  
 as well as the follow-up still hold.
 
 Like
 
   * config/i386/t-gnu: New, define MULTIARCH_DIRNAME.
 
 ?

I'd use:

* config/i386/t-gnu: New file.
* config/i386/t-kfreebsd: Likewise.
* config/i386/t-linux: Likewise.

Plus the following instead of your changes:

gcc/
* config.gcc i[34567]86-*-linux* | x86_64-*-linux* (tmake_file):
Include i386/t-linux.
i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu (tmake_file):
Include i386/t-kfreebsd.
i[34567]86-*-gnu* (tmake_file): Include i386/t-gnu.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7ec184c..39c70f2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3481,9 +3481,14 @@ case ${target} in
 
i[34567]86-*-darwin* | x86_64-*-darwin*)
;;
-   i[34567]86-*-linux* | x86_64-*-linux* | \
- i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
- i[34567]86-*-gnu*)
+   i[34567]86-*-linux* | x86_64-*-linux*)
+   tmake_file="$tmake_file i386/t-linux"
+   ;;
+   i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
+   tmake_file="$tmake_file i386/t-kfreebsd"
+   ;;
+   i[34567]86-*-gnu*)
+   tmake_file="$tmake_file i386/t-gnu"
;;
i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
;;

Otherwise, I can't imagine how that would work.


Grüße,
 Thomas




Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Carrot Wei
On Thu, Jun 28, 2012 at 5:37 PM, Ramana Radhakrishnan
ramana.radhakrish...@linaro.org wrote:
 On 28 June 2012 10:03, Carrot Wei car...@google.com wrote:
 Hi Ramana

 Thanks for the review, please see my inlined comments.

 On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:

 On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:
  Hi
 
  In an RTL expression, subtracting a constant c is expressed as adding the
  value -c, so it is also processed by adddi3, and I extended it further to
  handle subtraction of a 64-bit constant. I created an insn pattern,
  arm_subdi3_immediate, specifically to represent subtraction of a 64-bit
  constant while continuing to keep the add RTL expression.
 

 Sorry about the time it has taken to review this patch -Thanks for
 tackling this but I'm not convinced that this patch is correct and
 definitely can be more efficient.

 The valid 64-bit constants would, in my opinion, be the following,
 obtained by dividing the 64-bit constant into two 32-bit halves
 (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower), which boils down
 to the valid combinations of

  adds lo : adc hi - both positive constants.
  adds lo ; sbc hi  - lower positive, upper negative

 I assume you mean sbc -hi or sbc abs(hi), similar for following 
 instructions

 hi = ~upper32

 lower = lower 32 bits of the constant
 hi =  ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) )

 For e.g.

 unsigned long long foo4 (unsigned long long x)
 {
  return x - 0x25ULL;
 }

 should be
 subs r0, r0, #37
 sbc   r1, r1, #0

 Notice that it's #0 and not 1 . :)





  subs lo ; sbc hi - lower negative, upper negative
  subs lo ; adc hi  - lower negative, upper positive


Thank you for the detailed explanation. So the four cases should be

 adds lo : adc hi - both positive constants.
 adds lo ; sbc ~hi  - lower positive, upper negative
 subs -lo ; sbc ~hi - lower negative, upper negative
 subs -lo ; adc hi  - lower negative, upper positive
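
A small C model may help check these four sequences (a sketch under
assumptions, not code from the patch). The names add_with_carry,
add_di_const and check_four_cases are hypothetical; add_with_carry is
assumed to match the AddWithCarry pseudo-code quoted in this thread, and
the two flags simply select adds/subs for the low word and adc/sbc for
the high word, ignoring whether the immediates are encodable.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit AddWithCarry: low 32 bits of x + y + carry_in,
   with the carry out of bit 31 stored in *carry_out.  */
static uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in,
                unsigned *carry_out)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry_out = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* Add the 64-bit constant K to X using one of the four sequences.
   LOW_USES_SUBS selects "subs #-lo" instead of "adds #lo";
   HIGH_USES_SBC selects "sbc #~hi" instead of "adc #hi".  */
static uint64_t
add_di_const (uint64_t x, uint64_t k, int low_uses_subs, int high_uses_sbc)
{
  uint32_t lo = (uint32_t) k;
  uint32_t hi = (uint32_t) (k >> 32);
  uint32_t x_lo = (uint32_t) x;
  uint32_t x_hi = (uint32_t) (x >> 32);
  uint32_t r_lo, r_hi;
  unsigned c, ignored;

  if (low_uses_subs)
    r_lo = add_with_carry (x_lo, ~(0u - lo), 1, &c);    /* subs #-lo */
  else
    r_lo = add_with_carry (x_lo, lo, 0, &c);            /* adds #lo */

  if (high_uses_sbc)
    r_hi = add_with_carry (x_hi, ~(~hi), c, &ignored);  /* sbc #~hi */
  else
    r_hi = add_with_carry (x_hi, hi, c, &ignored);      /* adc #hi */

  return ((uint64_t) r_hi << 32) | r_lo;
}

/* With a nonzero low word, all four sequences agree with x + k.  */
static void
check_four_cases (void)
{
  uint64_t x = 0x00000001ffffffffULL;
  uint64_t k = 0xffffffff00000001ULL;

  for (int s = 0; s < 2; s++)
    for (int h = 0; h < 2; h++)
      assert (add_di_const (x, k, s, h) == x + k);
}
```

In this model the subs forms are only safe when the low word is nonzero:
with lo == 0, "subs #0" leaves the carry set where "adds #0" leaves it
clear, so the high word ends up one too large. That is the corner case
raised elsewhere in this thread.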


 My first version did the similar thing, but in some cases subs and
 adds may generate different carry flag. Assume the low word is 0 and
 high word is negative, your method will generate

 adds r0, r0, 0
 sbc   r1, r1, abs(hi)

 No it will generate

 adds r0, r0, #0
 sbc    r1, r1, ~hi

 and not abs (hi)




 My method generates

 subs r0, r0, 0
 sbc   r1, r1, abs(hi)

 ARM's definition of subs is

 (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);

 So the subs instruction will set carry flag, but adds clear carry
 flag, and finally generate different result in r1.


 Therefore I'd do the following -

 * Don't make *arm_adddi3 a named pattern - we don't need that.
 * Change the *addsi3_carryin_optab pattern to be something like this :

 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -1001,12 +1001,14 @@
  )

  (define_insn *addsi3_carryin_optab
 -  [(set (match_operand:SI 0 s_register_operand =r)
 -       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r)
 -                         (match_operand:SI 2 arm_rhs_operand rI))
 +  [(set (match_operand:SI 0 s_register_operand =r,r)
 +       (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r,r
 +                         (match_operand:SI 2 arm_not_operand rI,K

 Do you mean arm_add_operand?

 No I mean arm_not_operand and it was a deliberate choice as explained above.


                 (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0]
   TARGET_32BIT
 -  adc%?\\t%0, %1, %2
 +  @
 +  adc%?\\t%0, %1, %2
 +  sbc%?\\t%0, %1, %#n2

Since constraint K is a logical NOT, not a negation, should the last
line be the following?

+  sbc%?\\t%0, %1, #%B2

thanks
Carrot


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote:

 On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
 I'd be very surprised if I asked for an i686 native build to package and
 install elsewhere, and didn't get a plugin just because the build-time
 linker wouldn't have been able to run the plugin.

 Not disable plugin support altogether, but disable assuming the linker
 supports the plugin.

That still doesn't sound right to me: why should the compiler refrain
from using a perfectly functional linker plugin on the machine where
it's installed (not where it's built)?

Also, this scenario of silently deciding whether or not to use the
linker plugin could bring us to different test results for the same
command lines.  I don't like that.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[PATCH] Fix PR53790

2012-06-28 Thread Richard Guenther

This fixes PR53790 - with MEM_REF you can get base decls of
incomplete type.  Deal with that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied
everywhere.

Richard.

2012-06-28  Richard Guenther  rguent...@suse.de

PR middle-end/53790
* expr.c (expand_expr_real_1): Verify if the type is complete
before inspecting its size.

* gcc.dg/torture/pr53790.c: New testcase.

Index: gcc/expr.c
===
*** gcc/expr.c  (revision 189041)
--- gcc/expr.c  (working copy)
*** expand_expr_real_1 (tree exp, rtx target
*** 9832,9837 
--- 9832,9838 
orig_op0 = op0
  = expand_expr (tem,
 (TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
+  && COMPLETE_TYPE_P (TREE_TYPE (tem))
   && (TREE_CODE (TYPE_SIZE (TREE_TYPE (tem)))
  != INTEGER_CST)
   && modifier != EXPAND_STACK_PARM
Index: gcc/testsuite/gcc.dg/torture/pr53790.c
===
*** gcc/testsuite/gcc.dg/torture/pr53790.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr53790.c  (working copy)
***
*** 0 
--- 1,17 
+ /* { dg-do compile } */
+ 
+ typedef struct s {
+ int value;
+ } s_t;
+ 
+ static inline int 
+ read(s_t const *var)
+ {
+   return var->value;
+ }
+ 
+ int main()
+ {
+   extern union u extern_var;
+   return read((s_t *)&extern_var);
+ }


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Andreas Schwab
Gerald Pfeifer ger...@pfeifer.com writes:

 If it's not used any more, yes, please go ahead and remove it.

Done as follows, tested with make info.

Andreas.

* doc/include/gpl.texi: Remove.
* doc/sourcebuild.texi (Texinfo Manuals): Don't mention gpl.texi.

diff --git a/gcc/doc/include/gpl.texi b/gcc/doc/include/gpl.texi
deleted file mode 100644
index bcb5535..000
[omitted]
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 3d834ee..dc5cc47 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1,4 +1,4 @@
-@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
 @c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
@@ -368,8 +368,7 @@ The GNU Free Documentation License.
 The section ``Funding Free Software''.
 @item gcc-common.texi
 Common definitions for manuals.
-@item gpl.texi
-@itemx gpl_v3.texi
+@item gpl_v3.texi
 The GNU General Public License.
 @item texinfo.tex
 A copy of @file{texinfo.tex} known to work with the GCC manuals.
-- 
1.7.11.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 1:39 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote:

 On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
 I'd be very surprised if I asked for an i686 native build to package and
 install elsewhere, and didn't get a plugin just because the build-time
 linker wouldn't have been able to run the plugin.

 Not disable plugin support altogether, but disable assuming the linker
 supports the plugin.

 That still doesn't sound right to me: why should the compiler refrain
 from using a perfectly functional linker plugin on the machine where
 it's installed (not where it's built)?

 Also, this scenario of silently deciding whether or not to use the
 linker plugin could bring us to different test results for the same
 command lines.  I don't like that.

I don't like that we derive the default setting this way either.  In the end
I would like us to arrive at the point that LTO does not work at all without
a linker plugin.

Richard.

 --
 Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist      Red Hat Brazil Compiler Engineer


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Ramana Radhakrishnan
  subs -lo ; sbc ~hi - lower negative, upper negative
  subs -lo ; adc hi  - lower negative, upper positive

Yes.

snip



                 (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0]
   TARGET_32BIT
 -  adc%?\\t%0, %1, %2
 +  @
 +  adc%?\\t%0, %1, %2
 +  sbc%?\\t%0, %1, %#n2

 Since constraint K is logical not, not negative, should the last
 line be following?

 +  sbc%?\\t%0, %1, #%B2

Indeed that was a typo on my part. Sorry about that.

Ramana


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Christophe Lyon christophe.l...@st.com wrote:

 Can you commit it for me (I don't have write access).

Done, GCC SVN and src CVS trees.  Thanks!

 2012-06-28  Christophe Lyon christophe.l...@st.com

   * configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure
   they contain -O2.
   * configure: Regenerate.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)

2012-06-28 Thread Georg-Johann Lay
This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if
!HARD_REGNO_MODE_OK.

Returning true for such registers might lead to performance
degradation that eats up all of the performance gained from 4.6 to 4.7,
for example.

Ok to apply?

Johann

PR 53595
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
* config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
* config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
avr_hard_regno_call_part_clobbered.
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h	(revision 189011)
+++ config/avr/avr-protos.h	(working copy)
@@ -47,6 +47,7 @@ extern void init_cumulative_args (CUMULA
 #endif /* TREE_CODE */
 
 #ifdef RTX_CODE
+extern int avr_hard_regno_call_part_clobbered (unsigned, enum machine_mode);
 extern const char *output_movqi (rtx insn, rtx operands[], int *l);
 extern const char *output_movhi (rtx insn, rtx operands[], int *l);
 extern const char *output_movsisf (rtx insn, rtx operands[], int *l);
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 189011)
+++ config/avr/avr.c	(working copy)
@@ -8856,6 +8856,28 @@ avr_hard_regno_mode_ok (int regno, enum
 }
 
 
+/* Implement `HARD_REGNO_CALL_PART_CLOBBERED'.  */
+
+int
+avr_hard_regno_call_part_clobbered (unsigned regno, enum machine_mode mode)
+{
+  /* FIXME: This hook gets called with MODE:REGNO combinations that don't
+represent valid hard registers like, e.g. HI:29.  Returning TRUE
+for such registers can lead to performance degradation as mentioned
+in PR53595.  Thus, report invalid hard registers as FALSE.  */
+  
+  if (!avr_hard_regno_mode_ok (regno, mode))
+return 0;
+  
+  /* Return true if any of the following boundaries is crossed:
+ 17/18, 27/28 and 29/30.  */
+  
+  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+  || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
+  || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
+}
+
+
 /* Implement `MODE_CODE_BASE_REG_CLASS'.  */
 
 enum reg_class
Index: config/avr/avr.h
===
--- config/avr/avr.h	(revision 189011)
+++ config/avr/avr.h	(working copy)
@@ -402,10 +402,8 @@ enum reg_class {
 
 #define REGNO_OK_FOR_INDEX_P(NUM) 0
 
-#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
-  (((REGNO) < 18 && (REGNO) + GET_MODE_SIZE (MODE) > 18)   \
-   || ((REGNO) < REG_Y && (REGNO) + GET_MODE_SIZE (MODE) > REG_Y)  \
-   || ((REGNO) < REG_Z && (REGNO) + GET_MODE_SIZE (MODE) > REG_Z))
+#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
+  avr_hard_regno_call_part_clobbered (REGNO, MODE)
 
 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true
 
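
For readers following along, the boundary test in the patch above can be tried out in isolation.  What follows is only a minimal sketch, not the AVR backend code: mode_ok is a hypothetical, simplified stand-in for avr_hard_regno_mode_ok, the mode is reduced to its size in bytes, and REG_Y / REG_Z are assumed to be 28 and 30, in line with the 27/28 and 29/30 boundaries named in the comment.

```c
#include <assert.h>

/* Assumed register numbers of the Y and Z pointer registers (r28, r30),
   matching the 27/28 and 29/30 boundaries named in the patch comment.  */
enum { REG_Y = 28, REG_Z = 30 };

/* Hypothetical, simplified stand-in for avr_hard_regno_mode_ok: a 1-byte
   value fits anywhere, wider values must start in an even register.  */
static int
mode_ok (unsigned regno, unsigned size)
{
  assert (size > 0);
  return size == 1 || (regno & 1) == 0;
}

/* Nonzero if a value of SIZE bytes starting at REGNO straddles BOUNDARY.  */
static int
crosses (unsigned regno, unsigned size, unsigned boundary)
{
  return regno < boundary && regno + size > boundary;
}

/* Same shape as the patched hook: invalid MODE:REGNO combinations are
   reported as not clobbered, everything else uses the boundary test.  */
static int
call_part_clobbered (unsigned regno, unsigned size)
{
  if (!mode_ok (regno, size))
    return 0;

  return (crosses (regno, size, 18)
          || crosses (regno, size, REG_Y)
          || crosses (regno, size, REG_Z));
}
```

Under these assumptions a 2-byte value in register 29 (the HI:29 case from the FIXME) would pass the raw 29/30 boundary test, and it is the validity check that turns the answer into 0.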


[PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets

2012-06-28 Thread Maciej W. Rozycki
Hello,

 This change adds soft-fp support for SDE bare-iron targets.

 The settings have been mostly based on the version already present in 
glibc, except that the ABI variations have been merged into a single file 
and conditionalised on preprocessor macros (and the file reformatted to 
follow the GNU coding standard that the glibc variants don't).  Only n32 
has to be treated somewhat specially as it is ILP32 but its long long 
type is 64-bit with native support (using single registers rather than 
pairs).  The rest is handled generically, based on the width of the types 
chosen.

 This has been regression tested for the mips-sde-elf target with no new 
failures, using the o32 and n64 ABI multilibs, with and without 
-msoft-float, o32 also with MIPS16 variants.

 There's currently no SDE runtime support for n32; however, despite the
inability to test, I decided the configuration shouldn't be pessimised by
default (by avoiding the special exception and using the 32-bit long
type), as glibc already uses such an arrangement, so it's been verified
elsewhere, and if a platform that supports the n32 ABI decides later on to
enable soft-fp too, it will be verified in libgcc anyway.  I believe this
is reasonable and avoids the risk of someone choosing the long type by
omission.

 Comments or questions are welcome, otherwise OK to apply?

2012-06-28  Catherine Moore  c...@codesourcery.com
Maciej W. Rozycki  ma...@codesourcery.com

libgcc/
* config/mips/sfp-machine.h: New file.
* config.host mips*-sde-elf*: Enable soft-fp.

  Maciej

gcc-mips-softfp.diff
Index: gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h  2012-06-24 14:38:40.083663725 +0100
@@ -0,0 +1,101 @@
+#if defined _ABIN32 && _MIPS_SIM == _ABIN32
+
+#define _FP_W_TYPE_SIZE 64
+#define _FP_W_TYPE unsigned long long
+#define _FP_WS_TYPE signed long long
+#define _FP_I_TYPE long long
+
+#else
+
+#define _FP_W_TYPE_SIZE _MIPS_SZLONG
+#define _FP_W_TYPE unsigned long
+#define _FP_WS_TYPE signed long
+#define _FP_I_TYPE long
+
+#endif
+
+#if _FP_W_TYPE_SIZE < 64
+
+#define _FP_MUL_MEAT_S(R, X, Y)\
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_S, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_D(R, X, Y)\
+  _FP_MUL_MEAT_2_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y)\
+  _FP_MUL_MEAT_4_wide (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y)\
+  _FP_DIV_MEAT_1_udiv_norm (S, R, X, Y)
+#define _FP_DIV_MEAT_D(R, X, Y)\
+  _FP_DIV_MEAT_2_udiv (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y)\
+  _FP_DIV_MEAT_4_udiv (Q, R, X, Y)
+
+#else
+
+#define _FP_MUL_MEAT_S(R, X, Y)\
+  _FP_MUL_MEAT_1_imm (_FP_WFRACBITS_S, R, X, Y)
+#define _FP_MUL_MEAT_D(R, X, Y)\
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y)\
+  _FP_MUL_MEAT_2_wide_3mul (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y)\
+  _FP_DIV_MEAT_1_imm (S, R, X, Y, _FP_DIV_HELP_imm)
+#define _FP_DIV_MEAT_D(R, X, Y)\
+  _FP_DIV_MEAT_1_udiv_norm (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y)\
+  _FP_DIV_MEAT_2_udiv (Q, R, X, Y)
+
+#endif
+
+#define _FP_NANFRAC_S  ((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D  ((_FP_QNANBIT_D << 1) - 1), -1
+#define _FP_NANFRAC_Q  ((_FP_QNANBIT_Q << 1) - 1), -1, -1, -1
+#define _FP_NANSIGN_S  0
+#define _FP_NANSIGN_D  0
+#define _FP_NANSIGN_Q  0
+
+#define _FP_KEEPNANFRACP 1
+/* From my experiments it seems X is chosen unless one of the
+   NaNs is sNaN,  in which case the result is NANSIGN/NANFRAC.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+  do { \
+if ((_FP_FRAC_HIGH_RAW_##fs (X) \
+| _FP_FRAC_HIGH_RAW_##fs (Y)) & _FP_QNANBIT_##fs)  \
+  {\
+   R##_s = _FP_NANSIGN_##fs;   \
+_FP_FRAC_SET_##wc (R, _FP_NANFRAC_##fs);   \
+  }\
+else   \
+  {

Re: [PATCH] Move Graphite from using PPL over to ISL

2012-06-28 Thread Diego Novillo

On 12-06-27 11:06 , Richard Guenther wrote:


2012-06-27  Richard Guenther  rguent...@suse.de
Michael Matz  m...@suse.de
Tobias Grosser tob...@grosser.es
Sebastian Pop seb...@gmail.com

config/
* cloog.m4: Set up to work against ISL only.
* isl.m4: New file.

* Makefile.def: Add ISL host module, remove PPL host module.
Adjust ClooG host module to use the proper ISL.
* Makefile.tpl: Pass ISL include flags instead of PPL ones.
* configure.ac: Include config/isl.m4.  Add ISL host library,
remove PPL.  Remove PPL configury, add ISL configury, adjust
ClooG configury.
* Makefile.in: Regenerated.
* configure: Likewise.

gcc/
* Makefile.in: Remove PPL flags in favor of ISL ones.
(BACKENDLIBS): Remove PPL libs.
(INCLUDES): Remove PPL includes in favor of ISL ones.
(graphite-clast-to-gimple.o): Remove graphite-dependences.h and
graphite-cloog-compat.h dependencies.
(graphite-dependences.o): Likewise.
(graphite-poly.o): Likewise.
* configure.ac: Declare ISL vars instead of PPL ones.
* configure: Regenerated.
* doc/install.texi: Replace PPL requirement documentation
with ISL one.
* graphite-blocking.c: Remove PPL code, add ISL equivalent.
* graphite-clast-to-gimple.c: Likewise.
* graphite-dependences.c: Likewise.
* graphite-interchange.c: Likewise.
* graphite-poly.h: Likewise.
* graphite-poly.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c: Likewise.
* graphite-scop-detection.c: Re-arrange includes.
* graphite-cloog-util.c: Remove.
* graphite-cloog-util.h: Likewise.
* graphite-ppl.h: Likewise.
* graphite-ppl.c: Likewise.
* graphite-dependences.h: Likewise.

libgomp/
* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
* testsuite/libgomp.graphite/force-parallel-5.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-7.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-8.c: Likewise.


OK.


Diego.




Re: [patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)

2012-06-28 Thread Denis Chertykov
2012/6/28 Georg-Johann Lay a...@gjlay.de:
 This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if
 !HARD_REGNO_MODE_OK.

 Returning true for such registers might lead to performance
 degradation that eat up all performance gained from 4.6 to 4.7
 for example.

 Ok to apply?

 Johann

        PR 53595
        * config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
        * config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
        * config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
        avr_hard_regno_call_part_clobbered.

Please, apply.

Denis


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann 
matthew.gretton-d...@arm.com wrote:
 On 27/06/12 21:35, Andrew Pinski wrote:
 On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
 matthew.gretton-d...@arm.com wrote:
 All,
 
 This patch enables the dump-noaddr test to work in out-of-build-tree
 testing.
 [snip]
 
 I created a much simpler patch which I have been meaning to submit.
 I attached it for reference.
 
 
 Thanks,
 Andrew Pinski
 
 ChangeLog:
 * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
 an absolute dump base instead of a relative one.
 
 Index: gcc.c-torture/unsorted/dump-noaddr.x
 ===
 --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
 +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
 @@ -11,10 +11,10 @@ proc dump_compare { src options } {
  foreach option $option_list {
  file delete -force dump1
  file mkdir dump1
 -c-torture-compile $src $option $options -dumpbase dump1/$dumpbase 
 -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all 
 -fdump-tree-all -fdump-noaddr
 +c-torture-compile $src $option $options -dumpbase 
 [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 
 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
  file delete -force dump2
  file mkdir dump2
 -c-torture-compile $src $option $options -dumpbase dump2/$dumpbase 
 -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
 +c-torture-compile $src $option $options -dumpbase 
 [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all 
 -fdump-tree-all -fdump-noaddr
  foreach dump1 [lsort [glob -nocomplain dump1/*]] {
  regsub dump1/ $dump1 dump2/ dump2
  set dumptail gcc.c-torture/unsorted/[file tail $dump1]
 
 What I don't like about this approach is that dump1 and dump2 are created in 
 the current working directory.

On vxworks as I recall we did a cd to tmpdir, is that generally true?  Also, if 
one telnets in or sshes into the host under test, the cd is mandatory... as 
otherwise one would dump turds (that's a technical term) in the home directory 
which would be very uncool.  Maybe a better approach would be to cd to the 
right place if all the Canadian setups cd, as that then unifies them.

 With out of build-tree testing this may not (I believe) be the same as 
 $tmpdir (where temporaries are normally created).  Also the current directory 
 may already contain directories/files called dump1 or dump2 which will get 
 destroyed by running the 

The point of the cd was to get to a place where temps can be created freely...

 I've not committed my version yet in case I am missing something in my 
 reasoning above with regards to the relationship between the current working 
 directory and $tmpdir.

So the question would be, does his patch work for you?  It was unclear to me if 
the answer is no.

Oh, wait, I know what I don't like about Andrew's patch: pwd.  Is that the 
directory on the target, the host or the build machine?  And is that going to 
the host machine?  They are not the same.  One needs a directory on the host 
machine.


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Matthew Gretton-Dann

On 28/06/12 14:38, Mike Stump wrote:

On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann

 matthew.gretton-d...@arm.com wrote:

On 27/06/12 21:35, Andrew Pinski wrote:

On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
matthew.gretton-d...@arm.com wrote:

All,

This patch enables the dump-noaddr test to work in out-of-build-tree
testing.

[snip]


I created a much simpler patch which I have been meaning to submit.
I attached it for reference.


Thanks,
Andrew Pinski

ChangeLog:
* testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
an absolute dump base instead of a relative one.

Index: gcc.c-torture/unsorted/dump-noaddr.x
===
--- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
+++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
  foreach option $option_list {
  file delete -force dump1
  file mkdir dump1
-c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x 
c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all 
-fdump-noaddr
+c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase 
-DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all 
-fdump-noaddr
  file delete -force dump2
  file mkdir dump2
-c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x 
c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase 
-DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
  foreach dump1 [lsort [glob -nocomplain dump1/*]] {
  regsub dump1/ $dump1 dump2/ dump2
  set dumptail gcc.c-torture/unsorted/[file tail $dump1]


What I don't like about this approach is that dump1 and dump2 are

 created in the current working directory.


On vxworks as I recall we did a cd to tmpdir, is that generally true?
Also, if one telnets in or sshes into the host under test, the cd is
mandatory... as otherwise one would dump turds (that's a technical term)
in the home directory which would be very uncool.  Maybe a better
approach would be to cd to the right place if all the Canadian setups cd,
as that then unifies them.


With out of build-tree testing this may not (I believe) be the same as
$tmpdir (where temporaries are normally created).  Also the current
directory may already contain directories/files called dump1 or dump2
which will get destroyed by running the


The point of the cd was to get to a place where temps can be created
freely...


I've not committed my version yet in case I am missing something in my
reasoning above with regards to the relationship between the current
working directory and $tmpdir.


So the question would be, does his patch work for you?  It was unclear to
me if the answer is no.


Sorry - the patch works for my use case (build==host), but I was concerned 
over the use of [pwd] vs $tmpdir.



Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the
directory on the target, the host or the build machine?  And is that
going to the host machine?  They are not the same.  One needs a directory
on the host machine.


I don't think this applies to my patch though, so are you still okay for my 
version to go in or is there something else I haven't considered?


Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd






Re: [PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets

2012-06-28 Thread Joseph S. Myers
On Thu, 28 Jun 2012, Maciej W. Rozycki wrote:

   * config/mips/sfp-machine.h: New file.
   * config.host mips*-sde-elf*: Enable soft-fp.

The compiler uses MIPS NaN conventions on MIPS; fp-bit knows about those 
but soft-fp does not.  Are you not concerned about that regression?  (Is 
this code only ever going to be used in software floating-point 
configurations, without exception support, so the choice of NaN doesn't 
matter much?)
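
To make the NaN question concrete, here is a rough sketch of the single-precision bit pattern that the `_FP_NANFRAC_S` / `_FP_NANSIGN_S` definitions in the proposed sfp-machine.h would produce.  It rests on assumptions that are not part of the patch: that soft-fp treats the single-precision fraction as 24 bits wide (hidden bit included) and places the quiet bit at bit 22.  The names QNANBIT_S, NANFRAC_S, pack_single and default_nan_single are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 24 fraction bits including the hidden bit, with the quiet
   bit two positions below the top, i.e. bit 22 of the stored fraction.  */
enum { FRACBITS_S = 24 };
#define QNANBIT_S ((uint32_t) 1 << (FRACBITS_S - 2))

/* Same expression as _FP_NANFRAC_S in the patch.  */
#define NANFRAC_S ((QNANBIT_S << 1) - 1)

/* Pack sign, 8-bit exponent and 23-bit fraction into an IEEE single.  */
static uint32_t
pack_single (uint32_t sign, uint32_t exp, uint32_t frac)
{
  assert (sign <= 1 && exp <= 0xff && frac <= 0x7fffff);
  return (sign << 31) | (exp << 23) | frac;
}

/* Default NaN: sign 0 (as in _FP_NANSIGN_S), all-ones exponent and the
   NANFRAC_S fraction.  */
static uint32_t
default_nan_single (void)
{
  return pack_single (0, 0xff, NANFRAC_S);
}
```

With those assumptions the fraction has every bit set, bit 22 included.  Whether such a pattern counts as quiet or signaling on a particular MIPS configuration is precisely the convention question raised here; the sketch takes no position on it.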

libgcc/config/mips/t-mips sets FPBIT and DPBIT.  Shouldn't you do 
something to override those settings?  Even if the libgcc logic is to 
build soft-fp if both soft-fp and fp-bit are configured, it would seem 
cleaner for the fragments to configure only the relevant one.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 12:16 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote:
 On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
 Why?  We don't demand a working plugin.  Indeed, we disable the use of
 the plugin if we find a linker that doesn't support it.  We just don't
 account for the possibility of finding a linker that supports plugins,
 but that doesn't support the one we'll build later.
 
 If this is the preferred solution, then having configure check the
 64-bitness of ld and turning off the plugin altogether on mismatches
 sounds like a reasonable course of action to me.
 
 I'd very be surprised if I asked for an i686 native build to package and
 install elsewhere, and didn't get a plugin just because the build-time
 linker wouldn't have been able to run the plugin.

The architecture of the compiler, last I knew it, was to smell out the feature 
set of the system, including libraries, headers, assemblers and linkers.  It 
uses this as static configuration parameters for the build.  One is not free to 
take the built compiler to a differently configured system at run time.

Now, with that as a backdrop, how exactly do you ever plan on using the plugin? 
 If there is no possible use for it, why then build it?

So, even if there is a way to toggle the feature on, which would mean the 
plug-in should be built, it should still be off initially, which it isn't.


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 4:39 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote:
 
 On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
 I'd very be surprised if I asked for an i686 native build to package and
 install elsewhere, and didn't get a plugin just because the build-time
 linker wouldn't have been able to run the plugin.
 
 Not disable plugin support altogether, but disable assuming the linker
 supports the plugin.
 
 That still doesn't sound right to me: why should the compiler refrain
 from using a perfectly functional linker plugin on the machine where
 it's installed (not where it's built?

See your point below for one reason.  The next would be because it would be a 
speed hit to re-check at runtime the qualities of the linker and do something 
different.  If the system had an architecture to avoid the speed hit and people 
wanted to do the work to support the runtime reconfigure, that'd be fine with 
me.  I don't think your system supports this, and I don't think you want to do 
that work, do you?

 Also, this scenario of silently deciding whether or not to use the
 linker plugin could bring us to different test results for the same
 command lines.  I don't like that.

Right, which is why the static configuration of the host system at build time 
is forever after an invariant.  The linker is smelled, it doesn't support 
plugins, therefore we can't ever use it, therefore we never build it...


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 09:17:55AM +0200, Jakub Jelinek wrote:
 I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and
 vectorizing it as either of the sequences next.

And here is corresponding pattern recognizer and vectorizer patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
to pessimize the generated code for gcc.dg/vect/pr51581-3.c
testcase (at least with -O3 -mavx) compared to when the hooks aren't
present, because i?86 has more natural support for widen mult lo/hi
compared to widen mult even/odd, but I assume that on powerpc it is the
other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
and builtin_mul_widen_* are possible for the particular vectype which one
will be cheaper?

2012-06-28  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/51581
* tree-vect-stmts.c (permute_vec_elements): Add forward decl.
(vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR
also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus
VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Use
MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.

* gcc.dg/vect/pr51581-4.c: New test.

--- gcc/tree-vect-stmts.c.jj 2012-06-26 11:38:28.0 +0200
+++ gcc/tree-vect-stmts.c   2012-06-28 13:27:50.475158271 +0200
@@ -3288,6 +3288,10 @@ vectorizable_shift (gimple stmt, gimple_
 }
 
 
+static tree permute_vec_elements (tree, tree, tree, gimple,
+ gimple_stmt_iterator *);
+
+
 /* Function vectorizable_operation.
 
Check if STMT performs a binary, unary or ternary operation that can
@@ -3300,17 +3304,18 @@ static bool
 vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
gimple *vec_stmt, slp_tree slp_node)
 {
-  tree vec_dest;
+  tree vec_dest, vec_dest2 = NULL_TREE;
+  tree vec_dest3 = NULL_TREE, vec_dest4 = NULL_TREE;
   tree scalar_dest;
   tree op0, op1 = NULL_TREE, op2 = NULL_TREE;
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  tree vectype;
+  tree vectype, wide_vectype = NULL_TREE;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   enum tree_code code;
   enum machine_mode vec_mode;
   tree new_temp;
   int op_type;
-  optab optab;
+  optab optab, optab2 = NULL;
   int icode;
   tree def;
   gimple def_stmt;
@@ -3327,6 +3332,8 @@ vectorizable_operation (gimple stmt, gim
   tree vop0, vop1, vop2;
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
   int vf;
+  unsigned char *sel = NULL;
+  tree decl1 = NULL_TREE, decl2 = NULL_TREE, perm_mask = NULL_TREE;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
 return false;
@@ -3451,31 +3458,97 @@ vectorizable_operation (gimple stmt, gim
   optab = optab_for_tree_code (code, vectype, optab_default);
 
   /* Supportable by target?  */
-  if (!optab)
+  if (!optab && code != MULT_HIGHPART_EXPR)
 {
   if (vect_print_dump_info (REPORT_DETAILS))
     fprintf (vect_dump, "no optab.");
   return false;
 }
   vec_mode = TYPE_MODE (vectype);
-  icode = (int) optab_handler (optab, vec_mode);
+  icode = optab ? (int) optab_handler (optab, vec_mode) : CODE_FOR_nothing;
+
+  if (icode == CODE_FOR_nothing
+      && code == MULT_HIGHPART_EXPR
+      && VECTOR_MODE_P (vec_mode)
+      && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN)
+{
+  /* If MULT_HIGHPART_EXPR isn't supported by the backend, see
+if we can emit VEC_WIDEN_MULT_{LO,HI}_EXPR followed by VEC_PERM_EXPR
+or builtin_mul_widen_{even,odd} followed by VEC_PERM_EXPR.  */
+  unsigned int prec = TYPE_PRECISION (TREE_TYPE (scalar_dest));
+  unsigned int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scalar_dest));
+  tree wide_type
+   = build_nonstandard_integer_type (prec * 2, unsignedp);
+  wide_vectype
+= get_same_sized_vectype (wide_type, vectype);
+
+  sel = XALLOCAVEC (unsigned char, nunits_in);
+  if (VECTOR_MODE_P (TYPE_MODE (wide_vectype))
+      && GET_MODE_SIZE (TYPE_MODE (wide_vectype))
+         == GET_MODE_SIZE (vec_mode))
+   {
+ if (targetm.vectorize.builtin_mul_widen_even
+     && (decl1 = targetm.vectorize.builtin_mul_widen_even (vectype))
+     && targetm.vectorize.builtin_mul_widen_odd
+     && (decl2 = targetm.vectorize.builtin_mul_widen_odd (vectype))
+     && TYPE_MODE (TREE_TYPE (TREE_TYPE (decl1)))
+        == TYPE_MODE (wide_vectype))
+   {
+ for (i = 0; i < nunits_in; i++)
+   sel[i] = !BYTES_BIG_ENDIAN + (i & ~1)
+            + ((i & 1) ? nunits_in : 0);
+ if (0 && can_vec_perm_p (vec_mode, false, sel))
+   icode = 0;
+   }
+ if (icode == CODE_FOR_nothing)
+   {
+ decl1 = NULL_TREE;
+ decl2 = 

Re: [wwwdocs] Update coding conventions for C++

2012-06-28 Thread Joseph S. Myers
On Wed, 27 Jun 2012, Lawrence Crowl wrote:

  +h4a name=Namespace_UseNamespaces/a/h4
  +
  +p
  +Namespaces are encouraged.
  +All separable libraries should have a unique global namespace.
  +All individual tools should have a unique global namespace.
  +Nested include directories names should map to nested namespaces when
  possible.
  +/p
 
  Do all people have a consensus on the use of namespace ?
 
 Well, we really only know about objections, and I have not seen any.

I certainly think namespaces are a useful feature to use in GCC (with a 
namespace for the gcc/ directory, or as you imply separate ones for the 
driver and the compilers proper, one for libcpp, one for each front end, 
etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote:
  Also, this scenario of silently deciding whether or not to use the
  linker plugin could bring us to different test results for the same
  command lines.  I don't like that.
 
 Right, which is why the static configuration of the host system at build
 time is forever after an invariant.  The linker is smelled, it doesn't
 support plugins, therefore we can't ever use it, therefore we never build
 it...

This test is not about whether to build the plugin, but whether to force
using it by default.  And to be able to use it by default, you need a
guarantee that all the linkers you'll use it with do support the plugin.
Therefore, if the build-time linker doesn't support it, I think it is just
fine to assume that not all of your linkers support the plugin and not
enable it by default.

Jakub


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 4:08 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote:
  Also, this scenario of silently deciding whether or not to use the
  linker plugin could bring us to different test results for the same
  command lines.  I don't like that.

 Right, which is why the static configuration of the host system at build
 time is forever after an invariant.  The linker is smelled, it doesn't
 support plugins, therefore we can't ever use it, therefore we never build
 it...

 THis test is not about whether to build the plugin, but whether to force
 using it by default.  And to be able to use it by default, you need a
 guarantee that all the linkers you'll use it with do support the plugin.
 Therefore, if the build-time linker doesn't support it, I think it is just
 fine not all of your linkers support the plugin and not enable it by
 default.

I'd like to have a more reliable way to enable/disable the default use
of the linker plugin then.  Something in config.gcc maybe, or at least
a flag I can specify at configure time.  If the default in config.gcc is
detected to not work then explicitly changing that (or confirming it)
would be required - otherwise we'd error out.

Richard.


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Eric Botcazou
 Probably suppress both, since they no longer make sense (they are testing
 an early implementation of 'Old, before 'Old was standardized in Ada 2012).

 I'll take care of it.

Thanks!

-- 
Eric Botcazou


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Arnaud Charlet
  Probably suppress both, since they no longer make sense (they are testing
  an early implementation of 'Old, before 'Old was standardized in Ada
  2012).
 
  I'll take care of it.
 
 Thanks!

Sure, done for the record (revision 189042).


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 07:05, Jakub Jelinek wrote:
 Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
 to pessimize the generated code for gcc.dg/vect/pr51581-3.c
 testcase (at least with -O3 -mavx) compared to when the hooks aren't
 present, because i?86 has more natural support for widen mult lo/hi
 compoared to widen mult even/odd, but I assume that on powerpc it is the
 other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
 and builtin_mul_widen_* are possible for the particular vectype which one
 will be cheaper?

I would assume that if the builtin exists, then it is cheaper.

I disagree that x86 has more natural support for hi/lo.  The basic sse2 
multiplication is even.  One shift per input is needed to generate odd.  On the 
other hand, one interleave per input is required for both hi/lo.  So 4 setup 
insns for hi/lo, and 2 setup insns for even/odd.  And on top of all that, XOP 
includes multiply odd at least for signed V4SI.
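
The even/odd route can be illustrated with a small scalar model.  This is only a sketch under stated assumptions, not the vectorizer code: it uses four unsigned 16-bit lanes, models the two widening multiplies and the permute with plain loops, and takes the selector formula from the patch under discussion.  The names build_selector, view_as_halves and mul_highpart are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

enum { NUNITS = 4 };

/* Selector formula from the patch: for each result lane pick the high
   half of the matching wide product, even lanes from the first permute
   input and odd lanes from the second.  */
static void
build_selector (unsigned char sel[NUNITS], int big_endian)
{
  for (int i = 0; i < NUNITS; i++)
    sel[i] = !big_endian + (i & ~1) + ((i & 1) ? NUNITS : 0);
}

/* View NUNITS/2 32-bit products as NUNITS 16-bit lanes, laid out as they
   would be in memory for the given byte order.  */
static void
view_as_halves (const uint32_t wide[NUNITS / 2], uint16_t out[NUNITS],
                int big_endian)
{
  for (int k = 0; k < NUNITS / 2; k++)
    {
      uint16_t lo = (uint16_t) wide[k];
      uint16_t hi = (uint16_t) (wide[k] >> 16);
      out[2 * k] = big_endian ? hi : lo;
      out[2 * k + 1] = big_endian ? lo : hi;
    }
}

/* High 16 bits of each 16x16 product, computed with one even and one odd
   widening multiply followed by a single two-input permute.  */
static void
mul_highpart (const uint16_t a[NUNITS], const uint16_t b[NUNITS],
              uint16_t r[NUNITS], int big_endian)
{
  uint32_t even[NUNITS / 2], odd[NUNITS / 2];
  uint16_t lanes[2 * NUNITS];
  unsigned char sel[NUNITS];

  for (int k = 0; k < NUNITS / 2; k++)
    {
      even[k] = (uint32_t) a[2 * k] * b[2 * k];
      odd[k] = (uint32_t) a[2 * k + 1] * b[2 * k + 1];
    }

  /* Concatenate the two permute inputs: even products first, then odd.  */
  view_as_halves (even, lanes, big_endian);
  view_as_halves (odd, lanes + NUNITS, big_endian);

  build_selector (sel, big_endian);
  for (int i = 0; i < NUNITS; i++)
    r[i] = lanes[sel[i]];
}
```

For four lanes on a little-endian layout the selector comes out as {1, 5, 3, 7}, so a single permute is enough to gather the high halves once the even and odd products exist.  The signed case and the cost of producing the odd products are outside this sketch.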

I'll have a look at the test case you mention while I re-look at the patches...


r~


Re: [PATCH, GCC][AArch64] Use Enums for code models option selection

2012-06-28 Thread Tejas Belagod

Tejas Belagod wrote:

Marcus Shawcroft wrote:

On 13/06/12 14:38, Sofiane Naci wrote:

Hi,

I discovered a bug in my previous patch, so I attach a new one.
The ChangeLog hasn't changed.
OK to commit?

Thanks
Sofiane


-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]

On

Behalf Of Sofiane Naci
Sent: 31 May 2012 10:55
To: gcc-patches@gcc.gnu.org
Subject: [PATCH, GCC][AArch64] Use Enums for code models option selection

Hi,

This patch re-factors code models option selection in the AArch64 port:

  . Renaming variables such as mem_model to cmodel, for better clarity.
  . Using the generic support for enumerated option arguments.
  . Fixing touched code layout and formatting issues.

Thanks
Sofiane

-

ChangeLog:

2012-05-31  Sofiane Naci  sofiane.n...@arm.com

[AArch64] Use Enums for code models option selection.

* config/aarch64/aarch64-elf-raw.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-linux.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-opts.h (enum aarch64_code_model): New.
* config/aarch64/aarch64-protos.h: Update comments.
* config/aarch64/aarch64.c: Update comments.
(aarch64_default_mem_model): Rename to aarch64_code_model.
(aarch64_expand_mov_immediate): Remove error message.
(aarch64_select_rtx_section): Remove assertion and update comment.
(aarch64_override_options): Move memory model initialization from
here.
(struct aarch64_mem_model): Delete.
(aarch64_memory_models[]): Delete.
(initialize_aarch64_memory_model): Rename to
initialize_aarch64_code_model
and update.
(aarch64_classify_symbol): Handle AARCH64_CMODEL_TINY and
AARCH64_CMODEL_TINY_PIC
* config/aarch64/aarch64.h
(enum aarch64_memory_model): Delete.
(aarch64_default_mem_model): Rename to aarch64_cmodel.
(HAS_LONG_COND_BRANCH): Update.
(HAS_LONG_UNCOND_BRANCH): Update.
* config/aarch64/aarch64.opt
(cmodel): New.
(mcmodel): Update.

OK




I've checked this in on aarch64-branch upstream for Sofiane.

Tejas.


Sorry, I broke the build when I applied this patch. Attached is a patch that 
fixes this. Build and regressions are happy. OK to commit?


Thanks,
Tejas Belagod.
ARM.

Changelog

2012-06-28  Tejas Belagod  tejas.bela...@arm.com

gcc/
* config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ce2f899..5e24cd7 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -802,7 +802,7 @@ enum aarch64_builtins
 /* Check TLS Descriptors mechanism is selected.  */
 #define TARGET_TLS_DESC (aarch64_tls_dialect == TLS_DESCRIPTORS)
 
-extern enum aarch64_memory_model aarch64_cmodel;
+extern enum aarch64_code_model aarch64_cmodel;
 
 /* When using the tiny addressing model conditional and unconditional branches
can span the whole of the available address space (1MB).  */

[lra] trunk merged into the branch

2012-06-28 Thread Vladimir Makarov

I merged trunk at 188913 into lra branch.  Some changes were required to make 
lra branch bootstrapped on x86/x86-64 and ppc.

2012-06-23  Vladimir Makarov  vmaka...@redhat.com

* lra.c (check_rtl): Add arg to insn_invalid_p call.

* lra-assigns.c (init_regno_assign_info): Use
ira_class_hard_regs_num instead of ira_available_class_regs.
(reload_pseudo_compare_func): Ditto.

* lra-constraints.c (extract_loc_address_regs): Set up disp_loc
first.  Transfer true for context_p only when base_reg_loc is
defined.  Add processing UNSPEC.
(process_addr_reg): Reload always for non-reg.
(equiv_address_substitution): Add arg to plus_constant calls.
(curr_insn_transform): Don't process addresses for operators.
Change duplication updates.
(inherit_reload_reg): Use ira_class_hard_regs_num instead of
ira_available_class_regs.

* lra-eliminations.c (for_sum, lra_eliminate_regs_1): Add arg to
plus_constant calls.
(eliminate_regs_in_insn): Ditto.

2012-06-25  Vladimir Makarov  vmaka...@redhat.com

* output.h (alter_subreg): Add new argument.

* sdbout.c (sdbout_symbol): Pass new argument to alter_subreg.

* dbxout.c (dbxout_symbol_location): Ditto.

* final.c (final_scan_insn, cleanup_subreg_operands): Ditto.
(walk_alter_subreg, output_operand): Ditto.
(alter_subreg): Add new argument.

* emit-rtl.c (gen_rtx_REG): Add lra_in_progress.

* config/rs6000/rs6000.c (rs6000_legitimate_offset_address_p):
Always pass true to legitimate_constant_pool_address_p when
lra_in_progress.
(rs6000_legitimate_address_p): Ditto.

* lra-int.h (lra_update_operator_dups): New.

* lra.c (lra): Put lra_in_progress after
lra_hard_reg_substitution.

* lra-spills.c (lra_hard_reg_substitution): Pass new argument to
alter_subreg.  Call lra_update_operator_dups.

* lra-eliminations.c (lra_eliminate_regs_1):  Pass new argument to
alter_subreg.

* lra-constraints.c (simplify_operand_subreg): Ditto.
(curr_insn_transform): Use lra_update_operator_dups.




Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 08:57:23AM -0700, Richard Henderson wrote:
 On 2012-06-28 07:05, Jakub Jelinek wrote:
  Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
  to pessimize the generated code for gcc.dg/vect/pr51581-3.c
  testcase (at least with -O3 -mavx) compared to when the hooks aren't
  present, because i?86 has more natural support for widen mult lo/hi
  compared to widen mult even/odd, but I assume that on powerpc it is the
  other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
  and builtin_mul_widen_* are possible for the particular vectype which one
  will be cheaper?
 
 I would assume that if the builtin exists, then it is cheaper.
 
 I disagree that x86 has more natural support for hi/lo.  The basic sse2
 multiplication is even.  One shift per input is needed to generate odd. 
 On the other hand, one interleave per input is required for both hi/lo. 
 So 4 setup insns for hi/lo, and 2 setup insns for even/odd.  And on top of
 all that, XOP includes multiply odd at least for signed V4SI.

Perhaps the problem is then that the permutation is much more expensive
for even/odd.  With even/odd the f2 routine is:
vmovdqa d(%rip), %xmm2
vmovdqa .LC1(%rip), %xmm0
vpsrlq  $32, %xmm2, %xmm4
vmovdqa d+16(%rip), %xmm1
vpmuludq %xmm0, %xmm2, %xmm5
vpsrlq  $32, %xmm0, %xmm3
vpmuludq %xmm3, %xmm4, %xmm4
vpmuludq %xmm0, %xmm1, %xmm0
vmovdqa .LC2(%rip), %xmm2
vpsrlq  $32, %xmm1, %xmm1
vpmuludq %xmm3, %xmm1, %xmm3
vmovdqa .LC3(%rip), %xmm1
vpshufb %xmm2, %xmm5, %xmm5
vpshufb %xmm1, %xmm4, %xmm4
vpshufb %xmm2, %xmm0, %xmm2
vpshufb %xmm1, %xmm3, %xmm1
vpor    %xmm4, %xmm5, %xmm4
vpor    %xmm1, %xmm2, %xmm1
vpsrld  $1, %xmm4, %xmm4
vmovdqa %xmm4, c(%rip)
vpsrld  $1, %xmm1, %xmm1
vmovdqa %xmm1, c+16(%rip)
ret
and with lo/hi it is:
vmovdqa d(%rip), %xmm2
vpunpckhdq  %xmm2, %xmm2, %xmm3
vpunpckldq  %xmm2, %xmm2, %xmm2
vmovdqa .LC1(%rip), %xmm0
vpmuludq %xmm0, %xmm3, %xmm3
vmovdqa d+16(%rip), %xmm1
vpmuludq %xmm0, %xmm2, %xmm2
vshufps $221, %xmm2, %xmm3, %xmm2
vpsrld  $1, %xmm2, %xmm2
vmovdqa %xmm2, c(%rip)
vpunpckhdq  %xmm1, %xmm1, %xmm2
vpunpckldq  %xmm1, %xmm1, %xmm1
vpmuludq %xmm0, %xmm2, %xmm2
vpmuludq %xmm0, %xmm1, %xmm0
vshufps $221, %xmm0, %xmm2, %xmm0
vpsrld  $1, %xmm0, %xmm0
vmovdqa %xmm0, c+16(%rip)
ret

Jakub
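
For readers following the even/odd versus lo/hi discussion, here is a minimal scalar sketch in C of the even/odd route described in the quoted text: the basic multiply handles the even lanes, and one shift per input turns the odd lanes into even ones.  The function names are illustrative only (they are not GCC hooks or intrinsics), and the code models the lane arithmetic, not instruction cost.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of pmuludq: multiply 32-bit lanes 0 and 2 of each input,
   producing two 64-bit products.  */
static void
widen_mult_even (const uint32_t a[4], const uint32_t b[4], uint64_t prod[2])
{
  prod[0] = (uint64_t) a[0] * b[0];
  prod[1] = (uint64_t) a[2] * b[2];
}

/* Scalar model of psrlq $32: within each 64-bit half, move the odd lane
   into the even position and clear the odd lane.  */
static void
shift_odd_to_even (const uint32_t in[4], uint32_t out[4])
{
  out[0] = in[1];
  out[1] = 0;
  out[2] = in[3];
  out[3] = 0;
}

/* Odd products: one shift per input, then the same even multiply.  */
static void
widen_mult_odd (const uint32_t a[4], const uint32_t b[4], uint64_t prod[2])
{
  uint32_t sa[4], sb[4];
  shift_odd_to_even (a, sa);
  shift_odd_to_even (b, sb);
  widen_mult_even (sa, sb, prod);
}

/* High 32 bits of every lane-wise product, assembled from the even and
   odd widening multiplies.  */
static void
mult_highpart (const uint32_t a[4], const uint32_t b[4], uint32_t hi[4])
{
  uint64_t even[2], odd[2];
  widen_mult_even (a, b, even);
  widen_mult_odd (a, b, odd);
  hi[0] = (uint32_t) (even[0] >> 32);
  hi[1] = (uint32_t) (odd[0] >> 32);
  hi[2] = (uint32_t) (even[1] >> 32);
  hi[3] = (uint32_t) (odd[1] >> 32);
}
```

The final reassembly of hi[] is the step that shows up as the permutation in the listings above; the sketch does it with plain array stores and says nothing about which permutation is cheaper.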


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread H.J. Lu
On Thu, Jun 28, 2012 at 8:57 AM, Richard Henderson r...@redhat.com wrote:
 On 2012-06-28 07:05, Jakub Jelinek wrote:
 Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
 to pessimize the generated code for gcc.dg/vect/pr51581-3.c
 testcase (at least with -O3 -mavx) compared to when the hooks aren't
 present, because i?86 has more natural support for widen mult lo/hi
 compared to widen mult even/odd, but I assume that on powerpc it is the
 other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
 and builtin_mul_widen_* are possible for the particular vectype which one
 will be cheaper?

 I would assume that if the builtin exists, then it is cheaper.

 I disagree that x86 has more natural support for hi/lo.  The basic sse2 
 multiplication is even.  One shift per input is needed to generate odd.  On 
 the other hand, one interleave per input is required for both hi/lo.  So 4 
 setup insns for hi/lo, and 2 setup insns for even/odd.  And on top of all 
 that, XOP includes multiply odd at least for signed V4SI.

 I'll have a look at the test case you mention while I re-look at the 
 patches...


The upper 128-bit of 256-bit AVX instructions aren't a good fit with the
current vectorizer infrastructure.


-- 
H.J.


Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-28 Thread Janis Johnson
On 06/27/2012 05:05 PM, Mike Stump wrote:
 On Jun 27, 2012, at 3:36 PM, Janis Johnson wrote:
 These scans from gcc.dg/vect/vect-50.c, and others similar to them in
 other vect tests, hurt my brain:

 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
 "vect" { xfail { vect_no_align } } } }  */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
 "vect" { target vect_hw_misalign } } } */

 Both of these PASS for i686-pc-linux-gnu, causing duplicate lines in the
 gcc test summary.  I'm pretty sure the following accomplishes the same
 goal:

 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
 
 I don't think so?  The first sets the xfail status for the testcase.  If you 
 change the condition, you can't the xfail state for some targets, which would 
 be wrong (without a vec person chiming in).

The two checks are run separately.  The first one runs everywhere and
is expected to fail for vect_no_align.  The second is only run for
vect_hw_misalign.  Targets for which vect_no_align is false and
vect_hw_misalign is true get two PASS reports.

 I'd like to think you can compose the two with some spelling...  I just don't 
 think this one is it.?

No, there is no way to combine target and xfail, although since we
intercept them we could presumably come up with a way to do that, with
syntax and semantics we design.

 I grepped around and found:
 
   /* { dg-message does break strict-aliasing  { target { *-*-*  lp64 } 
 xfail *-*-* } 8 } */
 
 which might have the right way to spell it, though, I always test to ensure 
 the construct does what I want.

Nope.  That should be flagged as an error by dg-message but it's
passed through GCC's process-message which ignores errors (a bug)
and simply ignores the directive.  I'm currently trying a fix to
not ignore errors from dg-error/dg-warning/dg-message and will then
fix up the broken tests.

 That is, run the check everywhere
 
 We don't want to run the test on other than vect_hw_misalign targets, right?
 

I don't know, but right now it's run everywhere at least once.

Janis
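
The reporting behaviour described in this message can be modelled with a small C sketch.  It is only an illustration of the semantics as stated above (the first scan runs everywhere and is expected to fail for vect_no_align, the second runs only for vect_hw_misalign); the struct and function names are made up and have nothing to do with the DejaGnu implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* The two effective-target properties involved in the discussion.  */
struct target
{
  bool vect_no_align;
  bool vect_hw_misalign;
};

enum expectation { NOT_RUN, EXPECT_PASS, EXPECT_FAIL };

/* First original scan: runs everywhere, xfail for vect_no_align.  */
static enum expectation
scan_xfail_no_align (struct target t)
{
  return t.vect_no_align ? EXPECT_FAIL : EXPECT_PASS;
}

/* Second original scan: only runs for vect_hw_misalign targets.  */
static enum expectation
scan_target_hw_misalign (struct target t)
{
  return t.vect_hw_misalign ? EXPECT_PASS : NOT_RUN;
}

/* Proposed single scan: runs everywhere, xfail only when the target has
   vect_no_align and lacks vect_hw_misalign.  */
static enum expectation
scan_combined (struct target t)
{
  return (t.vect_no_align && !t.vect_hw_misalign) ? EXPECT_FAIL : EXPECT_PASS;
}

/* Number of result lines the two original scans produce for a target.  */
static int
original_report_count (struct target t)
{
  return (scan_xfail_no_align (t) != NOT_RUN)
	 + (scan_target_hw_misalign (t) != NOT_RUN);
}
```

For a target where vect_no_align is false and vect_hw_misalign is true, original_report_count gives 2, which is the duplicate PASS line in the summary.  Whether the combined form has the right expectation for a target with both properties set is exactly the open question in the thread; the sketch does not settle it.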


Re: [PATCH, GCC][AArch64] Use Enums for code models option selection

2012-06-28 Thread Richard Earnshaw
On 28/06/12 16:58, Tejas Belagod wrote:
 
 Sorry, I broke the build when I applied this patch. Attached is a patch that 
 fixes this. Build and regressions are happy. OK to commit?
 
 Thanks,
 Tejas Belagod.
 ARM.
 
 Changelog
 
 2012-06-28  Tejas Belagod  tejas.bela...@arm.com
 
 gcc/
   * config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name.
 
 

OK.

R.




Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 09:20, Jakub Jelinek wrote:
 Perhaps the problem is then that the permutation is much more expensive
 for even/odd.  With even/odd the f2 routine is:
...
 vpshufb %xmm2, %xmm5, %xmm5
 vpshufb %xmm1, %xmm4, %xmm4
 vpor    %xmm4, %xmm5, %xmm4
...
 and with lo/hi it is:
 vshufps $221, %xmm2, %xmm3, %xmm2

Hmm.  That second has a reformatting delay.

Last week when I pulled the mulv4si3 routine out to i386.c,
I experimented with a few different options, including that
interleave+shufps sequence seen here for lo/hi.  See the 
comment there discussing options and timing.

This also shows a deficiency in our vec_perm logic:

0L 0H 2L 2H 1L 1H 3L 3H
0H 2H 0H 2H 1H 3H 1H 3H 2*pshufd
0H 1H 2H 3H punpckldq

without the permutation constants in memory.


r~
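
The three-row diagram in this message can be checked with a scalar model in C.  This is a sketch under the assumption that the first row is two vectors, {0L 0H 2L 2H} and {1L 1H 3L 3H}; the helper functions imitate the lane behaviour of pshufd and punpckldq but are ordinary C functions written for this example.  The selector 0xDD (decimal 221) picks lanes 1, 3, 1, 3, and is the same immediate that appears on vshufps in the lo/hi listing.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of pshufd: two selector bits per destination lane.  */
static void
pshufd (const uint32_t src[4], unsigned imm8, uint32_t dst[4])
{
  uint32_t tmp[4];
  for (int i = 0; i < 4; i++)
    tmp[i] = src[(imm8 >> (2 * i)) & 3];
  for (int i = 0; i < 4; i++)
    dst[i] = tmp[i];
}

/* Scalar model of punpckldq: interleave the low two lanes of A and B.  */
static void
punpckldq (const uint32_t a[4], const uint32_t b[4], uint32_t dst[4])
{
  uint32_t tmp[4] = { a[0], b[0], a[1], b[1] };
  for (int i = 0; i < 4; i++)
    dst[i] = tmp[i];
}

/* EVEN holds {0L, 0H, 2L, 2H} and ODD holds {1L, 1H, 3L, 3H}.
   Produces {0H, 1H, 2H, 3H} with two pshufd and one punpckldq.  */
static void
gather_high_halves (const uint32_t even[4], const uint32_t odd[4],
		    uint32_t hi[4])
{
  uint32_t e[4], o[4];
  pshufd (even, 0xDD, e);	/* 0H 2H 0H 2H */
  pshufd (odd, 0xDD, o);	/* 1H 3H 1H 3H */
  punpckldq (e, o, hi);		/* 0H 1H 2H 3H */
}
```

Both selectors are immediates, so this sequence needs no permutation constants loaded from memory, which is the point made above.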


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 07:05, Jakub Jelinek wrote:
   PR tree-optimization/51581
   * tree-vect-stmts.c (permute_vec_elements): Add forward decl.
   (vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR
   also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus
   VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
   * tree-vect-patterns.c (vect_recog_divmod_pattern): Use
   MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.
 
   * gcc.dg/vect/pr51581-4.c: New test.

Ok, except,

 +   if (0 && can_vec_perm_p (vec_mode, false, sel))
 + icode = 0;

Testing hack left in.


r~


[C++ Pubnames Patch] Anonymous namespaces enclosed in named namespaces. (issue6343052)

2012-06-28 Thread Sterling Augustine
The enclosed patch adds a fix for the pubnames of anonymous namespaces contained
within named namespaces, and adds an extensive test for the various pubnames.

The bug is that when printing at verbosity level 1, and lang_decl_name sees a
namespace decl that is not in the global namespace, it prints the namespace's
enclosing scopes--so far so good. However, the code I added earlier this month
to handle anonymous namespaces also prints the enclosing scopes, so one would
get foo::foo::(anonymous namespace) instead of foo::(anonymous namespace).

The solution is to stop the added code from printing the enclosing scope, which
is correct for both verbosity levels 0 and 1. Level 2 is handled elsewhere and
so not relevant.
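
The double prefix described above can be reproduced with a toy model in C.  This is only a sketch of the mechanism as explained in this message: the struct, the function names and the `unqualified` flag are hypothetical stand-ins, not the code in cp/error.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A namespace; NAME is NULL for an anonymous namespace, PARENT is NULL
   when the namespace sits directly in the global namespace.  */
struct scope
{
  const char *name;
  const struct scope *parent;
};

/* Append S to BUF without overflowing SIZE bytes.  */
static void
append (char *buf, size_t size, const char *s)
{
  size_t used = strlen (buf);
  if (used < size - 1)
    snprintf (buf + used, size - used, "%s", s);
}

/* Append "outer::inner::" for S and all of its enclosing scopes.  */
static void
append_enclosing (char *buf, size_t size, const struct scope *s)
{
  if (s == NULL)
    return;
  append_enclosing (buf, size, s->parent);
  append (buf, size, s->name ? s->name : "(anonymous namespace)");
  append (buf, size, "::");
}

/* Stand-in for printing the decl itself.  Without UNQUALIFIED it prints
   the enclosing scopes again, which is the source of the duplication.  */
static void
append_decl (char *buf, size_t size, const struct scope *s, int unqualified)
{
  if (!unqualified)
    append_enclosing (buf, size, s->parent);
  append (buf, size, s->name ? s->name : "(anonymous namespace)");
}

/* Stand-in for the verbosity-1 path: the caller prints the enclosing
   scopes, then asks for the decl to be printed.  */
static void
format_name (char *buf, size_t size, const struct scope *s, int unqualified)
{
  buf[0] = '\0';
  append_enclosing (buf, size, s->parent);
  append_decl (buf, size, s, unqualified);
}
```

With a namespace foo containing an anonymous namespace, format_name yields "foo::foo::(anonymous namespace)" when unqualified is 0 and "foo::(anonymous namespace)" when it is 1.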

I have formalized the tests I have been using to be sure pubnames are correct
and include that in this patch. It is based on ccoutant's gdb_index_test.cc from
the gold test suite.

OK for mainline?

Sterling


gcc/cp/ChangeLog

2012-06-28  Sterling Augustine  saugust...@google.com

* error.c (lang_decl_name): Use TFF_UNQUALIFIED_NAME flag.

gcc/testsuite/ChangeLog

2012-06-28  Sterling Augustine  saugust...@google.com

* g++.dg/debug/dwarf2/pubnames-2.C: New.

Index: cp/error.c
===
--- cp/error.c  (revision 189025)
+++ cp/error.c  (working copy)
@@ -2633,7 +2633,7 @@
 dump_function_name (decl, TFF_PLAIN_IDENTIFIER);
   else if ((DECL_NAME (decl) == NULL_TREE)
            && TREE_CODE (decl) == NAMESPACE_DECL)
-dump_decl (decl, TFF_PLAIN_IDENTIFIER);
+dump_decl (decl, TFF_PLAIN_IDENTIFIER | TFF_UNQUALIFIED_NAME);
   else
 dump_decl (DECL_NAME (decl), TFF_PLAIN_IDENTIFIER);

Index: testsuite/g++.dg/debug/dwarf2/pubnames-2.C
===
--- testsuite/g++.dg/debug/dwarf2/pubnames-2.C  (revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pubnames-2.C  (revision 0)
@@ -0,0 +1,194 @@
+// { dg-do compile }
+// { dg-options -gpubnames -gdwarf-4 -std=c++0x -dA }
+// { dg-final { scan-assembler .section\t.debug_pubnames } }
+// { dg-final { scan-assembler \\\(anonymous namespace\\)0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \one::G_A0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::G_B0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::G_C0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::\\(anonymous namespace\\)0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \F_A0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \F_B0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \F_C0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \inline_func_10\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::c1::c10\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::c1::~c10\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \one::c1::val0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \check_enum0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \main0\+\[ \t\]+\[#;]+\[ \t\]+external 
name } }
+// { dg-final { scan-assembler \two::c2int::c20\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::c20\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::c20\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checkone::c10\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2int \\0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2double \\0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2int const\\\* \\0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int::val0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::val0\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::val0\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler 
\__static_initialization_and_destruction_00\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2int::~c20\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::~c20\+\[ \t\]+\[#;]+\[ 
\t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::~c20\+\[ 
\t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { 

[lra] a patch to fix last testsuite regression on x86/x86-64

2012-06-28 Thread Vladimir Makarov
The following patch fixes the last GCC testsuite regression (in comparison 
with reload) on x86/x86-64 after the last merge of trunk into lra.


The patch implements Bernd's recent optimization (restoring an 
argument pseudo value from the call result) in LRA.
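
As a source-level illustration of the idea (a sketch, not taken from the patch): assume a call such as memset, which returns its first argument.  A pointer that is needed again after the call does not have to be saved and restored around it, because the same value is available in the call result.  The two functions below are hand-written before/after forms with made-up names; the real transformation happens on RTL inside LRA.

```c
#include <assert.h>
#include <string.h>

/* As written by the user: DST is live across the call.  */
static char *
clear_and_tag (char *dst, size_t n)
{
  assert (n > 0);
  memset (dst, 0, n);		/* The call also returns DST.  */
  dst[0] = 'x';			/* DST is needed again here.  */
  return dst;
}

/* Hand-transformed equivalent: the value of DST after the call is taken
   from the call result instead of a copy kept across the call.  */
static char *
clear_and_tag_restored (char *dst, size_t n)
{
  assert (n > 0);
  char *from_result = memset (dst, 0, n);
  from_result[0] = 'x';
  return from_result;
}
```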


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 189051.

2012-06-28  Vladimir Makarov vmaka...@redhat.com

* lra-constraints.c (inherit_in_ebb): Implement restoring argument
pseudo value from the call result.


Index: lra-constraints.c
===
--- lra-constraints.c	(revision 189016)
+++ lra-constraints.c	(working copy)
@@ -4344,7 +4344,7 @@ inherit_in_ebb (rtx head, rtx tail)
 {
   int i, src_regno, dst_regno;
   bool change_p, succ_p;
-  rtx prev_insn, next_usage_insns, set,  first_insn, last_insn, next_insn;
+  rtx prev_insn, next_usage_insns, set, first_insn, last_insn, next_insn;
   enum reg_class cl;
   struct lra_insn_reg *reg;
   basic_block last_processed_bb, curr_bb = NULL;
@@ -4354,7 +4354,6 @@ inherit_in_ebb (rtx head, rtx tail)
   bitmap_iterator bi;
   bool head_p, after_p;
 
-
   change_p = false;
   curr_usage_insns_check++;
   reloads_num = calls_num = 0;
@@ -4536,7 +4535,41 @@ inherit_in_ebb (rtx head, rtx tail)
   to_inherit[i].insns))
 	  change_p = true;
 	  if (CALL_P (curr_insn))
-	calls_num++;
+	{
+	  rtx cheap, pat, dest, restore;
+	  int regno, hard_regno;
+
+	  calls_num++;
+	  if ((cheap = find_reg_note (curr_insn,
+	  REG_RETURNED, NULL_RTX)) != NULL_RTX
+		  && ((cheap = XEXP (cheap, 0)), true)
+		  && (regno = REGNO (cheap)) >= FIRST_PSEUDO_REGISTER
+		  && (hard_regno = reg_renumber[regno]) >= 0
+		  /* If there are pending saves/restores, the
+		 optimization is not worth.  */
+		  && usage_insns[regno].calls_num == calls_num - 1
+		  && TEST_HARD_REG_BIT (call_used_reg_set, hard_regno))
+		{
+		  /* Restore the pseudo from the call result as
+		 REG_RETURNED note says that the pseudo value is
+		 in the call result and the pseudo is an argument
+		 of the call.  */
+		  pat = PATTERN (curr_insn);
+		  if (GET_CODE (pat) == PARALLEL)
+		pat = XVECEXP (pat, 0, 0);
+		  dest = SET_DEST (pat);
+		  start_sequence ();
+		  emit_move_insn (cheap, copy_rtx (dest));
+		  restore = get_insns ();
+		  end_sequence ();
+		  lra_process_new_insns (curr_insn, NULL, restore,
+	 "Inserting call parameter restore");
+		  /* We don't need to save/restore of the pseudo from
+		 this call.  */
+		  usage_insns[regno].calls_num = calls_num;
+		  bitmap_set_bit (check_only_regs, regno);
+		}
+	}
 	  to_inherit_num = 0;
 	  /* Process insn usages.  */
 	  for (reg = curr_id->regs; reg != NULL; reg = reg->next)


[PATCH][RFC, Reload]. Reload bug?

2012-06-28 Thread Tejas Belagod


Hi,

Attached is a fix for what seems to be a reload bug while handling 
subreg(mem...). I ran into this problem while implementing support for struct 
load/store in AArch64 using the standard patterns 
vec_load_lanes<large_int_mode><vec_mode> and 
vec_store_lanes<large_int_mode><vec_mode>, along the same lines as the ARM 
backend. The test case that caused the issue was:


void SexiALI_Convert(void *vdest, void *vsrc, unsigned int frames, int n)
{
 unsigned int x;
 short *src = vsrc;
 unsigned char *dest = vdest;
 for(x=0;x<256;x++)
 {
  int tmp;
  tmp = *src;
  src++;
  tmp += *src;
  src++;
  *dest++ = tmp;
 }
}

Before reload, this is the RTL dump I see:

.
(insn 110 114 111 4 (set (reg:V8HI 158 [ vect_var_.21 ])
(subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (nil))

(insn 111 110 115 4 (set (reg:V8HI 159 [ vect_var_.22 ])
(subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (expr_list:REG_DEAD (reg:OI 530 [ vect_array.20 ])
(nil)))

(insn 115 111 116 4 (set (reg:V8HI 161 [ vect_var_.24 ])
(subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (nil))

(insn 116 115 117 4 (set (reg:V8HI 162 [ vect_var_.25 ])
(subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (expr_list:REG_DEAD (reg:OI 529 [ vect_array.23 ])
(nil)))

(insn 117 116 118 4 (set (reg:V4SI 544 [ vect_var_.27 ])
(sign_extend:V4SI (vec_select:V4HI (reg:V8HI 159 [ vect_var_.22 ])
(parallel:V8HI [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
] ice.i:11 700 {aarch64_simd_vec_unpacks_lo_v8hi}
 (nil))

(insn 118 117 124 4 (set (reg:V4SI 545 [ vect_var_.26 ])
(sign_extend:V4SI (vec_select:V4HI (reg:V8HI 158 [ vect_var_.21 ])
(parallel:V8HI [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
] ice.i:9 700 {aarch64_simd_vec_unpacks_lo_v8hi}
 (nil))

.

In insn 116, reg_equiv_mem () of the pseudoreg 529 is (mem:OI (reg sp)), and the 
subreg is equivalent to:

(subreg:V8HI (mem:OI (reg sp)) 16)
which does not get folded into
(mem:V8HI (plus:DI (reg sp) (const_int 16)))
because, in reload.c:find_reloads_toplev (), where such subregs are narrowed into
narrower memrefs, the memref supplied to strict_memory_address_addr_space_P () is
just (mem:OI (reg sp)) and the SUBREG_BYTE is forgotten. Therefore
strict_memory_address_addr_space_P () thinks that (mem:OI (reg sp)) is a
valid target address, lets it pass as a subreg, and does not narrow the subreg
into a narrower memref. find_reloads_toplev () should in fact have given
strict_memory_address_addr_space_P () (mem:OI (plus:DI (reg sp) (const_int 16))),
for which it returns false because base+offset is invalid for NEON addressing
modes, and the subreg would then be reloaded into a narrower memref.

Also, I tried writing a secondary reload for this, but at no time is the RTL
(subreg:V8HI (mem:OI (reg sp)) 16)
available to the target secondary reload for it to fix it up.

Therefore, I've fixed find_reloads_toplev () to pass the full address to 
strict_memory_address_addr_space_P () in the case of subregs.
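
The difference between the two checks can be sketched in C.  Everything here is a simplified stand-in: the address is reduced to a byte offset from a base register, and `base_only_address_p` is a hypothetical predicate that accepts only a plain base register, which is the restriction described above for these NEON accesses.  It is not the reload.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical target check: only (reg) is valid, (reg + offset) is not.  */
static bool
base_only_address_p (long offset)
{
  return offset == 0;
}

/* A (subreg (mem (base + mem_offset)) subreg_byte) reduced to offsets.  */
struct subreg_of_mem
{
  long mem_offset;
  long subreg_byte;
};

/* Check as described for the existing code: the SUBREG_BYTE is not part
   of the address being validated.  */
static bool
needs_narrowing_old (struct subreg_of_mem x)
{
  return !base_only_address_p (x.mem_offset);
}

/* Check as described for the fix: validate the full address, including
   the SUBREG_BYTE.  */
static bool
needs_narrowing_fixed (struct subreg_of_mem x)
{
  return !base_only_address_p (x.mem_offset + x.subreg_byte);
}
```

For the insn 116 case, { 0, 16 }, the old check reports that nothing needs doing while the fixed check asks for the subreg to be narrowed; for a subreg at byte 0 both agree.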


Does this look like a sane fix?

I've tested this patch on arm-none-eabi and bootstrapped on x86_64-pc-linux and
all is well.

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-06-28  Tejas Belagod  tejas.bela...@arm.com

gcc/
* reload.c (find_reloads_toplev): Include the subreg byte in the address
of memrefs when converting subregs of mems into narrower memrefs.

diff --git a/gcc/reload.c b/gcc/reload.c
index e42cc5c..b6d4ce9 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -4771,15 +4771,27 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type type,
 #ifdef LOAD_EXTEND_OP
   !paradoxical_subreg_p (x)
 #endif
-  (reg_equiv_address (regno) != 0
- || (reg_equiv_mem (regno) != 0
-  (! strict_memory_address_addr_space_p
- (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
-  MEM_ADDR_SPACE (reg_equiv_mem (regno)))
- || ! offsettable_memref_p (reg_equiv_mem (regno))
- || num_not_at_initial_offset
-   x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
-  insn, address_reloaded);
+)
+   {
+ if (reg_equiv_address (regno) != 0)
+   x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
+insn, address_reloaded);
+ else if (reg_equiv_mem (regno) != 0)
+   {
+ tem =
+   

Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Andrew Pinski
On Thu, Jun 28, 2012 at 6:50 AM, Matthew Gretton-Dann
matthew.gretton-d...@arm.com wrote:
 On 28/06/12 14:38, Mike Stump wrote:

 On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann

 matthew.gretton-d...@arm.com wrote:

 On 27/06/12 21:35, Andrew Pinski wrote:

 On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
 matthew.gretton-d...@arm.com wrote:

 All,

 This patch enables the dump-noaddr test to work in out-of-build-tree
 testing.

 [snip]


 I created a much simpler patch which I have been meaning to submit.
 I attached it for reference.


 Thanks,
 Andrew Pinski

 ChangeLog:
 * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
 an absolute dump base instead of a relative one.

 Index: gcc.c-torture/unsorted/dump-noaddr.x
 ===
 --- gcc.c-torture/unsorted/dump-noaddr.x    (revision 61452)
 +++ gcc.c-torture/unsorted/dump-noaddr.x    (revision 61453)
 @@ -11,10 +11,10 @@ proc dump_compare { src options } {
      foreach option $option_list {
      file delete -force dump1
      file mkdir dump1
 -    c-torture-compile $src $option $options -dumpbase dump1/$dumpbase
 -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all
 -fdump-tree-all -fdump-noaddr
 +    c-torture-compile $src $option $options -dumpbase
 [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1
 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
      file delete -force dump2
      file mkdir dump2
 -    c-torture-compile $src $option $options -dumpbase dump2/$dumpbase
 -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
 +    c-torture-compile $src $option $options -dumpbase
 [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all
 -fdump-tree-all -fdump-noaddr
      foreach dump1 [lsort [glob -nocomplain dump1/*]] {
          regsub dump1/ $dump1 dump2/ dump2
          set dumptail gcc.c-torture/unsorted/[file tail $dump1]


 What I don't like about this approach is that dump1 and dump2 are

 created in the current working directory.


 On vxworks as I recall we did a cd to tmpdir, is that generally true?
 Also, if one telnets in or sshes into the host under test, the cd is
 mandatory... as otherwise one would dump turds (that's a technical term)
 in the home directory which would be very uncool.  Maybe a better
 approach would be to cd to the right place if all the Canadian setups cd,
 as that then unifies them.

 With out of build-tree testing this may not (I believe) be the same as
 $tmpdir (where temporaries are normally created).  Also the current
 directory may already contain directories/files called dump1 or dump2
 which will get destroyed by running the


 The point of the cd was to get to a place where temps can be created
 freely...

 I've not committed my version yet in case I am missing something in my
 reasoning above with regards to the relationship between the current
 working directory and $tmpdir.


 So the question would be, does his patch work for you?  It was unclear to
 me if the answer is no.


 Sorry - the patch works for my use case (build==host), but I was concerned
 over the use of [pwd] vs $tmpdir.

Both will work in the case of build==host.  I don't even know if we
really support build!=host testing at all.  I have never seen it done
and I have no idea how to control it via dejagnu.  Has anyone tested
build!=host recently?

Thanks,
Andrew Pinski


 Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the
 directory on the target, the host or the build machine?  And is that
 going to the host machine?  They are not the same.  One needs a directory
 on the host machine.


 I don't think this applies to my patch though, so are you still okay for my
 version to go in or is there something else I haven't considered?


 Thanks,

 Matt

 --
 Matthew Gretton-Dann
 Principal Engineer, PD Software - Tools, ARM Ltd


 --
 Matthew Gretton-Dann
 Principal Engineer, PD Software - Tools, ARM Ltd




Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 11:42 AM, Andrew Pinski wrote:
 Both will work in the case of build==host.  I don't even know if we
 really support build!=host testing at all.

Sure...  works just fine, last I knew.  Generally easy enough to fixup, if 
people get it wrong.

 I have never seen it done and I have no idea how to control it via dejagnu.  
 Has anyone tested
 build!=host recently?

Be curious to know if people do this anymore.  Host testing a lame OS, like 
MS-DOS...  was why it was put in.


[PATCH] Fix PR46556 (straight-line strength reduction, part 2)

2012-06-28 Thread William J. Schmidt
Here's a relatively small piece of strength reduction that solves that
pesky addressing bug that got me looking at this in the first place...

The main part of the code is the stuff that was reviewed last year, but
which needed to find a good home.  So hopefully that's in pretty good
shape.  I recast base_cand_map as an htab again since I now need to look
up trees other than SSA names.  I plan to put together a follow-up patch
to change code and commentary references so that base_name becomes
base_expr.  Doing that now would clutter up the patch too much.
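
To make the goal concrete, here is a hand-written C sketch of the kind of rewrite the new tests look for: three array references that each hide a multiply of n by the element size, and an equivalent form where the scaled index is computed once and added to p once, so each reference becomes a constant offset from a shared base.  The function names are made up for this example, and the second function only imitates at source level what the pass does on GIMPLE.

```c
#include <assert.h>
#include <stddef.h>

struct x
{
  int a[16];
  int b[16];
  int c[16];
};

/* As written: each reference implies its own n * sizeof (int).  */
static int
sum_direct (const struct x *p, unsigned int n)
{
  assert (n < 16);
  return p->a[n] + p->c[n] + p->b[n];
}

/* Shared-base form: one multiply, one add to p, then constant offsets.  */
static int
sum_shared_base (const struct x *p, unsigned int n)
{
  assert (n < 16);
  const char *base = (const char *) p + (size_t) n * sizeof (int);
  int a = *(const int *) (base + offsetof (struct x, a));
  int c = *(const int *) (base + offsetof (struct x, c));
  int b = *(const int *) (base + offsetof (struct x, b));
  return a + c + b;
}
```

The single multiply and single pointer addition in sum_shared_base correspond to what the dg-final scans in the tests below count in the dom2 dump.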

Bootstrapped and tested on powerpc64-linux-gnu with no new regressions.
Ok for trunk?

Thanks,
Bill


gcc:

PR tree-optimization/46556
* gimple-ssa-strength-reduction.c (enum cand_kind): Add CAND_REF.
(base_cand_map): Change to hash table.
(base_cand_hash): New function.
(base_cand_free): Likewise.
(base_cand_eq): Likewise.
(lookup_cand): Change base_cand_map to hash table.
(find_basis_for_candidate): Likewise.
(base_cand_from_table): Exclude CAND_REF.
(restructure_reference): New function.
(slsr_process_ref): Likewise.
(find_candidates_in_block): Call slsr_process_ref.
(dump_candidate): Handle CAND_REF.
(base_cand_dump_callback): New function.
(dump_cand_chains): Change base_cand_map to hash table.
(replace_ref): New function.
(replace_refs): Likewise.
(analyze_candidates_and_replace): Call replace_refs.
(execute_strength_reduction): Change base_cand_map to hash table.

gcc/testsuite:

PR tree-optimization/46556
* testsuite/gcc.dg/tree-ssa/slsr-27.c: New.
* testsuite/gcc.dg/tree-ssa/slsr-28.c: New.
* testsuite/gcc.dg/tree-ssa/slsr-29.c: New.


Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-dom2 } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+}
+
+/* { dg-final { scan-tree-dump-times \\* 4; 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 3 dom2 } } 
*/
+/* { dg-final { cleanup-tree-dump dom2 } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-dom2 } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n  12)
+foo (p->a[n], p->c[n], p->b[n]);
+  else if (n  3)
+foo (p->b[n], p->a[n], p->c[n]);
+}
+
+/* { dg-final { scan-tree-dump-times \\* 4; 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 9 dom2 } } 
*/
+/* { dg-final { cleanup-tree-dump dom2 } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-dom2 } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n  3)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n  12)
+   foo (p->b[n], p->a[n], p->c[n]);
+}
+}
+
+/* { dg-final { scan-tree-dump-times \\* 4; 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */
+/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 9 dom2 } } 
*/
+/* { dg-final { cleanup-tree-dump dom2 } } */
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 189025)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -32,7 +32,7 @@ along with GCC; see the file COPYING3.  If not see
2) Explicit multiplies, unknown constant multipliers,
   no conditional increments. (data gathering complete,
   replacements pending)
-   3) Implicit multiplies in addressing expressions. (pending)
+   3) Implicit multiplies in addressing expressions. (complete)
4) Explicit multiplies, conditional increments. (pending)
 
It would also be possible to apply strength 

Re: [wwwdocs] Update coding conventions for C++

2012-06-28 Thread Lawrence Crowl
On 6/27/12, Lawrence Crowl cr...@google.com wrote:
 ..., does anyone object to removing the permission to use C++
 streams?

Having heard no objection, I removed the permission.

The following patch is the current state of the changes.  Since the
discussion appears to have died down, can I commit this patch?

BTW, as before, I have removed the html tags from this patch,
as they cause the mail server to reject the patch.


Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.66
diff -u -u -r1.66 codingconventions.html
--- htdocs/codingconventions.html   19 Feb 2012 00:45:34 -  1.66
+++ htdocs/codingconventions.html   28 Jun 2012 22:03:38 -
@@ -15,8 +19,73 @@
 code to follow these conventions, it is best to send changes to follow
 the conventions separately from any other changes to the code./p

+ul
+lia href=#DocumentationDocumentation/a/li
+lia href=#ChangeLogsChangeLogs/a/li
+lia href=#PortabilityPortability/a/li
+lia href=#MakefilesMakefiles/a/li
+lia href=#TestsuiteTestsuite Conventions/a/li
+lia href=#DiagnosticsDiagnostics Conventions/a/li
+lia href=#SpellingSpelling, terminology and markup/a/li
+lia href=#CandCxxC and C++ Language Conventions/a
+ul
+lia href=#C_OptionsCompiler Options/a/li
+lia href=#C_LanguageLanguage Use/a
+ul
+lia href=#AssertionsAssertions/a/li
+lia href=#CharacterCharacter Testing/a/li
+lia href=#ErrorError Node Testing/a/li
+lia href=#GeneratedParameters Affecting Generated Code/a/li
+lia href=#C_InliningInlining Functions/a/li
+/ul
+/li
+lia href=#C_FormattingFormatting Conventions/a
+ul
+lia href=#LineLine Length/a/li
+lia href=#C_NamesNames/a/li
+lia href=#ExpressionsExpressions/a/li
+/ul
+/li
+/ul
+/li
+lia href=#Cxx_ConventionsC++ Language Conventions/a
+ul
+lia href=#Cxx_LanguageLanguage Use/a
+ul
+lia href=#VariableVariable Definitions/a/li
+lia href=#Struct_UseStruct Definitions/a/li
+lia href=#Class_UseClass Definitions/a/li
+lia href=#ConstructorsConstructors and Destructors/a/li
+lia href=#ConversionsConversions/a/li
+lia href=#Over_FuncOverloading Functions/a/li
+lia href=#Over_OperOverloading Operators/a/li
+lia href=#DefaultDefault Arguments/a/li
+lia href=#Cxx_InliningInlining Functions/a/li
+lia href=#Template_UseTemplates/a/li
+lia href=#Namespace_UseNamespaces/a/li
+lia href=#RTTIRTTI and codedynamic_cast/code/a/li
+lia href=#CastsOther Casts/a/li
+lia href=#ExceptionsExceptions/a/li
+lia href=#Standard_LibraryThe Standard Library/a/li
+/ul
+/li
+lia href=#Cxx_FormattingFormatting Conventions/a
+ul
+lia href=#Cxx_NamesNames/a/li
+lia href=#Struct_FormStruct Definitions/a/li
+lia href=#Class_FormClass Definitions/a/li
+lia href=#Member_FormClass Member Definitions/a/li
+lia href=#Template_FormTemplates/a/li
+lia href=#ExternCExtern C/a/li
+lia href=#Namespace_FormNamespaces/a/li
+/ul
+/li
+/ul
+/li
+/ul

-h2Documentation/h2
+
+h2a name=DocumentationDocumentation/a/h2

 pDocumentation, both of user interfaces and of internals, must be
 maintained and kept up to date.  In particular:/p
@@ -43,7 +112,7 @@
 /ul


-h2ChangeLogs/h2
+h2a name=ChangeLogsChangeLogs/a/h2

 pGCC requires ChangeLog entries for documentation changes; for the web
 pages (apart from codejava//code and codelibstdc++//code) the CVS
@@ -71,20 +140,40 @@
 codejava/58/code is the actual number of the PR) at the top
 of the ChangeLog entry./p

-h2Portability/h2
+h2a name=PortabilityPortability/a/h2

 pThere are strict requirements for portability of code in GCC to
-older systems whose compilers do not implement all of the ISO C standard.
-GCC requires at least an ANSI C89 or ISO C90 host compiler, and code
-should avoid pre-standard style function definitions, unnecessary
-function prototypes and use of the now deprecated @code{PARAMS} macro.
+older systems whose compilers do not implement all of the
+latest ISO C and C++ standards.
+/p
+
+p
+The directories
+codegcc/code, codelibcpp/code and codefixincludes/code
+may use C++03.
+They may also use the codelong long/code type
+if the host C++ compiler supports it.
+These directories should use reasonably portable parts of C++03,
+so that it is possible to build GCC with C++ compilers other than GCC itself.
+If testing reveals that
+reasonably recent versions of non-GCC C++ compilers cannot compile GCC,
+then GCC code should be adjusted accordingly.
+(Avoiding unusual language constructs helps immensely.)
+Furthermore,
+these directories emshould/em also be compatible with C++11.
+/p
+
+p
+The directories libiberty and libdecnumber must use C
+and 

Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Mike Stump mikest...@comcast.net wrote:

 On Jun 28, 2012, at 4:39 AM, Alexandre Oliva aol...@redhat.com wrote:
 That still doesn't sound right to me: why should the compiler refrain
 from using a perfectly functional linker plugin on the machine where
 it's installed (not where it's built)?

 See your point below for one reason.

My point below suggests a reason for us to *verbosely* indicate the
change, e.g., in the test command line, like my patch does.

 The next would be because it would be a speed hit to re-check at
 runtime the qualities of the linker and do something different.

But then, our testsuite *does* re-check at runtime, but without my
patch, we're not fully using the result of the test.

 If the system had an architecture to avoid the speed hit and people
 wanted to do the work to support the runtime reconfigure, that'd be
 fine with me.

Me too, but I'm not arguing for or against that.  I'm just arguing for a
change to the test harness that will use the result of the dynamic test,
and verbosely so.

 Also, this scenario of silently deciding whether or not to use the
 linker plugin could bring us to different test results for the same
 command lines.  I don't like that.

 Right, which is why the static configuration of the host system at
 build time is forever after an invariant.

That doesn't even match *current* reality.

We can run the testsuite on a machine that's neither the build system
nor the run-time target.  That's presumably why the test harness tests
whether the plugin works.  And that's one reason why we should use that
result instead of letting the compiler override it.

 The linker is smelled, it doesn't support plugins, therefore we can't
 ever use it, therefore we never build it...

'cept even in the build system it *does* support plugins, so it's just
reasonable for us to build the plugin, and for the compiler to expect to
be able to use it.

Now, this will work just fine if the compiler is installed on a system
that matches the host=target (i.e., native compiler) triplet specified
when building the compiler.  It might not work on the build machine, but
that's irrelevant, for we're not supposed to be able to use the compiler
on the build machine.  It might not work on the test machine, and that's
why the test harness tests for plugin support.  But the test harness
doesn't communicate back to the compiler its findings without my patch,
so if the test system doesn't happen to support plugins, we'd get tons
of pointless failures.

If we change the compiler configuration so that it disables the plugin
just because it guesses some potential incompatibility between the
linker and the plugin we're about to build, we'll lose features and
testing.

If we change the compiler to detect it dynamically, we'll get ambiguous
test results.  “did this -flto test use the plugin or not?”

Why would you want any of the scenarios in the two paragraphs above?

If you wouldn't, what do you have against the patch that complements the
plugin detection on the test machine in the test harness?
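
As a rough illustration of what "use the result of the dynamic test, and
verbosely so" could look like, here is a minimal sketch.  It is not the
patch itself: the function name and the boolean input are hypothetical,
and the real harness is written in Tcl, not C++.  The sketch assumes the
harness has already probed the test machine and reports the outcome by
putting GCC's -fno-use-linker-plugin option on the test command line.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build the options for an LTO test from the result
// of a plugin probe that the harness ran on the test machine.
std::vector<std::string> lto_test_options(bool plugin_works_on_test_machine) {
  std::vector<std::string> opts;
  opts.push_back("-flto");
  if (!plugin_works_on_test_machine) {
    // Assumption: the disabled plugin is spelled out explicitly, so the
    // logged command line shows which configuration was actually tested.
    opts.push_back("-fno-use-linker-plugin");
  }
  return opts;
}
```

With this shape, two runs that differ in plugin use also differ in their
logged command lines, which is the property argued for above.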

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [Patch, libgfortran] Add FPU Support for powerpc

2012-06-28 Thread Steven Bosscher
On Tue, May 22, 2012 at 3:45 AM, rbmj r...@verizon.net wrote:
 Hi everyone,

 This patch adds FPU support for powerpc on platforms that do not have glibc.
  It is basically the same code as glibc has.  The motivation for this was
 that right now there is no fpu-target.h that works for powerpc-*-vxworks.

 Again, 90% of this code comes directly from glibc.  But on vxworks targets
 there is no glibc.

 I also patched the configure.host script in order to add this in.

 Any opinions?

Since AFAICT nobody has responded...

I suppose this is something you need, or you would probably not be
working on it. I wouldn't have thought of VxWorks as an obvious target
platform for a Fortran compiler. :-)

The copying of the code from glibc (LGPL code) to libgfortran
(GPL+exception) is something that you probably need permission for
from the FSF. For the VxWorks specific bits, you could poke the only
listed VxWorks maintainer in MAINTAINERS (hi Nathan!).

For the configure.host bits,

+  powerpc)

Not powerpc64? Or at least powerpc|ppc?
IIUC this test is overridden for powerpc-linux by a glibc test
following your new code, right? What happens for e.g. powerpc-aix?
Shouldn't your test also be conditional on have_feenableexcept?

Ciao!
Steven


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Bernhard Reutner-Fischer
Rehi Janis,

Good to see you active again :)

Perhaps you want to pursue this? We'd need to suggest this to dejagnu,
have it in a release and bump the minimum required deja version of gcc.
So it may take time but IMO would be a worthwhile cleanup.
Or do you see a better way to handle this properly?

The first patch below is the dejagnu part, the other patch is the
corresponding follow-up for gcc.

cheers,
Bernhard

On Fri, Mar 16, 2012 at 03:59:58PM +0100, Bernhard Reutner-Fischer wrote:
On Fri, Mar 16, 2012 at 11:04:45AM +0100, Bernhard Reutner-Fischer wrote:

The underlying problem is that dejagnu's runtest.exp only allows for a
single libdir where it searches for includes -- see comment in
libgomp.exp and libitm.exp

While just adding more and more load_gcc_lib calls to users outside of
gcc/ is the easy way out, it is (IMHO) error prone (I ran make check
just in gcc and not in toplevel, fixed my script now).

It would be desirable if dejagnu would just find all the currently
load_gcc_lib'ed files on its own, via load_lib.
One could
- teach dejagnu to treat libdir as a list of paths

The attached works for me for a toplevel make -k check (double-checked
with individual make check in lib{gomp,itm}). I do not intend to pursue
this any further.

runtest.exp: add libdirs list for load_lib()

libgomp wants to load .exp files from ../gcc/testsuite/lib.
Instrument load_lib to be able to find the files.
Previously we used to have a helper proc that had to first load all
dependent .exp manually and then, again manually, the desired .exp.

2012-03-16  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   * runtest.exp (libdirs): New global list.
   (load_lib): Append libdirs to search_and_load_files directories.

diff --git a/runtest.exp b/runtest.exp
index 4bfed83..8e6a7de 100644
--- a/runtest.exp
+++ b/runtest.exp
@@ -589,7 +589,7 @@ proc lookfor_file { dir name } {
 # source tree, (up one or two levels), then in the current dir.
 #
 proc load_lib { file } {
-global verbose libdir srcdir base_dir execpath tool
+global verbose libdir libdirs srcdir base_dir execpath tool
 global loaded_libs
 
 if {[info exists loaded_libs($file)]} {
@@ -597,8 +597,11 @@ proc load_lib { file } {
 }
 
 set loaded_libs($file) 
-
-if { [search_and_load_file library file $file [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib]] == 0 } {
+set search_dirs [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib]
+if {[info exists libdirs]} {
+set search_dirs [concat $search_dirs $libdirs]
+}
+if { [search_and_load_file library file $file $search_dirs ] == 0 } {
   send_error ERROR: Couldn't find library file $file.\n
   exit 1
 }
@@ -652,6 +655,8 @@ set libdir   [file dirname $execpath]/dejagnu
 if {[info exists env(DEJAGNULIBS)]} {
 set libdir $env(DEJAGNULIBS)
 }
+# list of extra directories for load_lib
+set libdirs {}
 
 verbose Using $libdir to find libraries
 

libgomp/ChangeLog

2012-03-16  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   * testsuite/lib/libgomp.exp: Set libdirs. Remove now redundant
   manual inclusion of gfortran-dg's dependencies.

libitm/ChangeLog

2012-03-16  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   * testsuite/lib/libitm.exp: Set libdirs. Remove now redundant
   manual inclusion of gcc-dg's dependencies.


diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 02909f8..54e1e652 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -1,32 +1,12 @@
-# Damn dejagnu for not having proper library search paths for load_lib.
-# We have to explicitly load everything that gcc-dg.exp wants to load.
+global libdirs
+lappend libdirs $srcdir/../../gcc/testsuite/lib
 
-proc load_gcc_lib { filename } {
-global srcdir loaded_libs
+load_lib dg.exp
 
-load_file $srcdir/../../gcc/testsuite/lib/$filename
-set loaded_libs($filename) 
-}
+# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs!
+load_lib gcc-defs.exp
 
-load_lib dg.exp
-load_gcc_lib file-format.exp
-load_gcc_lib target-supports.exp
-load_gcc_lib target-supports-dg.exp
-load_gcc_lib scanasm.exp
-load_gcc_lib scandump.exp
-load_gcc_lib scanrtl.exp
-load_gcc_lib scantree.exp
-load_gcc_lib scanipa.exp
-load_gcc_lib prune.exp
-load_gcc_lib target-libpath.exp
-load_gcc_lib wrapper.exp
-load_gcc_lib gcc-defs.exp
-load_gcc_lib torture-options.exp
-load_gcc_lib timeout.exp
-load_gcc_lib timeout-dg.exp
-load_gcc_lib fortran-modules.exp
-load_gcc_lib gcc-dg.exp
-load_gcc_lib gfortran-dg.exp
+load_lib gfortran-dg.exp
 
 set dg-do-what-default run
 
diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp
index f322ed5..1ac8f31 

Fwd: [Bug debug/53754] [4.8 Regression][lto] ICE in lhd_decl_printable_name, at langhooks.c:222 (with -g)

2012-06-28 Thread Cary Coutant
[resending in plain text. Sorry, gmail defaulted to HTML.]

Ping. I'm not looking for commit approval yet, just advice on how
thorough we need to be to support -g and LTO together.

(What's the right way to send a patch to fix a PR? I'm not even sure
whether you were cc'ed on my response.)

-cary


-- Forwarded message --
From: ccoutant at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org
Date: Mon, Jun 25, 2012 at 2:19 PM
Subject: [Bug debug/53754] [4.8 Regression][lto] ICE in
lhd_decl_printable_name, at langhooks.c:222 (with -g)
To: ccout...@google.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53754

Cary Coutant ccoutant at gcc dot gnu.org changed:

          What    |Removed                     |Added

            Status|NEW                         |ASSIGNED
        AssignedTo|unassigned at gcc dot       |ccoutant at gcc dot gnu.org
                  |gnu.org                     |

--- Comment #4 from Cary Coutant ccoutant at gcc dot gnu.org
2012-06-25 21:19:17 UTC ---
Created attachment 27705
 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27705
Patch to fix ICE with -g -flto and anonymous namespace

 You can't delay producing pubnames this way with LTO.  Please fix.

The obvious problem is that we're calling langhooks.dwarf_name (in
gen_namespace_die) for an anonymous namespace, even with the default
-gno-pubnames. I can fix that by adding a check for want_pubnames just before
the call to add_pubname_string, as in the patch below. But this is still going
to ICE if you turn on -gpubnames with -flto. The only way I can think of to fix
that is to relax the assert in lhd_decl_printable_name, and just have it return an
empty string in the DECL_NAMELESS case. That will not produce the right results
for an anonymous namespace, but without front-end langhooks available to us
(and until we implement the lazy debug plan), how can we do better?

How much is expected to work today with LTO and -g? Aren't we still stuck with
calling langhooks from dwarf2out.c back-end routines? I can understand that we
don't want to ICE, but what guarantees do we make about debug info?

-cary

--
Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.





Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
 Perhaps you want to pursue this? We'd need to suggest this to dejagnu,

Actually, we have the technology, so that isn't necessary.  :-)  You can 
install replacements for any procs you want, not pretty, but... it does work.  
I think this is a more deterministic path forward than waiting for a mythical 
dejagnu release.  Also, we then can avoid the hassle of requiring a new dejagnu.


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Bernhard Reutner-Fischer
On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote:
On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
 Perhaps you want to pursue this? We'd need to suggest this to dejagnu,

Actually, we have the technology, so that isn't necessary.  :-)  You can 
install replacements for any procs you want, not pretty, but... it does work.  
I think this is a more deterministic path forward than waiting for a mythical 
dejagnu release.  Also, we then can avoid the hassle of requiring a new 
dejagnu.

Wouldn't that mean that we have to completely replace proc load_lib?
But anyway.
Mike, it would be nice if you could fix
+# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs!

if you did not do that already -- TIA :)

That's under the assumption that one should be able to use the major
lib/*exp without including their pre-requisites first.

cheers,


[testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors

2012-06-28 Thread Janis Johnson
Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of
target and xfail selectors in a way that would be nice if it worked,
but it doesn't.  Unfortunately the local code to override dg-error and
friends ignores errors, so directives with errors have been silently
skipped.  I plan to fix that after fixing the affected tests.

This patch causes the affected dg-message directives in this test to be
XFAIL'd everywhere, with a comment asking that when the test starts
passing on the relevant targets, the xfail be replaced with a target
list.  It also adds comments to the dg-message directives to make their
messages unique in the test summary.

Tested on i686-pc-linux-gnu; OK for trunk?

Janis
2012-06-28  Janis Johnson  jani...@codesourcery.com

* gcc.dg/Wstrict-aliasing-converted-assigned.c: Fix syntax
errors in dg-message directives, add comments.

Index: gcc.dg/Wstrict-aliasing-converted-assigned.c
===
--- gcc.dg/Wstrict-aliasing-converted-assigned.c(revision 189025)
+++ gcc.dg/Wstrict-aliasing-converted-assigned.c(working copy)
@@ -5,9 +5,12 @@
 int foo()
 {
   int i;
-  *(long*)i = 0;  /* { dg-warning type-punn } */
+  *(long*)i = 0;  /* { dg-warning type-punn type-punn } */
   return i;
 }
 
-/* { dg-message does break strict-aliasing  { target { *-*-*  lp64 } xfail *-*-* } 8 } */
-/* { dg-message initialized  { target { *-*-*  lp64 } xfail *-*-* } 8 } */
+/* These messages are only expected for lp64, but fail there.  When they
+   pass for lp64, replace xfail *-*-* with target lp64.  */
+
+/* { dg-message does break strict-aliasing break { xfail *-*-* } 8 } */
+/* { dg-message initialized init { xfail *-*-* } 8 } */


[testsuite] add required comments to dg-message directives in g++.dg

2012-06-28 Thread Janis Johnson
Several tests in g++.dg use dg-message with a target list and line
number but without the comment field, which is required when those
additional arguments are used.  The local replacement of dg-message
silently ignores errors (something I plan to fix), so the checks have
been ignored.  Unprocessed notes (as opposed to errors and warnings)
in compiler output are intentionally ignored, so this wasn't noticed
before.

This patch adds the required comments, and the tests now pass on
i686-pc-linux-gnu.  OK for trunk?

Janis
2012-06-28  Janis Johnson  jani...@codesourcery.com

* g++.dg/template/error46.C: Add missing comment to dg-message.
* g++.dg/template/crash107.C: Likewise.
* g++.dg/template/error47.C: Likewise.
* g++.dg/template/crash108.C: Likewise.
* g++.dg/overload/operator5.C: Likewise.

Index: g++.dg/template/error46.C
===
--- g++.dg/template/error46.C   (revision 189025)
+++ g++.dg/template/error46.C   (working copy)
@@ -8,4 +8,4 @@
 {
   foo(A0(), A1()); // { dg-error no matching }
 }
-// { dg-message candidate|parameter 'N' ('0' and '1') { target *-*-* } 9 }
+// { dg-message candidate|parameter 'N' ('0' and '1')  { target *-*-* } 9 }
Index: g++.dg/template/crash107.C
===
--- g++.dg/template/crash107.C  (revision 189025)
+++ g++.dg/template/crash107.C  (working copy)
@@ -14,7 +14,7 @@
 }
 };
 Vecdouble v(3,4,12); // { dg-error no matching }
-// { dg-message note { target *-*-* } 16 }
+// { dg-message note note { target *-*-* } 16 }
 Vecdouble V(12,4,3);  // { dg-error no matching }
-// { dg-message note { target *-*-* } 18 }
+// { dg-message note note { target *-*-* } 18 }
 Vecdouble c = v^V;   // { dg-message required }
Index: g++.dg/template/error47.C
===
--- g++.dg/template/error47.C   (revision 189025)
+++ g++.dg/template/error47.C   (working copy)
@@ -6,4 +6,4 @@
 {
   foo(0, p); // { dg-error no matching }
 }
-// { dg-message candidate|parameter 'T' ('int' and 'void*') { target *-*-* } 7 }
+// { dg-message candidate|parameter 'T' ('int' and 'void*')  { target *-*-* } 7 }
Index: g++.dg/template/crash108.C
===
--- g++.dg/template/crash108.C  (revision 189025)
+++ g++.dg/template/crash108.C  (working copy)
@@ -2,4 +2,4 @@
 
 templateclass T struct A {A(int b=k(0));}; // { dg-error arguments }
 void f(int k){Aint a;} // // { dg-error parameter|declared }
-// { dg-message note { target *-*-* } 3 }
+// { dg-message note note { target *-*-* } 3 }
Index: g++.dg/overload/operator5.C
===
--- g++.dg/overload/operator5.C (revision 189025)
+++ g++.dg/overload/operator5.C (working copy)
@@ -13,4 +13,4 @@
   const String b,
   bool ignoreCase) {
   return ignoreCase ? equalIgnoringCase(a, b) : (a == b); } // { dg-error ambiguous }
-// { dg-message note { target *-*-* } 15 }
+// { dg-message note note { target *-*-* } 15 }


[testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message

2012-06-28 Thread Janis Johnson
Test g++.dg/cpp0x/nullptr19.C contains the following:

char* k( char* );  /* { dg-message note } { dg-message note } */
nullptr_t k( nullptr_t ); /* { dg-message note } { dg-message note } */

Having two test directives on a line should have resulted in an ERROR
but the local replacement of dg-warning silently ignores errors
(something I plan to fix).  There are two notes for each of these lines,
identical but after different candidate lists.  Since they are identical
DejaGnu removes both of them after one has been processed, and there is
apparently no way to check for both of them.  At least with this patch
we'll correctly check for one for each line.

Tested on i686-pc-linux-gnu; OK for trunk?

Janis
2012-06-28  Janis Johnson  jani...@codesourcery.com

* g++.dg/cpp0x/nullptr19.C: Remove extra directives on same line.

Index: g++.dg/cpp0x/nullptr19.C
===
--- g++.dg/cpp0x/nullptr19.C(revision 189025)
+++ g++.dg/cpp0x/nullptr19.C(working copy)
@@ -5,8 +5,8 @@
 
 typedef decltype(nullptr) nullptr_t;
 
-char* k( char* );  /* { dg-message note } { dg-message note } */
-nullptr_t k( nullptr_t ); /* { dg-message note } { dg-message note } */
+char* k( char* );  /* { dg-message note } */
+nullptr_t k( nullptr_t ); /* { dg-message note } */
 
 void test_k()
 {


Re: [testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:57 PM, Janis Johnson wrote:
 Test g++.dg/cpp0x/nullptr19.c contains the following:

 OK for trunk?

Ok.


Re: [testsuite] add required comments to dg-message directives in g++.dg

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:56 PM, Janis Johnson wrote:
 Several tests in g++.dg use dg-message with a target list and line
 number but without the comment field, which is required when those
 additional arguments are used.

 OK for trunk?

Ok.


Re: [testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:55 PM, Janis Johnson wrote:
 Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of
 target and xfail selectors in a way that would be nice if it worked,

 OK for trunk?

Ok.  I prefer no spacing between the comment and the dg-message lines...  ok 
either way.


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:15 PM, Bernhard Reutner-Fischer wrote:
 On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote:
 On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
 Perhaps you want to pursue this? We'd need to suggest this to dejagnu,
 
 Actually, we have the technology, so that isn't necessary.  :-)  You can 
 install replacements for any procs you want, not pretty, but... it does 
 work.  I think this is a more deterministic path forward than waiting for a 
 mythical dejagnu release.  Also, we then can avoid the hassle of requiring a 
 new dejagnu.
 
 Wouldn't that mean that we have to completely replace proc load_lib?

Yes; worse, it is a cut-n-paste from dejagnu and can effectively rev lock us to 
the current dejagnu release...  One can delegate, but I don't think any pre or 
post processing in this case is enough to `fix' the issue, so it would be a 
wholesale replacement.

 But anyway.
 Mike, it would be nice if you could fix
 +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load 
 gcc-defs!

Sounds like a single line fix.  It is the testing of that fix that is the 
annoying part.


Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 10:26 AM, Janis Johnson wrote:
 No, there is no way to combine target and xfail,

Ah...  Grrr  I hate non-composability.  Given that, I think the original 
patch is fine, subject of course to the wants and wishes of vect people.


Ping: Reorganized documentation for warnings -- attempt 2

2012-06-28 Thread David Stone
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01208.html


Re: New option to turn off stack reuse for temporaries

2012-06-28 Thread Xinliang David Li
(re-post in plain text)

Moving this to cfgexpand time is simple and it can also be extended to
handle scoped variables. However Jakub raised a good point about this
being too late as stack space overlay is not the only way to cause
trouble when the lifetime of a stack object is extended beyond the
clobber stmt.
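
For readers following the thread, the lifetime rule at issue can be
sketched as follows.  This is only an illustration under stated
assumptions: Tracer and trace_temporary are hypothetical names, not
anything from the patch.  A C++ temporary is destroyed at the end of
its full expression; after that point the compiler may treat its stack
slot as dead, so code that keeps a pointer into the temporary relies on
behavior the language does not guarantee.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type that records when it is created and destroyed.
struct Tracer {
  std::vector<std::string> *log;
  explicit Tracer(std::vector<std::string> *l) : log(l) {
    log->push_back("ctor");
  }
  ~Tracer() { log->push_back("dtor"); }
  int value() const { return 42; }
};

// Returns the events observed around a single full expression.
std::vector<std::string> trace_temporary() {
  std::vector<std::string> log;
  // The Tracer temporary lives only until the end of this statement.
  int v = Tracer(&log).value();
  // By now the temporary is gone; its storage may legally be reused.
  log.push_back(v == 42 ? "after-expression" : "unexpected");
  return log;
}
```

The recorded order is constructor, destructor, then the statement after
the expression, which is why extending or sharing the temporary's slot
only matters to code that still holds its address at that point.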

thanks,

David

On Tue, Jun 26, 2012 at 1:28 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Mon, Jun 25, 2012 at 6:25 PM, Xinliang David Li davi...@google.com wrote:
 Are there any more concerns about this patch? If not, I'd like to check it 
 in.

 No - the fact that the flag is C++ specific but in common.opt is odd enough
 and -ftemp-reuse-stack sounds very very generic - which in fact it is not,
 it's a no-op in C.  Is there a more formal phrase for the temporary kind that
 is affected?  For me temp is synonymous to auto so I'd have expected
 the switch to turn off stack slot sharing for

  {
   int a[5];
  }
  {
   int a[5];
  }

 but that is not what it does.  So - a little kludgy but probably more to what
 I'd like it to be would be to move the option to c-family/c.opt enabled only
 for C++ and Obj-C++ and export it to the middle-end via a new langhook
 (the gimplifier code should be in Frontend code that lowers to GENERIC
 really and the WITH_CLEANUP_EXPR code should be C++ frontend specific ...).

 Thanks,
 Richard.

 thanks,

 David

 On Fri, Jun 22, 2012 at 8:51 AM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Jun 22, 2012 at 2:39 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Fri, Jun 22, 2012 at 11:29 AM, Jason Merrill ja...@redhat.com wrote:
 On 06/22/2012 01:30 AM, Richard Guenther wrote:

 What other issues? It enables more potential code motion, but on the
 other hand, causes more conservative stack reuse. As far as I can tell,
 the handling of temporaries is added independently after the clobber
 for scoped variables are introduced. This option can be used to
 restore the older behavior (in handling temps).


 Well, it does not really restore the old behavior (if you mean before
 adding
 CLOBBERS, not before the single patch that might have used those for
 gimplifying WITH_CLEANUP_EXPR).  You say it disables stack-slot sharing
 for those decls but it also does other things via side-effects of no
 longer
 emitting the CLOBBER.  I say it's better to disable the stack-slot
 sharing.


 The patch exactly restores the behavior of temporaries from before my 
 change
 to add CLOBBERs for temporaries.  The primary effect of that change was to
 provide stack-slot sharing, but if there are other effects they are 
 probably
 desirable as well, since the broken code depended on the old behavior.

 So you see it as workaround option, like -fno-strict-aliasing, rather than
 debugging aid?

 It can be used for both purposes -- if the violations are as pervasive
 as strict-aliasing cases (which looks to be the case).

 thanks,

 David


 Richard.

 Jason


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Fri, Jun 29, 2012 at 12:00:10AM +0200, Bernhard Reutner-Fischer wrote:
 Really both HI? If so optab2 could be removed from that fn altogether..

Of course, thanks for pointing that out.  I've additionally added a result
mode check (similar to what supportable_widening_operation does).
The reason for not using supportable_widening_operation is that it only
tests even/odd calls for reductions, while we can use them everywhere.

Committed as obvious.

2012-06-29  Jakub Jelinek  ja...@redhat.com

* tree-vect-stmts.c (vectorizable_operation): Check both
VEC_WIDEN_MULT_LO_EXPR and VEC_WIDEN_MULT_HI_EXPR optabs.
Verify that operand[0]'s mode is TYPE_MODE (wide_vectype).

--- gcc/tree-vect-stmts.c   (revision 189053)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -3504,14 +3504,19 @@ vectorizable_operation (gimple stmt, gim
{
  decl1 = NULL_TREE;
  decl2 = NULL_TREE;
- optab = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
+ optab = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR,
   vectype, optab_default);
  optab2 = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
vectype, optab_default);
  if (optab != NULL
   && optab2 != NULL
   && optab_handler (optab, vec_mode) != CODE_FOR_nothing
-  && optab_handler (optab2, vec_mode) != CODE_FOR_nothing)
+  && optab_handler (optab2, vec_mode) != CODE_FOR_nothing
+  && insn_data[optab_handler (optab, vec_mode)].operand[0].mode
+== TYPE_MODE (wide_vectype)
+  && insn_data[optab_handler (optab2,
+ vec_mode)].operand[0].mode
+== TYPE_MODE (wide_vectype))
{
  for (i = 0; i < nunits_in; i++)
sel[i] = !BYTES_BIG_ENDIAN + 2 * i;


Jakub
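
A scalar model may help when reading the selector in the hunk above.
This is a sketch under assumptions, not the vectorizer's code: the
function names are hypothetical, the element width is fixed at 16 bits,
and endianness is passed as a flag instead of coming from
BYTES_BIG_ENDIAN.  Each lane is multiplied into a double-width product,
the products are laid out as pairs of narrow halves, and the selector
!big_endian + 2 * i picks the high half of lane i.

```cpp
#include <cstddef>
#include <stdint.h>
#include <vector>

// High 16 bits of the full 32-bit product of two 16-bit values.
uint16_t mul_highpart(uint16_t a, uint16_t b) {
  // Widen first so the multiplication cannot overflow a signed int.
  uint32_t wide = static_cast<uint32_t>(a) * static_cast<uint32_t>(b);
  return static_cast<uint16_t>(wide >> 16);
}

// Same result, computed the way the permutation describes it: widen,
// view the products as narrow halves, then select one half per lane.
std::vector<uint16_t> mul_highpart_by_permute(const std::vector<uint16_t> &a,
                                              const std::vector<uint16_t> &b,
                                              bool big_endian) {
  std::vector<uint16_t> halves;
  for (std::size_t i = 0; i < a.size(); ++i) {
    uint32_t wide = static_cast<uint32_t>(a[i]) * static_cast<uint32_t>(b[i]);
    uint16_t lo = static_cast<uint16_t>(wide & 0xFFFFu);
    uint16_t hi = static_cast<uint16_t>(wide >> 16);
    // Little-endian stores the low half first, big-endian the high half.
    halves.push_back(big_endian ? hi : lo);
    halves.push_back(big_endian ? lo : hi);
  }
  std::vector<uint16_t> out;
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Mirrors sel[i] = !BYTES_BIG_ENDIAN + 2 * i from the patch context.
    std::size_t sel = (big_endian ? 0 : 1) + 2 * i;
    out.push_back(halves[sel]);
  }
  return out;
}
```

Both routes give the same per-lane result for either endianness, which
is the property the selector is meant to preserve.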