Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Uros Bizjak
Hello!

index 000..5375b61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr53623.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { x86_64-*-* } } } */
+/* { dg-options "-O2 -fdump-rtl-ree" } */

Please use:

/* { dg-do compile { target { ! ia32 } } } */

Uros.


Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jakub Jelinek
On Thu, Dec 19, 2013 at 09:57:36PM -0700, Jeff Law wrote:
   * ree.c (combine_set_extension): Handle case where source
   and destination registers in an extension insn are different.
   (combine_reaching_defs): Allow source and destination
   registers in extension to be different under limited
   circumstances.
   (add_removable_extension): Remove restriction that the
   source and destination registers in the extension are the
   same.
   (find_and_remove_re): Emit a copy from the extension's
   destination to its source after the defining insn if
   the source and destination registers are different.
 
 
 testsuite/
 
   * gcc.target/i386/pr53623.c: New test.

Thanks for working on this.  The only thing I'd worry about is
HARD_REGNO_NREGS > 1 registers, if the two hard regs might overlap.
Perhaps it is fine as is, and I don't know how many targets actually allow
partial overlap between multi-register REGs.  If you aren't sure this
would be handled properly, perhaps the check could go into
add_removable_extension: below the REG_P check, add
   !reg_overlap_mentioned_p (dest, XEXP (src, 0))
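
For illustration only (this is not GCC code), the partial-overlap concern can be modelled with plain register-number ranges.  The function names and the (regno, nregs) representation below are assumptions made for this sketch; in ree.c itself the check would be reg_overlap_mentioned_p, as suggested above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a hard register operand occupies the half-open
   range [regno, regno + nregs).  Return true if the two ranges share
   at least one hard register.  */
bool
hard_regs_overlap (unsigned int regno1, unsigned int nregs1,
                   unsigned int regno2, unsigned int nregs2)
{
  assert (nregs1 >= 1 && nregs2 >= 1);
  return regno1 < regno2 + nregs2 && regno2 < regno1 + nregs1;
}

/* Return true if the ranges overlap without being identical, which is
   the multi-register case the review comment worries about.  */
bool
hard_regs_partially_overlap (unsigned int regno1, unsigned int nregs1,
                             unsigned int regno2, unsigned int nregs2)
{
  bool identical = regno1 == regno2 && nregs1 == nregs2;
  return hard_regs_overlap (regno1, nregs1, regno2, nregs2) && !identical;
}
```

With single-register operands the two ranges are either identical or disjoint, so only operands spanning more than one hard register can hit the partial case.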

 diff --git a/gcc/ree.c b/gcc/ree.c
 index 9938e98..63ad86c 100644
 --- a/gcc/ree.c
 +++ b/gcc/ree.c
 @@ -282,9 +282,21 @@ static bool
  combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
  {
rtx orig_src = SET_SRC (*orig_set);
 -  rtx new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
rtx new_set;
  
 +  /* If the extension's source/destination registers are not the same
 + then we need to change the original load to reference the destination
 + of the extension.  Then we need to emit a copy from that destination
 + to the original destination of the load.  */
 +  rtx new_reg;
 +  bool copy_needed
 += REGNO (SET_DEST (PATTERN (cand->insn)))
 +  != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0));

Perhaps the right formatting here would be
  bool copy_needed
    = (REGNO (SET_DEST (PATTERN (cand->insn)))
       != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0)));
? ()s for emacs, and aligning != under REGNO.

 +  if (copy_needed)
 +new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (PATTERN (cand->insn))));

Too long line.

Looks good to me otherwise.

Jakub


Re: [PATCH] Ubsan load of bool/enum sanitization

2013-12-20 Thread Jakub Jelinek
On Thu, Dec 19, 2013 at 10:22:38PM -0700, Jeff Law wrote:
 +  *gsi = create_cond_insert_point (gsi, /*before_p=*/true,
 +   /*then_more_likely_p=*/false,
 +   /*create_then_fallthru_edge=*/true,
 +   then_bb, fallthru_bb);
 Ick (comments embedded in argument list).  Is there some compelling
 reason for those comments?

That style is used heavily e.g. in the C++ frontend; it is just an attempt
to make it slightly clearer what exactly you are passing.
But it isn't that important, so I'll remove it.

 OK with that trivial fix.

Thanks.

Jakub


[PING] RE: [PATCH] Vectorization for store with negative step

2013-12-20 Thread Bingfeng Mei
OK to commit? 

Thanks,
Bingfeng
-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
Behalf Of Bingfeng Mei
Sent: 18 December 2013 16:25
To: Jakub Jelinek
Cc: Richard Biener; gcc-patches@gcc.gnu.org
Subject: RE: [PATCH] Vectorization for store with negative step

Hi, Jakub,
Sorry for all the formatting issues. I haven't submitted a patch for a while :-).
Please find the updated patch. 

Thanks,
Bingfeng

-Original Message-
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: 18 December 2013 13:38
To: Bingfeng Mei
Cc: Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Vectorization for store with negative step

On Wed, Dec 18, 2013 at 01:31:05PM +, Bingfeng Mei wrote:
Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 206016)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2013-12-18  Bingfeng Mei  b...@broadcom.com
+
+   PR tree-optimization/59544
+* tree-vect-stmts.c (perm_mask_for_reverse): Move before

This should be a tab instead of 8 spaces.

+   vectorizable_store. (vectorizable_store): Handle negative step.

Newline and tab after store., rather than space.

Property changes on: gcc/testsuite/gcc.target/i386/pr59544.c
___
Added: svn:executable
   + *

Please don't add such bogus property.  Testcases aren't executable.

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 206016)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2013-12-18  Bingfeng Mei  b...@broadcom.com
+
+   PR tree-optimization/59544
+   * gcc.target/i386/pr59544.c: New test

Missing dot at the end of line.
+
 2013-12-16  Jakub Jelinek  ja...@redhat.com
 
PR middle-end/58956
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 206016)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -4859,6 +4859,25 @@ ensure_base_align (stmt_vec_info stmt_in
 }
 
 
+/* Given a vector type VECTYPE returns the VECTOR_CST mask that implements
+   reversal of the vector elements.  If that is impossible to do,
+   returns NULL.  */
+
+static tree
+perm_mask_for_reverse (tree vectype)
+{
+  int i, nunits;
+  unsigned char *sel;
+
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  sel = XALLOCAVEC (unsigned char, nunits);
+
+  for (i = 0; i < nunits; ++i)
+    sel[i] = nunits - 1 - i;
+
+  return vect_gen_perm_mask (vectype, sel);
+}
+
 /* Function vectorizable_store.
 
Check if STMT defines a non scalar data-ref (array/pointer/structure) that
@@ -4902,6 +4921,8 @@ vectorizable_store (gimple stmt, gimple_
   vec<tree> oprnds = vNULL;
   vec<tree> result_chain = vNULL;
   bool inv_p;
+  bool negative = false;
+  tree offset = NULL_TREE;
   vec<tree> vec_oprnds = vNULL;
   bool slp = (slp_node != NULL);
   unsigned int vec_num;
@@ -4976,16 +4997,38 @@ vectorizable_store (gimple stmt, gimple_
   if (!STMT_VINFO_DATA_REF (stmt_info))
 return false;
 
-  if (tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
-   ? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
-   size_zero_node) < 0)
+  negative = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
+? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
+size_zero_node) < 0;

The formatting looks wrong, do:
  negative
    = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
                            ? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
                            size_zero_node) < 0;
instead.

+  if (negative && ncopies > 1)
 {
   if (dump_enabled_p ())
 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "negative step for store.\n");
+ "multiple types with negative step.");
   return false;
 }
 
+  if (negative)
+{
+  gcc_assert (!grouped_store);
+  alignment_support_scheme = vect_supportable_dr_alignment (dr, false);
+  if (alignment_support_scheme != dr_aligned
+   && alignment_support_scheme != dr_unaligned_supported)

Lots of places where you use 8 spaces instead of tab, please fix.
+offset = size_int (-TYPE_VECTOR_SUBPARTS (vectype) + 1);
+
   if (store_lanes_p)
 aggr_type = build_array_type_nelts (elem_type, vec_num * nunits);
   else
@@ -5200,7 +5246,7 @@ vectorizable_store (gimple stmt, gimple_
dataref_ptr
  = vect_create_data_ref_ptr (first_stmt, aggr_type,
  simd_lane_access_p ? loop : NULL,
- NULL_TREE, dummy, gsi, ptr_incr,
+ offset, dummy, gsi, ptr_incr,
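
The negative-step handling in the patch above combines two pieces: the lane-reversal selector from perm_mask_for_reverse and the pointer offset of -TYPE_VECTOR_SUBPARTS + 1.  A minimal scalar sketch of that idea follows; build_reverse_selector and store_negative_step are names made up for this illustration, not functions in tree-vect-stmts.c, and the 64-lane limit is an arbitrary assumption.

```c
#include <assert.h>

/* Fill SEL with the lane-reversal selector sel[i] = nunits - 1 - i,
   the same values perm_mask_for_reverse computes in the patch.  */
void
build_reverse_selector (unsigned char *sel, int nunits)
{
  for (int i = 0; i < nunits; ++i)
    sel[i] = (unsigned char) (nunits - 1 - i);
}

/* Hypothetical scalar model of one vector store for a loop that writes
   p[0], p[-1], ..., p[-(nunits - 1)].  LANES holds the values in loop
   iteration order.  The store starts nunits - 1 elements below P and
   writes the permuted lanes in ascending address order.  */
void
store_negative_step (int *p, const int *lanes, int nunits)
{
  unsigned char sel[64];

  assert (nunits > 0 && nunits <= 64);
  build_reverse_selector (sel, nunits);

  int *base = p - (nunits - 1);   /* offset = -nunits + 1 */
  for (int i = 0; i < nunits; ++i)
    base[i] = lanes[sel[i]];
}
```

After the call, p[-k] holds lanes[k] for every k below nunits, which is what the scalar loop with step -1 would have produced.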
   

Re: [PATCH][1/3] Re-submission of Altera Nios II port, gcc parts

2013-12-20 Thread Bernd Schmidt
On 11/26/2013 07:45 AM, Chung-Lin Tang wrote:

 +(define_insn "movhi_internal"
 +  [(set (match_operand:HI 0 "nonimmediate_operand" "=m, r,r, r,r")
 +(match_operand:HI 1 "general_operand"   "rM,m,rM,I,J"))]

Didn't you say you'd removed the J alternative?

 +error ("only register based stack limit is supported");

This one also ought to be a sorry.

Other than these two, I think this can go in.


Bernd




Re: [PATCH] Time profiler - phase 2

2013-12-20 Thread Dominique Dhumieres
 Hello,
there's updated version of the patch.
 
 Tested on x86_64 with enable bootstrap.
 
 Martin

This caused pr59541.

TIA

Dominique


Re: [RFC/CFT] auto-wipe dump files [was: Re: [committed] Fix up bb-slp-31.c testcase]

2013-12-20 Thread Bernhard Reutner-Fischer
On Thu, Oct 31, 2013 at 09:39:11AM +0100, Jakub Jelinek wrote:
 On Thu, Oct 31, 2013 at 09:34:41AM +0100, Bernhard Reutner-Fischer wrote:
  The cleanup routine would currently run 7 regexes on the incoming
  compiler-flags which is supposedly pretty fast.
  But yes, we could as well key off scan-dump. If we do that, i'd
  suggest to simply wipe all potential dumps, regardless of the family
  etc, like:
  $ltrans\[0-9\]\[0-9\]\[0-9\][itr].*
  What do you think?
 
 Many tests (e.g. in gcc.dg/vect/) pass -fdump-* flags and require cleanups,
 even if they don't have any scan directives.

Right.

Mike, attaching a new, slightly simplified patch.

I have neither the time nor the interest to pursue this any further.
Ok for trunk at this stage?

Otherwise i would suggest to either drop this idea altogether or that
you take over if you think this is a nice thing to have for the next
stage1.

On top of this patch you would have to run:

git grep -l -E '(cleanup-.*-dump|cleanup-saved-temps)' | egrep -v '(ChangeLog|/lib/)' | sed -e 's|[^/]*$||' | sort | uniq | \
while read d;
do
  find "$d" -type f -exec sed -i -e '/cleanup-[^-]*[-]*dump/d;/cleanup-saved-temps/d' {} +
done

diffstat -s:
 4874 files changed, 111 insertions(+), 5099 deletions(-)

Tested the same way as the initial incarnation against r205304 with no
regressions.
The ChangeLogs remain the same, i.e.:

gcc/testsuite/ChangeLog

2013-10-12  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* lib/gcc-dg.exp (cleanup-ipa-dump, cleanup-rtl-dump,
cleanup-tree-dump, cleanup-dump): Remove. Adjust all callers.
(schedule-cleanups): New proc.
(gcc-dg-test-1): Call it.
* lib/profopt.exp (profopt-execute): Likewise.
* g++.dg/cdce3.C: Adjust expected line numbers.
* gcc.dg/cdce1.c: Likewise.
* gcc.dg/cdce2.c: Likewise.
* gcc.dg/strlenopt-22.c: Fix comment delimiter.
* gcc.dg/strlenopt-24.c: Likewise.
* gcc.dg/tree-ssa/vrp26.c: Likewise.
* gcc.dg/tree-ssa/vrp28.c: Likewise.
* obj-c++.dg/encode-2.mm: Likewise.
* lib/dg-pch.exp (pch-init): Remove pch-check objects.

libgomp/ChangeLog

2013-10-12  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* testsuite/libgomp.graphite/bounds.c: Adjust for
cleanup-tree-dump removal.
* testsuite/libgomp.graphite/force-parallel-1.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-2.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-3.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-4.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-5.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-6.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-7.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-8.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-9.c: Likewise.
* testsuite/libgomp.graphite/pr41118.c: Likewise.


gcc/ChangeLog

2013-10-12  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* config/arm/neon-testgen.ml (emit_epilogue): Remove manual call
to cleanup-saved-temps.

gcc/doc/ChangeLog

2013-10-12  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* doc/sourcebuild.texi (Clean up generated test files): Expand
introduction.
(cleanup-ipa-dump, cleanup-rtl-dump, cleanup-tree-dump,
cleanup-saved-temps): Remove.

PS: As you can see, i had to touch a couple of files that had broken
comments like the vrp and strlenopt files. Should the testsuite
add -Wcomment per default?

cheers,
From ac5690774eb2134b063464be56bfc56826305d01 Mon Sep 17 00:00:00 2001
From: Bernhard Reutner-Fischer rep.dot@gmail.com
Date: Fri, 18 Oct 2013 21:08:49 +0200
Subject: [PATCH] auto-wipe dump files

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 gcc/config/arm/neon-testgen.ml|   1 -
 gcc/doc/sourcebuild.texi  |  19 ++---
 gcc/testsuite/g++.dg/cdce3.C  |   5 +-
 gcc/testsuite/gcc.dg/cdce1.c  |   3 +-
 gcc/testsuite/gcc.dg/cdce2.c  |   3 +-
 gcc/testsuite/gcc.dg/strlenopt-22.c   |   3 +-
 gcc/testsuite/gcc.dg/strlenopt-24.c   |   3 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp26.c |   3 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp28.c |   3 +-
 gcc/testsuite/lib/dg-pch.exp  |   2 +
 gcc/testsuite/lib/gcc-dg.exp  | 134 +++---
 gcc/testsuite/lib/profopt.exp |   3 +
 gcc/testsuite/obj-c++.dg/encode-2.mm  |   3 +-
 13 files changed, 111 insertions(+), 74 deletions(-)

diff --git a/gcc/config/arm/neon-testgen.ml b/gcc/config/arm/neon-testgen.ml
index 543318b..4734ac0 100644
--- a/gcc/config/arm/neon-testgen.ml
+++ b/gcc/config/arm/neon-testgen.ml
@@ -139,7 +139,6 @@ let emit_epilogue chan features regexps =
  else
()
 );
-Printf.fprintf chan "/* { dg-final { cleanup-saved-temps } } */\n"
 
 (* Check a list of C types to determine which ones are pointers and which
ones are const.  *)
diff 

RE: Two build != host fixes

2013-12-20 Thread Bernd Edlinger

 Date: Fri, 20 Dec 2013 07:57:02 +1030
 From: amo...@gmail.com
 To: bernd.edlin...@hotmail.de
 CC: gcc-patches@gcc.gnu.org; ja...@redhat.com; d...@redhat.com; 
 ebotca...@adacore.com
 Subject: Re: Two build != host fixes

 On Thu, Dec 19, 2013 at 11:50:02AM +0100, Bernd Edlinger wrote:
 Isn't the actual invocation of the build-g++ also including 
 /sysroot_for_host/include
 in that case? Why doesn't this cause problems then?

 Yes, and that causes failures too. BUILD_CPPFLAGS is the culprit.
 See http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01149.html

 --
 Alan Modra
 Australia Development Lab, IBM

Ok, now I understand:
The change with  GMPINC= is just incomplete, without the other patch.

When I apply the other patch too, I get these (obviously cleaner) build-g++ 
invocations:
g++ -c -DIN_GCC -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.9-20131215/gcc 
-I../../gcc-4.9-20131215/gcc/build -I../../gcc-4.9-20131215/gcc/../include 
-I../../gcc-4.9-20131215/gcc/../libcpp/include \
    -o build/gengtype.o ../../gcc-4.9-20131215/gcc/gengtype.c
flex  -ogengtype-lex.c ../../gcc-4.9-20131215/gcc/gengtype-lex.l && { \
  echo '#include "bconfig.h"' > gengtype-lex.c.tmp; \
  cat gengtype-lex.c >> gengtype-lex.c.tmp; \
  mv gengtype-lex.c.tmp gengtype-lex.c; \
    }
g++ -c -DIN_GCC -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.9-20131215/gcc 
-I../../gcc-4.9-20131215/gcc/build -I../../gcc-4.9-20131215/gcc/../include 
-I../../gcc-4.9-20131215/gcc/../libcpp/include \
    -o build/gengtype-lex.o gengtype-lex.c
g++ -c -DIN_GCC -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.9-20131215/gcc 
-I../../gcc-4.9-20131215/gcc/build -I../../gcc-4.9-20131215/gcc/../include 
-I../../gcc-4.9-20131215/gcc/../libcpp/include \
    -o build/gengtype-parse.o 
../../gcc-4.9-20131215/gcc/gengtype-parse.c
g++ -c -DIN_GCC -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.9-20131215/gcc 
-I../../gcc-4.9-20131215/gcc/build -I../../gcc-4.9-20131215/gcc/../include 
-I../../gcc-4.9-20131215/gcc/../libcpp/include \
    -o build/gengtype-state.o 
../../gcc-4.9-20131215/gcc/gengtype-state.c


Regards
Bernd.

Re: Improving mklog [was: Re: RFC Asan instrumentation control]

2013-12-20 Thread Yury Gribov

Ultimately, mklog ought to write the ChangeLog itself.
We get rid of that headache, at least.


How about this then? Updated mklog now adds 'New file'/'New 
test'/'Remove' when necessary.


I did some tests with unified/context-diffed SVN and git and it worked 
as expected. I can do more testing if necessary.


-Y
diff --git a/contrib/mklog b/contrib/mklog
index d3f044e..fb0514f 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -43,7 +43,7 @@ chdir $gcc_root;
 # Program starts here. You should not need to edit anything below this
 # line.
 #-
-if ( $#ARGV != 0 ) {
+if ($#ARGV != 0) {
 $prog = `basename $0`; chop ($prog);
 print "usage: $prog file.diff\n\n";
 print "Adds a ChangeLog template to the start of file.diff\n";
@@ -56,40 +56,76 @@ $dir = `dirname $diff`; chop ($dir);
 $basename = `basename $diff`; chop ($basename);
 $hdrline = "$date  $name  <$addr>";
 
-my %cl_entries;
+sub get_clname ($) {
+	return ('ChangeLog', $_[0]) if ($_[0] !~ /[\/\\]/);
 
-sub get_clname($) {
 	my $dirname = $_[0];
 	while ($dirname) {
 		my $clname = "$dirname/ChangeLog";
 		if (-f $clname) {
-			my $filename_rel = substr ($_[0], length ($dirname) + 1);
-			return ($filename_rel, $clname);
+			my $relname = substr ($_[0], length ($dirname) + 1);
+			return ($clname, $relname);
 		} else {
 			$dirname =~ s/[\/\\]?[^\/\\]*$//;
 		} 
 	}
-	return ($_[0], 'Unknown Changelog');
+
+	return ('Unknown ChangeLog', $_[0]);
+}
+
+sub remove_suffixes ($) {
+	my $filename = $_[0];
+	$filename =~ s/^[ab]\///;
+	$filename =~ s/\.jj$//;
+	return $filename;
 }
 
 # For every file in the .diff print all the function names in ChangeLog
 # format.
-$bof = 0;
+%cl_entries = ();
+$change_msg = undef;
+$look_for_funs = 0;
 $clname = get_clname('');
 open (DFILE, $diff) or die "Could not open file $diff for reading";
 while (<DFILE>) {
-# Check if we found a new file.
-if (/^\+\+\+ (b\/)?(\S+)/) {
+# Stop processing functions if we found a new file
+	# Remember both left and right names because one may be /dev/null.
+if (/^[+*][+*][+*] +(\S+)/) {
+		$left = remove_suffixes ($1);
+		$look_for_funs = 0;
+	}
+if (/^--- +(\S+)?/) {
+		$right = remove_suffixes ($1);
+		$look_for_funs = 0;
+	}
+
+	# Check if the body of diff started.
+	# We should now have both left and right name,
+	# so we can decide filename.
+
+if ($left && (/^\*{15}$/ || /^@@ /)) {
 	# If we have not seen any function names in the previous file (ie,
-	# $bof == 1), we just write out a ':' before starting the next
+	# $change_msg is empty), we just write out a ':' before starting the next
 	# file.
-	if ($bof == 1) {
-		$cl_entries{$clname} .= ":\n";
+	if ($clname) {
+		$cl_entries{$clname} .= $change_msg ? $change_msg : ":\n";
+	}
+
+	if ($left eq $right) {
+		$filename = $left;
+	} elsif($left eq '/dev/null') {
+		$filename = $right;
+	} elsif($right eq '/dev/null') {
+		$filename = $left;
+	} else {
+		print STDERR "Error: failed to parse diff for $left and $right\n";
+		exit 1;
 	}
-	$filename = $2;
-	($filename_rel, $clname) = get_clname ($filename);
-	$cl_entries{$clname} .= "\t* $filename_rel";
-	$bof = 1;
+	$left = $right = undef;
+	($clname, $relname) = get_clname ($filename);
+	$cl_entries{$clname} .= "\t* $relname";
+	$change_msg = '';
+	$look_for_funs = $filename =~ '\.(c|cpp|C|cc|h|inc|def)$';
 }
 
 # Remember the last line in a unified diff block that might start
@@ -98,6 +134,22 @@ while (DFILE) {
 $save_fn = $1;
 }
 
+# Check if file is newly added.
+# Two patterns: for context and unified diff.
+if (/^\*\*\* 0 \*\*\*\*/
+|| /^@@ -0,0 \+1.* @@/) {
+$change_msg = $filename =~ /testsuite.*(?!\.exp)$/ ? ": New test.\n" : ": New file.\n";
+$look_for_funs = 0;
+}
+
+# Check if file was removed.
+# Two patterns: for context and unified diff.
+if (/^--- 0 /
+|| /^@@ -1.* \+0,0 @@/) {
+$change_msg = ": Remove.\n";
+$look_for_funs = 0;
+}
+
 # If we find a new function, print it in brackets.  Special case if
 # this is the first function in a file.  
 #
@@ -110,10 +162,11 @@ while (DFILE) {
 # The fourth pattern looks for the starts of functions or classes
 # within a unified diff block.
 
-if (/^\*\*\*\*\*\** ([a-zA-Z0-9_].*)/
+if ($look_for_funs
+	&& (/^\*\*\*\*\*\** ([a-zA-Z0-9_].*)/
 || /^[\-\+\!] ([a-zA-Z0-9_]+)[ \t]*\(.*/
 	|| /^@@ .* @@ ([a-zA-Z0-9_].*)/
-	|| /^[-+ ](\{)/)
+	|| /^[-+ ](\{)/))
   {
 	$_ = $1;
 	my $fn;
@@ -138,25 +191,24 @@ while (DFILE) {
 	# If this is the first function in the file, we display it next
 	# to the filename, so we need an extra space before the opening
 	# brace.
-	if ($bof) {
-		$cl_entries{$clname} .= " ";
-		$bof = 0;
+	if (!$change_msg) {
+		$change_msg .= " ";
 	} else {
-		$cl_entries{$clname} .= "\t";
+		$change_msg .= "\t";
 	}
 
-		$cl_entries{$clname} .= "($fn):\n";
+		
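
The file-name selection that the updated mklog performs on the two header names of a diff can be sketched outside Perl as well.  The C version below is only an illustration of that rule; strip_git_prefix and pick_filename are hypothetical names, and unlike the patch's remove_suffixes the prefix helper does not drop a trailing ".jj".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skip a leading "a/" or "b/" prefix, as found in git-style diffs.  */
const char *
strip_git_prefix (const char *name)
{
  assert (name != NULL);
  if ((name[0] == 'a' || name[0] == 'b') && name[1] == '/')
    return name + 2;
  return name;
}

/* Pick the file name for a ChangeLog entry from the two header names.
   One side may be /dev/null for added or removed files.  Returns NULL
   when the names disagree and neither is /dev/null, which the patch
   treats as a parse error.  */
const char *
pick_filename (const char *left, const char *right)
{
  if (strcmp (left, right) == 0)
    return left;
  if (strcmp (left, "/dev/null") == 0)
    return right;
  if (strcmp (right, "/dev/null") == 0)
    return left;
  return NULL;
}
```

Returning NULL here stands in for the script's "failed to parse diff" error exit.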

Re: [PATCH] fixincludes: use $(FI) instead of fixincl@EXEEXT@

2013-12-20 Thread Bernhard Reutner-Fischer
On 8 November 2013 17:28, Bruce Korb bk...@gnu.org wrote:
 Sure.  Looks good to me.  Thanks

pushed as r206146
thanks,

 On Fri, Nov 8, 2013 at 2:57 AM, Bernhard Reutner-Fischer
 rep.dot@gmail.com wrote:
 On 4 April 2013 22:20, Bruce Korb bk...@gnu.org wrote:
 Except as noted below, fine by me.

 On 04/04/13 12:56, Bernhard Reutner-Fischer wrote:
 Bootstrapped and regtested on x86_64-unknown-linux-gnu and
 x86_64-mine-linux-uclibc without regressions, ok for trunk?

 fixincludes/ChangeLog:

 2013-04-04  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   Makefile.in: Use $(FI) instead of fixincl@EXEEXT@.
   Cleanup whitespace while at it.

 Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
 ---
  fixincludes/Makefile.in |   10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

 diff --git a/fixincludes/Makefile.in b/fixincludes/Makefile.in
 index ce850ff..3dc869d 100644
 --- a/fixincludes/Makefile.in
 +++ b/fixincludes/Makefile.in
 @@ -131,7 +131,7 @@ fixinc.sh : fixinc.in mkfixinc.sh Makefile
  $(srcdir)/fixincl.x: @MAINT@ fixincl.tpl inclhack.def
   cd $(srcdir) ; $(SHELL) ./genfixes

 -mostlyclean :
 +mostlyclean:

 I see no reason for changing things.

 dropped this hunk.

 But if you are going to clean up the colons, then they should
 be in columns (mostly 12 or 16).  This one is already in 12.

   rm -f *.o *-stamp $(AF) $(FI) *~ fixinc.sh

  clean: mostlyclean
 @@ -179,18 +179,18 @@ check : all

  install : all
   -rm -rf $(DESTDIR)$(itoolsdir)
 - $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
 + $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
   $(mkinstalldirs) $(DESTDIR)$(itoolsdatadir)/include
   $(INSTALL_DATA) $(srcdir)/README-fixinc \
 $(DESTDIR)$(itoolsdatadir)/include/README
   $(INSTALL_SCRIPT) fixinc.sh $(DESTDIR)$(itoolsdir)/fixinc.sh
 - $(INSTALL_PROGRAM) fixincl@EXEEXT@ \
 -   $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
 + $(INSTALL_PROGRAM) $(FI) \
 +   $(DESTDIR)$(itoolsdir)/$(FI)

 This should now fit on a single line.

 ok

   $(INSTALL_SCRIPT) mkheaders $(DESTDIR)$(itoolsdir)/mkheaders

  install-strip: install
   test -z '$(STRIP)' \
 -   || $(STRIP) $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
 +   || $(STRIP) $(DESTDIR)$(itoolsdir)/$(FI)

 changed this too to be on a single line now.

  .PHONY: all check install install-strip
  .PHONY: dvi pdf info html install-pdf install-info install-html

 Changelog remains the same.
 I have been using the attached updated patch since April; ok for trunk?


Re: [PATCH 2/3] libstdc++-v3: ::tmpnam depends on uClibc SUSV4_LEGACY

2013-12-20 Thread Bernhard Reutner-Fischer
On 13 November 2013 18:56, Jonathan Wakely jwakely@gmail.com wrote:
 On 13 November 2013 09:22, Bernhard Reutner-Fischer wrote:
 On 11 November 2013 12:30, Jonathan Wakely jwakely@gmail.com wrote:
 How does __UCLIBC_SUSV4_LEGACY__ get defined?  We'd have a problem if
 users defined that at configure time but not later when using the
 library.
 That would be defined by uClibc's configury, but the latest
 commit-6f2faa2 i attached does not mention this anymore, but does
 the check in a libc-agnostic manner?

 Yes, but I was concerned about whether the value of that macro can
 change between configuring libstdc++ and users compiling code using
 libstdc++.  If it could change (e.g. by users compiling with
 -D_POSIX_C_SOURCE=200112L or some other feature test macro) then the
 value of _GLIBCXX_USE_TMPNAM (which doesn't change) would be
 unreliable and we could end up with a using ::tmpnam in the library
 that causes errors when users compile.

 If it's set when configuring uClibc then it is a constant for a given
 libstdc++ installation, so the value of _GLIBCXX_USE_TMPNAM is
 reliable.  In that case your change is OK to commit (with or without
 the XYZ change) - thanks.

It is a constant, yes. I will push this after another round of regtests
against current trunk as time permits.

thanks,


Re: [PATCH] Time profiler - phase 2

2013-12-20 Thread Iain Sandoe
Hi Martin,

Thanks for working on this!
However, you have introduced some problems, including a bootstrap failure on 
Darwin.

On 16 Dec 2013, at 10:13, Jan Hubicka wrote:

 Hello,
   there's updated version of the patch.
 
 Tested on x86_64 with enable bootstrap.
 
 Martin
 
 On 16 December 2013 00:21, Jan Hubicka hubi...@ucw.cz wrote:
 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 93e857df..d5a0ac8 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,14 @@
 +2013-12-15  Martin Liska  marxin.li...@gmail.com
 + Jan Hubicka  j...@suse.cz
 +
 + * cgraphunit.c (node_cmp): New function.
 + (expand_all_functions): Function ordering added.
 + * common.opt: New profile based function reordering flag introduced.
 + * lto-partition.c: Support for time profile added.
 + * lto.c: Likewise.
 + * predict.c (handle_missing_profiles): Time profile handled in
 +   missing profiles.
 +

There is no mention of config/darwin.c here ^


diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index ea056a9..4267c89 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -3621,9 +3621,16 @@ darwin_function_section (tree decl, enum node_frequency freq,
  unlikely executed (this happens especially with function splitting
  where we can split away unnecessary parts of static constructors).  */
   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
-return (weak)
-   ? darwin_sections[text_startup_coal_section]
-   : darwin_sections[text_startup_section];
+  {
+/* If we do have a profile or(and) LTO phase is executed, we do not need
+   these ELF section.  */
+if (!in_lto_p || !flag_profile_values)
+  return (weak)
+ ? darwin_sections[text_startup_coal_section]
+ : darwin_sections[text_startup_section];
+else
+  return text_section;
+  }
 
   /* Similarly for exit.  */
   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
@@ -3640,10 +3647,15 @@ darwin_function_section (tree decl, enum node_frequency freq,
: darwin_sections[text_cold_section];
break;
   case NODE_FREQUENCY_HOT:
-   return (weak)
-   ? darwin_sections[text_hot_coal_section]
-   : darwin_sections[text_hot_section];
-   break;
+  {
+/* If we do have a profile or(and) LTO phase is executed, we do not need
+   these ELF section.  */
+if (!in_lto_p || !flag_profile_values)
+  return (weak)
+  ? darwin_sections[text_hot_coal_section]
+  : darwin_sections[text_hot_section];
+break;
+  }
   default:
return (weak)
? darwin_sections[text_coal_section]

-=

This is NOT OK for darwin, it breaks bootstrap with  pr59541.

If one fixes that trivial issue - then there is a lot of test-suite fallout of 
profiled code.

Please explain what the logic is intended to implement, and ensure that all 
the code paths have equivalent treatment; it all looks inconsistent to me at 
present.

I am sure that one of the darwin folks will be happy to review (or at least 
test) changes.

cheers
Iain



[PATCH, libiberty] Remove malloc/realloc from demangler (was: Add a couple of missing casts)

2013-12-20 Thread Gary Benson
Ian Lance Taylor wrote:
 On Wed, Nov 13, 2013 at 7:30 AM, Gary Benson gben...@redhat.com wrote:
  Richard Biener wrote:
   On Tue, Nov 12, 2013 at 8:55 PM, Ian Lance Taylor i...@google.com wrote:
On Tue, Nov 12, 2013 at 11:24 AM, Uros Bizjak ubiz...@gmail.com wrote:

 This was uncovered by x86 lto-profiledbootstrap. The patch allows
 lto-profiledbootstrap to proceed further.

 2013-11-12  Uros Bizjak  ubiz...@gmail.com

 * cp-demangle.c (d_copy_templates): Cast result of malloc
 to (struct d_print_template *).
 (d_print_comp): Cast result of realloc to (struct d_saved_scope *).

 Tested on x86_64-pc-linux-gnu.

 OK for mainline?
   
The patch is OK, but this code is troubling.  I obviously
should have looked at it earlier.  The C++ demangler is
sometimes used in panic situations, when malloc is not
available.  The interface was designed to be usable without
requiring malloc, by passing in a sufficiently large buffer.
I'm concerned that we apparently now require malloc to work.
  
   That indeed looks like an important regression - Gary, can you
   please work to fix this?
 
  I'm on it.
 
 Thanks.  See also the cplus_demangle_print_callback function.

The below patch should remedy this.

After today I will likely not check this email until January 5.
If this patch gets approved before then and you want it in,
please either ping me on gary@ the domain in my sig or have
somebody else commit it for me.

Thanks,
Gary

-- 
http://gbenson.net/

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 825ddd2..78c1433 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,20 @@
+2013-12-20  Gary Benson  gben...@redhat.com
+
+   * cp-demangle.c (struct d_print_info): New fields
+   next_saved_scope, copy_templates, next_copy_template and
+   num_copy_templates.
+   (d_count_templates): New function.
+   (d_print_init): New parameter dc.
+   Estimate numbers of templates and scopes required.
+   (d_print_free): Removed function.
+   (cplus_demangle_print_callback): Allocate stack for
+   templates and scopes.  Removed call to d_print_free.
+   (d_copy_templates): Removed function.
+   (d_save_scope): New function.
+   (d_get_saved_scope): Likewise.
+   (d_print_comp): Replace state saving/restoring code with
+   calls to d_save_scope and d_get_saved_scope.
+
 2013-11-22  Cary Coutant  ccout...@google.com
 
PR other/59195
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 029151e..de08d94 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -329,8 +329,16 @@ struct d_print_info
   unsigned long int flush_count;
   /* Array of saved scopes for evaluating substitutions.  */
   struct d_saved_scope *saved_scopes;
+  /* Index of the next unused saved scope in the above array.  */
+  int next_saved_scope;
   /* Number of saved scopes in the above array.  */
   int num_saved_scopes;
+  /* Array of templates for saving into scopes.  */
+  struct d_print_template *copy_templates;
+  /* Index of the next unused copy template in the above array.  */
+  int next_copy_template;
+  /* Number of copy templates in the above array.  */
+  int num_copy_templates;
   /* The nearest enclosing template, if any.  */
   const struct demangle_component *current_template;
 };
@@ -475,7 +483,8 @@ static void
 d_growable_string_callback_adapter (const char *, size_t, void *);
 
 static void
-d_print_init (struct d_print_info *, demangle_callbackref, void *);
+d_print_init (struct d_print_info *, demangle_callbackref, void *,
+ const struct demangle_component *);
 
 static inline void d_print_error (struct d_print_info *);
 
@@ -3770,11 +3779,141 @@ d_growable_string_callback_adapter (const char *s, size_t l, void *opaque)
   d_growable_string_append_buffer (dgs, s, l);
 }
 
+/* Walk the tree, counting the number of templates encountered, and
+   the number of times a scope might be saved.  These counts will be
+   used to allocate data structures for d_print_comp, so the logic
+   here must mirror the logic d_print_comp will use.  It is not
+   important that the resulting numbers are exact, so long as they
+   are larger than the actual numbers encountered.  */
+
+static void
+d_count_templates_scopes (int *num_templates, int *num_scopes,
+ const struct demangle_component *dc)
+{
+  if (dc == NULL)
+return;
+
+  switch (dc->type)
+{
+case DEMANGLE_COMPONENT_NAME:
+case DEMANGLE_COMPONENT_TEMPLATE_PARAM:
+case DEMANGLE_COMPONENT_FUNCTION_PARAM:
+case DEMANGLE_COMPONENT_SUB_STD:
+case DEMANGLE_COMPONENT_BUILTIN_TYPE:
+case DEMANGLE_COMPONENT_OPERATOR:
+case DEMANGLE_COMPONENT_CHARACTER:
+case DEMANGLE_COMPONENT_NUMBER:
+case DEMANGLE_COMPONENT_UNNAMED_TYPE:
+  break;
+
+case DEMANGLE_COMPONENT_TEMPLATE:
+  (*num_templates)++;

[committed] Add testcase for PR59413

2013-12-20 Thread Jakub Jelinek
Hi!

The bug in this PR has been introduced by my r204516 change and fixed
by r205884 (PR59417) fix.

I've committed the testcase as obvious so that we can close the PR.

2013-12-20  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/59413
* gcc.c-torture/execute/pr59413.c: New test.

--- gcc/testsuite/gcc.c-torture/execute/pr59413.c.jj  2013-12-20 13:59:46.518654547 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr59413.c  2013-12-20 13:59:29.0 +0100
@@ -0,0 +1,21 @@
+/* PR tree-optimization/59413 */
+
+typedef unsigned int uint32_t;
+
+uint32_t a;
+int b;
+
+int
+main ()
+{
+  uint32_t c;
+  for (a = 7; a <= 1; a++)
+    {
+      char d = a;
+      c = d;
+      b = a == c;
+    }
+  if (a != 7)
+    __builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH] merge auto_vec and stack_vec

2013-12-20 Thread Richard Biener
On Fri, Dec 20, 2013 at 12:18 AM, Trevor Saunders
trev.saund...@gmail.com wrote:
 As discussed in http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02808.html

 bootstrap + same regression tests as previous rev, ok?

Ok.

Thanks,
Richard.

 2013-12-19  Trevor saunders  tsaund...@mozilla.com

 gcc/
 * vec.h (stack_vec): Convert to a template specialization of
 auto_vec.
 * config/i386/i386.c, df-scan.c, function.c, genautomata.c,
 gimplify.c, graphite-clast-to-gimple.c, graphite-dependences.c,
 graphite-scop-detection.c, graphite-sese-to-poly.c, hw-doloop.c,
 trans-mem.c, tree-call-cdce.c, tree-data-ref.c, tree-dfa.c,
 tree-if-conv.c, tree-inline.c, tree-loop-distribution.c,
 tree-parloops.c, tree-predcom.c, tree-ssa-alias.c,
 tree-ssa-loop-ivcanon.c, tree-ssa-phiopt.c, tree-ssa-threadedge.c,
 tree-ssa-uncprop.c, tree-vect-loop.c, tree-vect-patterns.c,
 tree-vect-slp.c, tree-vect-stmts.c, var-tracking.c: Adjust.

 cp/
 * semantics.c (build_anon_member_initialization): Replace
 stack_vec<T, N> with auto_vec<T, N>.

 ada/
 * gcc-interface/decl.c (components_to_record): Replace stack_vec with
 auto_vec.

 Trev


 diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
 index a80d1a9..ad129c6 100644
 --- a/gcc/ada/gcc-interface/decl.c
 +++ b/gcc/ada/gcc-interface/decl.c
 @@ -7010,7 +7010,7 @@ components_to_record (tree gnu_record_type, Node_Id 
 gnat_component_list,
tree gnu_union_type, gnu_union_name;
tree this_first_free_pos, gnu_variant_list = NULL_TREE;
bool union_field_needs_strict_alignment = false;
 -  stack_vec <vinfo_t, 16> variant_types;
 +  auto_vec <vinfo_t, 16> variant_types;
vinfo_t *gnu_variant;
unsigned int variants_align = 0;
unsigned int i;
 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 5dde632..e3f8b4d 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -30809,7 +30809,7 @@ ix86_generate_version_dispatcher_body (void *node_p)

push_cfun (DECL_STRUCT_FUNCTION (resolver_decl));

 -  stack_vec<tree, 2> fn_ver_vec;
 +  auto_vec<tree, 2> fn_ver_vec;

   for (versn_info = node_version_info->next; versn_info;
        versn_info = versn_info->next)
 diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
 index 7c1b18e..30426d8 100644
 --- a/gcc/cp/semantics.c
 +++ b/gcc/cp/semantics.c
 @@ -7439,7 +7439,7 @@ build_anon_member_initialization (tree member, tree 
 init,
   to build up the initializer from the outside in so that we can reuse
   previously built CONSTRUCTORs if this is, say, the second field in an
   anonymous struct.  So we use a vec as a stack.  */
 -  stack_vec<tree, 2> fields;
 +  auto_vec<tree, 2> fields;
do
  {
fields.safe_push (TREE_OPERAND (member, 1));
 diff --git a/gcc/df-scan.c b/gcc/df-scan.c
 index eb7e4d4..dcb4566 100644
 --- a/gcc/df-scan.c
 +++ b/gcc/df-scan.c
 @@ -86,10 +86,10 @@ static HARD_REG_SET elim_reg_set;

  struct df_collection_rec
  {
 -  stack_vec<df_ref, 128> def_vec;
 -  stack_vec<df_ref, 32> use_vec;
 -  stack_vec<df_ref, 32> eq_use_vec;
 -  stack_vec<df_mw_hardreg_ptr, 32> mw_vec;
 +  auto_vec<df_ref, 128> def_vec;
 +  auto_vec<df_ref, 32> use_vec;
 +  auto_vec<df_ref, 32> eq_use_vec;
 +  auto_vec<df_mw_hardreg_ptr, 32> mw_vec;
  };

  static df_ref df_null_ref_rec[1];
 diff --git a/gcc/function.c b/gcc/function.c
 index 2c8d781..95f7ed8 100644
 --- a/gcc/function.c
 +++ b/gcc/function.c
 @@ -4114,7 +4114,7 @@ reorder_blocks (void)
if (block == NULL_TREE)
  return;

 -  stack_vec<tree, 10> block_stack;
 +  auto_vec<tree, 10> block_stack;

/* Reset the TREE_ASM_WRITTEN bit for all blocks.  */
clear_block_marks (block);
 diff --git a/gcc/genautomata.c b/gcc/genautomata.c
 index 5580c69..aa05541 100644
 --- a/gcc/genautomata.c
 +++ b/gcc/genautomata.c
 @@ -3349,7 +3349,7 @@ uniq_sort_alt_states (alt_state_t alt_states_list)
   if (alt_states_list->next_alt_state == 0)
     return alt_states_list;

 -  stack_vec<alt_state_t, 150> alt_states;
 +  auto_vec<alt_state_t, 150> alt_states;
   for (curr_alt_state = alt_states_list;
        curr_alt_state != NULL;
        curr_alt_state = curr_alt_state->next_alt_state)
 @@ -5484,7 +5484,7 @@ form_ainsn_with_same_reservs (automaton_t automaton)
  {
ainsn_t curr_ainsn;
size_t i;
 -  stack_vec<ainsn_t, 150> last_insns;
 +  auto_vec<ainsn_t, 150> last_insns;

   for (curr_ainsn = automaton->ainsn_list;
        curr_ainsn != NULL;
 @@ -,7 +,7 @@ make_automaton (automaton_t automaton)
state_t state;
state_t start_state;
state_t state2;
 -  stack_vec<state_t, 150> state_stack;
 +  auto_vec<state_t, 150> state_stack;
int states_n;
reserv_sets_t reservs_matter = form_reservs_matter (automaton);

 diff --git a/gcc/gimplify.c b/gcc/gimplify.c
 index 2e8c657..d51d1b8 100644
 --- a/gcc/gimplify.c
 +++ b/gcc/gimplify.c
 @@ -1846,7 +1846,7 @@ gimplify_compound_lval 

[PATCH, nds32] Committed: Fix inaccurate alignment checking when passing BLKmode argument.

2013-12-20 Thread Chung-Ju Wu
Hi, all,

There is a problem in nds32.h in how the available register number for
passing a BLKmode argument is determined.  The original check only refers
to the NDS32_NEED_N_REGS_FOR_ARG macro, but that is not sufficient to
decide whether an odd or even register number should be used.  The type
alignment must be checked as well.

We define a new macro, NDS32_MODE_TYPE_ALIGN, and rewrite the
NDS32_AVAILABLE_REGNUM_FOR_ARG definition.

The patch for nds32.c and nds32.h is as follows:


Index: gcc/config/nds32/nds32.h
===
--- gcc/config/nds32/nds32.h    (revision 206139)
+++ gcc/config/nds32/nds32.h    (working copy)
@@ -126,6 +126,11 @@
 #define NDS32_SINGLE_WORD_ALIGN_P(value) (((value) & 0x03) == 0)
 #define NDS32_DOUBLE_WORD_ALIGN_P(value) (((value) & 0x07) == 0)
 
+/* Get alignment according to mode or type information.
+   When 'type' is nonnull, there is no need to look at 'mode'.  */
+#define NDS32_MODE_TYPE_ALIGN(mode, type) \
+  (type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode))
+
 /* Round X up to the nearest double word.  */
 #define NDS32_ROUND_UP_DOUBLE_WORD(value)  (((value) + 7) & ~7)
 
@@ -142,12 +147,18 @@
 /* This macro is used to return the register number for passing argument.
We need to obey the following rules:
  1. If it is required MORE THAN one register,
-make sure the register number is a even value.
+we need to further check if it really needs to be
+aligned on double words.
+  a) If double word alignment is necessary,
+ the register number must be even value.
+  b) Otherwise, the register number can be odd or even value.
  2. If it is required ONLY one register,
 the register number can be odd or even value.  */
-#define NDS32_AVAILABLE_REGNUM_FOR_ARG(reg_offset, mode, type) \
-  ((NDS32_NEED_N_REGS_FOR_ARG (mode, type) > 1)\
-   ? (((reg_offset) + NDS32_GPR_ARG_FIRST_REGNUM + 1) & ~1)\
+#define NDS32_AVAILABLE_REGNUM_FOR_ARG(reg_offset, mode, type)  \
+  ((NDS32_NEED_N_REGS_FOR_ARG (mode, type) > 1) \
+   ? ((NDS32_MODE_TYPE_ALIGN (mode, type) > PARM_BOUNDARY)  \
+      ? (((reg_offset) + NDS32_GPR_ARG_FIRST_REGNUM + 1) & ~1)  \
+      : ((reg_offset) + NDS32_GPR_ARG_FIRST_REGNUM))\
: ((reg_offset) + NDS32_GPR_ARG_FIRST_REGNUM))
 
 /* This macro is to check if there are still available registers

Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c    (revision 206139)
+++ gcc/config/nds32/nds32.c    (working copy)
@@ -1438,8 +1438,8 @@
 {
   unsigned int align;
 
-  /* When 'type' is nonnull, there is no need to look at 'mode'.  */
-  align = (type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode));
+  /* Pick up the alignment according to the mode or type.  */
+  align = NDS32_MODE_TYPE_ALIGN (mode, type);
 
   return (align > PARM_BOUNDARY);
 }
@@ -1853,10 +1853,10 @@
   if (NDS32_ARG_PASS_IN_REG_P (cum->reg_offset, mode, type))
     {
       /* Pick up the next available register number.  */
-      return gen_rtx_REG (mode,
-                          NDS32_AVAILABLE_REGNUM_FOR_ARG (cum->reg_offset,
-                                                          mode,
-                                                          type));
+      unsigned int regno;
+
+      regno = NDS32_AVAILABLE_REGNUM_FOR_ARG (cum->reg_offset, mode, type);
+      return gen_rtx_REG (mode, regno);
 }
   else
 {


And the gcc/ChangeLog is as below:

+2013-12-20  Chung-Ju Wu  jasonw...@gmail.com
+
+   * config/nds32/nds32.h (NDS32_MODE_TYPE_ALIGN): New macro.
+   (NDS32_AVAILABLE_REGNUM_FOR_ARG): Use more accurate alignment checking
+   to determine available register number.
+   * config/nds32/nds32.c (nds32_needs_double_word_align): Use new
+   macro NDS32_MODE_TYPE_ALIGN.
+   (nds32_function_arg): Refine code layout.
+
 2013-12-19  Jeff Law  l...@redhat.com

* doc/invoke.texi: (dump-rtl-ree): Fix typo and clarify ree


Bootstrapped and tested on nds32le-elf/nds32be-elf target.
Committed as Rev.206142:
  http://gcc.gnu.org/r206142


Best regards,
jasonwucj


Re: [PING] RE: [PATCH] Vectorization for store with negative step

2013-12-20 Thread Richard Biener
On Fri, Dec 20, 2013 at 11:09 AM, Bingfeng Mei b...@broadcom.com wrote:
 OK to commit?

Ok.

Thanks,
Richard.

 Thanks,
 Bingfeng
 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
 Behalf Of Bingfeng Mei
 Sent: 18 December 2013 16:25
 To: Jakub Jelinek
 Cc: Richard Biener; gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH] Vectorization for store with negative step

 Hi, Jakub,
 Sorry for all the formatting issues. Haven't submitted a patch for a while :-).
 Please find the updated patch.

 Thanks,
 Bingfeng

 -Original Message-
 From: Jakub Jelinek [mailto:ja...@redhat.com]
 Sent: 18 December 2013 13:38
 To: Bingfeng Mei
 Cc: Richard Biener; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Vectorization for store with negative step

 On Wed, Dec 18, 2013 at 01:31:05PM +, Bingfeng Mei wrote:
 Index: gcc/ChangeLog
 ===
 --- gcc/ChangeLog   (revision 206016)
 +++ gcc/ChangeLog   (working copy)
 @@ -1,3 +1,9 @@
 +2013-12-18  Bingfeng Mei  b...@broadcom.com
 +
 +   PR tree-optimization/59544
 +* tree-vect-stmts.c (perm_mask_for_reverse): Move before

 This should be a tab instead of 8 spaces.

 +   vectorizable_store. (vectorizable_store): Handle negative step.

 Newline and tab after store., rather than space.

 Property changes on: gcc/testsuite/gcc.target/i386/pr59544.c
 ___
 Added: svn:executable
+ *

 Please don't add such bogus property.  Testcases aren't executable.

 Index: gcc/testsuite/ChangeLog
 ===
 --- gcc/testsuite/ChangeLog (revision 206016)
 +++ gcc/testsuite/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2013-12-18  Bingfeng Mei  b...@broadcom.com
 +
 +   PR tree-optimization/59544
 +   * gcc.target/i386/pr59544.c: New test

 Missing dot at the end of line.
 +
  2013-12-16  Jakub Jelinek  ja...@redhat.com

 PR middle-end/58956
 Index: gcc/tree-vect-stmts.c
 ===
 --- gcc/tree-vect-stmts.c   (revision 206016)
 +++ gcc/tree-vect-stmts.c   (working copy)
 @@ -4859,6 +4859,25 @@ ensure_base_align (stmt_vec_info stmt_in
  }


 +/* Given a vector type VECTYPE returns the VECTOR_CST mask that implements
 +   reversal of the vector elements.  If that is impossible to do,
 +   returns NULL.  */
 +
 +static tree
 +perm_mask_for_reverse (tree vectype)
 +{
 +  int i, nunits;
 +  unsigned char *sel;
 +
 +  nunits = TYPE_VECTOR_SUBPARTS (vectype);
 +  sel = XALLOCAVEC (unsigned char, nunits);
 +
 +  for (i = 0; i < nunits; ++i)
 +    sel[i] = nunits - 1 - i;
 +
 +  return vect_gen_perm_mask (vectype, sel);
 +}
 +
  /* Function vectorizable_store.

 Check if STMT defines a non scalar data-ref (array/pointer/structure) that
 @@ -4902,6 +4921,8 @@ vectorizable_store (gimple stmt, gimple_
   vec<tree> oprnds = vNULL;
   vec<tree> result_chain = vNULL;
   bool inv_p;
 +  bool negative = false;
 +  tree offset = NULL_TREE;
   vec<tree> vec_oprnds = vNULL;
bool slp = (slp_node != NULL);
unsigned int vec_num;
 @@ -4976,16 +4997,38 @@ vectorizable_store (gimple stmt, gimple_
if (!STMT_VINFO_DATA_REF (stmt_info))
  return false;

 -  if (tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
 -   ? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
 -   size_zero_node) < 0)
 +  negative = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
 +? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
 +size_zero_node) < 0;

 The formatting looks wrong, do:
   negative
     = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
                             ? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
                             size_zero_node) < 0;
 instead.

 +  if (negative && ncopies > 1)
  {
    if (dump_enabled_p ())
      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 - "negative step for store.\n");
 + "multiple types with negative step.");
return false;
  }

 +  if (negative)
 +{
 +  gcc_assert (!grouped_store);
 +  alignment_support_scheme = vect_supportable_dr_alignment (dr, false);
 +  if (alignment_support_scheme != dr_aligned
 +   && alignment_support_scheme != dr_unaligned_supported)

 Lots of places where you use 8 spaces instead of tab, please fix.
 +offset = size_int (-TYPE_VECTOR_SUBPARTS (vectype) + 1);
 +
if (store_lanes_p)
  aggr_type = build_array_type_nelts (elem_type, vec_num * nunits);
else
 @@ -5200,7 +5246,7 @@ vectorizable_store (gimple stmt, gimple_
 dataref_ptr
   = vect_create_data_ref_ptr (first_stmt, aggr_type,
  

Re: [PATCH, ARM, v2] Fix PR target/59142: internal compiler error while compiling OpenCV 2.4.7

2013-12-20 Thread Richard Earnshaw
On 19/12/13 17:40, Charles Baylis wrote:
 On 19 December 2013 16:13, Richard Earnshaw rearn...@arm.com wrote:

 OK with that change.
 
 Thanks.
 
 The bugzilla entry is targeted at 4.8, but it is a latent problem
 which affects 4.7 too.
 
 Is it ok for 4.8, and should it be considered for 4.7?
 

Yes, provided it passes testing on those releases.

R.



Re: [PATCH][ARM] Implement CRC32 intrinsics for AArch32 in ARMv8-A

2013-12-20 Thread Kyrill Tkachov

On 19/12/13 17:58, Kyrill Tkachov wrote:

On 18/12/13 15:32, Ramana Radhakrishnan wrote:

On Tue, Dec 3, 2013 at 1:46 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Ping?
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02351.html

Thanks,
Kyrill

Ok if no objections in 24 hours.

Thanks Ramana, I've committed it as r206128 together with this obvious change
that sets the conds attribute on the md pattern.


I just noticed that I committed the first version of the patch posted at:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02250.html

instead of the second version posted at:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02351.html

that was approved. The only difference is that the second one has underscores
before the variable names in arm_acle.h.


I've committed the attached patch to add them as obvious with r206149. Tested 
arm-none-eabi on a model.


Sorry for the noise,
Kyrill

2013-12-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm_acle.h: Add underscores before variables.



Kyrill



Ramana


On 26/11/13 09:44, Kyrill Tkachov wrote:

Ping?

Thanks,
Kyrill

On 19/11/13 17:04, Kyrill Tkachov wrote:

On 19/11/13 16:26, Joseph S. Myers wrote:

In any target header installed for user use, such as arm_acle.h, you need
to be namespace-clean.  In this case, that means you need to use
implementation-namespace identifiers such as __a, __b and __d in case the
user has defined macros with names such as a, b and d (unless the ACLE
says that identifiers a, b and d are in the implementation's namespace
when this header is included, which would be a very odd thing for it to
do).


Hi Joseph,

Thanks for the catch. ACLE doesn't expect a, b, d to be in the
implementation namespace. I've added underscores before them.

Made sure tests pass.

Revised patch attached.
How's this?

Kyrill

gcc/
2013-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * Makefile.in (TEXI_GCC_FILES): Add arm-acle-intrinsics.texi.
 * config.gcc (extra_headers): Add arm_acle.h.
 * config/arm/arm.c (FL_CRC32): Define.
 (arm_have_crc): Likewise.
 (arm_option_override): Set arm_have_crc.
 (arm_builtins): Add CRC32 builtins.
 (bdesc_2arg): Likewise.
 (arm_init_crc32_builtins): New function.
 (arm_init_builtins): Initialise CRC32 builtins.
 (arm_file_start): Handle architecture extensions.
 * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define
__ARM_FEATURE_CRC32.
 Define __ARM_32BIT_STATE.
 (TARGET_CRC32): Define.
 * config/arm/arm-arches.def: Add armv8-a+crc.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm.md (type): Add crc.
 (crc_variant): New insn.
 * config/arm/arm_acle.h: New file.
 * config/arm/iterators.md (CRC): New int iterator.
 (crc_variant, crc_mode): New int attributes.
 * config/arm/unspecs.md (UNSPEC_CRC32B, UNSPEC_CRC32H,
UNSPEC_CRC32W,
 UNSPEC_CRC32CB, UNSPEC_CRC32CH, UNSPEC_CRC32CW): New unspecs.
 * doc/invoke.texi: Document -march=armv8-a+crc option.
 * doc/extend.texi: Document ACLE intrinsics.
 * doc/arm-acle-intrinsics.texi: New.


gcc/testsuite
2013-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * lib/target-supports.exp (add_options_for_arm_crc): New
procedure.
 (check_effective_target_arm_crc_ok_nocache): Likewise.
 (check_effective_target_arm_crc_ok): Likewise.
 * gcc.target/arm/acle/: New directory.
 * gcc.target/arm/acle/acle.exp: New.
 * gcc.target/arm/acle/crc32b.c: New test.
 * gcc.target/arm/acle/crc32h.c: Likewise.
 * gcc.target/arm/acle/crc32w.c: Likewise.
 * gcc.target/arm/acle/crc32d.c: Likewise.
 * gcc.target/arm/acle/crc32cb.c: Likewise.
 * gcc.target/arm/acle/crc32ch.c: Likewise.
 * gcc.target/arm/acle/crc32cw.c: Likewise.
 * gcc.target/arm/acle/crc32cd.c: Likewise.


Index: gcc/config/arm/arm_acle.h
===
--- gcc/config/arm/arm_acle.h	(revision 206132)
+++ gcc/config/arm/arm_acle.h	(working copy)
@@ -34,60 +34,60 @@
 
 #ifdef __ARM_FEATURE_CRC32
 __extension__ static __inline uint32_t __attribute__ ((__always_inline__))
-__crc32b (uint32_t a, uint8_t b)
+__crc32b (uint32_t __a, uint8_t __b)
 {
-  return __builtin_arm_crc32b (a, b);
+  return __builtin_arm_crc32b (__a, __b);
 }
 
 __extension__ static __inline uint32_t __attribute__ ((__always_inline__))
-__crc32h (uint32_t a, uint16_t b)
+__crc32h (uint32_t __a, uint16_t __b)
 {
-  return __builtin_arm_crc32h (a, b);
+  return __builtin_arm_crc32h (__a, __b);
 }
 
 __extension__ static __inline uint32_t __attribute__ ((__always_inline__))
-__crc32w (uint32_t a, uint32_t b)
+__crc32w (uint32_t __a, uint32_t __b)
 {
-  return __builtin_arm_crc32w (a, b);
+  return __builtin_arm_crc32w (__a, __b);
 }
 
 #ifdef __ARM_32BIT_STATE
 __extension__ static __inline uint32_t 

[Patch] libgcov.c re-factoring

2013-12-20 Thread Teresa Johnson
On Tue, Dec 17, 2013 at 7:48 AM, Teresa Johnson tejohn...@google.com wrote:
 On Mon, Dec 16, 2013 at 2:48 PM, Xinliang David Li davi...@google.com wrote:
 Ok -- gcov_write_counter and gcov_write_tag_length are qualified as
 low level primitives for basic gcov format and probably should be kept
 in gcov-io.c.

 gcov_rewrite is pretty much libgcov runtime implementation details so I
 think it should be moved out. gcov_write_summary is not related to
 gcov low level format either, neither is gcov_seek.  Ok for them to be
 moved?

 After looking at these some more, with the idea that gcov-io.c should
 encapsulate the lower level IO routines, then I think all of these
 (including gcov_rewrite) should remain in gcov-io.c. I think
 gcov_write_summary belongs there since all of the other gcov_write_*
 are there. And gcov_seek and gcov_rewrite are both adjusting gcov_var
 fields to affect the file IO operations. And there are currently no
 references to gcov_var within libgcc/libgcov* files.

 So I think we should leave the patch as-is. Honza, is the current
 patch ok for trunk?

Ping. Patch inlined below.

Thanks,
Teresa


2013-12-11  Rong Xu  x...@google.com

* gcc/gcov-io.c (gcov_var): Move from gcov-io.h.
(gcov_position): Ditto.
(gcov_is_error): Ditto.
(gcov_rewrite): Ditto.
* gcc/gcov-io.h: Refactor. Move gcov_var to gcov-io.c, and libgcov
only part to libgcc/libgcov.h.
* libgcc/libgcov-driver.c: Use libgcov.h.
(buffer_fn_data): Use xmalloc instead of malloc.
(gcov_exit_merge_gcda): Ditto.
* libgcc/libgcov-driver-system.c (allocate_filename_struct): Ditto.
* libgcc/libgcov.h: New common header files for libgcov-*.h.
* libgcc/libgcov-interface.c: Use libgcov.h
* libgcc/libgcov-merge.c: Ditto.
* libgcc/libgcov-profiler.c: Ditto.
* libgcc/Makefile.in: Add dependence to libgcov.h

Index: gcc/gcov-io.c
===
--- gcc/gcov-io.c   (revision 205895)
+++ gcc/gcov-io.c   (working copy)
@@ -36,6 +36,36 @@ static const gcov_unsigned_t *gcov_read_words (uns
 static void gcov_allocate (unsigned);
 #endif

+GCOV_LINKAGE struct gcov_var gcov_var;
+
+/* Save the current position in the gcov file.  */
+static inline gcov_position_t
+gcov_position (void)
+{
+  gcc_assert (gcov_var.mode > 0);
+  return gcov_var.start + gcov_var.offset;
+}
+
+/* Return nonzero if the error flag is set.  */
+static inline int
+gcov_is_error (void)
+{
+  return gcov_var.file ? gcov_var.error : 1;
+}
+
+#if IN_LIBGCOV
+/* Move to beginning of file and initialize for writing.  */
+GCOV_LINKAGE inline void
+gcov_rewrite (void)
+{
+  gcc_assert (gcov_var.mode > 0);
+  gcov_var.mode = -1;
+  gcov_var.start = 0;
+  gcov_var.offset = 0;
+  fseek (gcov_var.file, 0L, SEEK_SET);
+}
+#endif
+
 static inline gcov_unsigned_t from_file (gcov_unsigned_t value)
 {
 #if !IN_LIBGCOV
Index: gcc/gcov-io.h
===
--- gcc/gcov-io.h   (revision 205895)
+++ gcc/gcov-io.h   (working copy)
@@ -164,51 +164,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #ifndef GCC_GCOV_IO_H
 #define GCC_GCOV_IO_H

-#if IN_LIBGCOV
-/* About the target */
-
-#if BITS_PER_UNIT == 8
-typedef unsigned gcov_unsigned_t __attribute__ ((mode (SI)));
-typedef unsigned gcov_position_t __attribute__ ((mode (SI)));
-#if LONG_LONG_TYPE_SIZE > 32
-typedef signed gcov_type __attribute__ ((mode (DI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (DI)));
-#else
-typedef signed gcov_type __attribute__ ((mode (SI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (SI)));
-#endif
-#else
-#if BITS_PER_UNIT == 16
-typedef unsigned gcov_unsigned_t __attribute__ ((mode (HI)));
-typedef unsigned gcov_position_t __attribute__ ((mode (HI)));
-#if LONG_LONG_TYPE_SIZE > 32
-typedef signed gcov_type __attribute__ ((mode (SI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (SI)));
-#else
-typedef signed gcov_type __attribute__ ((mode (HI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (HI)));
-#endif
-#else
-typedef unsigned gcov_unsigned_t __attribute__ ((mode (QI)));
-typedef unsigned gcov_position_t __attribute__ ((mode (QI)));
-#if LONG_LONG_TYPE_SIZE > 32
-typedef signed gcov_type __attribute__ ((mode (HI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (HI)));
-#else
-typedef signed gcov_type __attribute__ ((mode (QI)));
-typedef unsigned gcov_type_unsigned __attribute__ ((mode (QI)));
-#endif
-#endif
-#endif
-
-
-#if defined (TARGET_POSIX_IO)
-#define GCOV_LOCKED 1
-#else
-#define GCOV_LOCKED 0
-#endif
-
-#else /* !IN_LIBGCOV */
+#ifndef IN_LIBGCOV
 /* About the host */

 typedef unsigned gcov_unsigned_t;
@@ -231,48 +187,10 @@ typedef unsigned HOST_WIDEST_INT gcov_type_unsigne
 #define GCOV_LOCKED 0
 #endif

-#endif /* !IN_LIBGCOV */
-
-/* In gcov we want function linkage to be static.  In the 

[C++ PATCH] Don't ICE on TYPE_BINFO (PR c++/59111)

2013-12-20 Thread Marek Polacek
We ICEd on invalid testcases with auto, because lookup_conversions
got template_type_parm as a parameter and the TYPE_BINFO didn't like
it.  Fixed by checking for RECORD_OR_UNION_TYPE_P first.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-12-20  Marek Polacek  pola...@redhat.com

PR c++/59111
cp/
* search.c (lookup_conversions): Return NULL_TREE if
!RECORD_OR_UNION_TYPE_P.
testsuite/
* g++.dg/cpp0x/pr59111.C: New test.
* g++.dg/cpp1y/pr59110.C: New test.

--- gcc/cp/search.c.mp  2013-12-20 15:04:51.296753249 +0100
+++ gcc/cp/search.c 2013-12-20 15:04:55.552768259 +0100
@@ -2506,7 +2506,7 @@ lookup_conversions (tree type)
   tree list = NULL_TREE;
 
   complete_type (type);
-  if (!TYPE_BINFO (type))
+  if (!RECORD_OR_UNION_TYPE_P (type) || !TYPE_BINFO (type))
 return NULL_TREE;
 
   lookup_conversions_r (TYPE_BINFO (type), 0, 0,
--- gcc/testsuite/g++.dg/cpp0x/pr59111.C.mp  2013-12-20 14:44:16.871202015 +0100
+++ gcc/testsuite/g++.dg/cpp0x/pr59111.C  2013-12-20 14:49:48.888403724 +0100
@@ -0,0 +1,5 @@
+// PR c++/59111
+// { dg-do compile { target c++11 } }
+
+auto foo();   // { dg-error "type specifier without trailing return type" }
+int i = foo(); // { dg-error "cannot convert" }
--- gcc/testsuite/g++.dg/cpp1y/pr59110.C.mp  2013-12-20 14:51:17.402737711 +0100
+++ gcc/testsuite/g++.dg/cpp1y/pr59110.C  2013-12-20 14:58:08.988234451 +0100
@@ -0,0 +1,4 @@
+// PR c++/59110
+// { dg-options "-std=c++1y" }
+
+int i = *(auto*)0; // { dg-error "cannot convert" }

Marek


Re: [PATCH, libiberty] Remove malloc/realloc from demangler (was: Add a couple of missing casts)

2013-12-20 Thread Ian Lance Taylor
On Fri, Dec 20, 2013 at 5:00 AM, Gary Benson gben...@redhat.com wrote:

 --- a/libiberty/ChangeLog
 +++ b/libiberty/ChangeLog
 @@ -1,3 +1,20 @@
 +2013-12-20  Gary Benson  gben...@redhat.com
 +
 +   * cp-demangle.c (struct d_print_info): New fields
 +   next_saved_scope, copy_templates, next_copy_template and
 +   num_copy_templates.
 +   (d_count_templates): New function.
 +   (d_print_init): New parameter dc.
 +   Estimate numbers of templates and scopes required.
 +   (d_print_free): Removed function.
 +   (cplus_demangle_print_callback): Allocate stack for
 +   templates and scopes.  Removed call to d_print_free.
 +   (d_copy_templates): Removed function.
 +   (d_save_scope): New function.
 +   (d_get_saved_scope): Likewise.
 +   (d_print_comp): Replace state saving/restoring code with
 +   calls to d_save_scope and d_get_saved_scope.

This is OK.

Thanks for following up on this.

Ian


Re: [RFC][gomp4] Offloading patches (3/3): Add invocation of target compiler

2013-12-20 Thread Bernd Schmidt
On 12/17/2013 12:42 PM, Michael V. Zolotukhin wrote:
 Hi everybody,
 
 Here is a patch 3/3: Add invocation of target compiler.

 +  /* Run objcopy on TARGET_IMAGE_FILE_NAME.  */
 +  buf1 = (char*) xmalloc (strlen (".data=.")
 +   + strlen (OFFLOAD_IMAGE_SECTION_NAME) + 1);
 +  if (!buf1)
 +    return NULL;
 +  sprintf (buf1, ".data=%s", OFFLOAD_IMAGE_SECTION_NAME);
 +  obstack_init (&argv_obstack);
 +  obstack_ptr_grow (&argv_obstack, "objcopy");
 +  obstack_ptr_grow (&argv_obstack, "-B");
 +  obstack_ptr_grow (&argv_obstack, "i386");
 +  obstack_ptr_grow (&argv_obstack, "-I");
 +  obstack_ptr_grow (&argv_obstack, "binary");
 +  obstack_ptr_grow (&argv_obstack, "-O");
 +  /* TODO: Properly handle 32-bit mode.  */
 +  obstack_ptr_grow (&argv_obstack, "elf64-x86-64");
 +  obstack_ptr_grow (&argv_obstack, target_image_file_name);
 +  obstack_ptr_grow (&argv_obstack, "--rename-section");
 +  obstack_ptr_grow (&argv_obstack, buf1);
 +  obstack_ptr_grow (&argv_obstack, NULL);
 +
 +  argv = XOBFINISH (&argv_obstack, const char **);

This patch seems to make rather too many assumptions about host and
target compilers. Certainly code like this can't go into
target-independent code like lto-wrapper. Also, I'm not sure you can
assume you'll get ELF files out of the OpenACC target compiler; I'd very
much prefer a solution that doesn't rely on objcopy.


Bernd



Re: Improving mklog [was: Re: RFC Asan instrumentation control]

2013-12-20 Thread Diego Novillo

On 20/12/2013, 07:08 , Yury Gribov wrote:

Ultimately, mklog ought to write the ChangeLog itself.
We get rid of that headache, at least.


How about this then? Updated mklog now adds 'New file'/'New test'/'Remove'
when necessary.


I did some tests with unified/context-diffed SVN and git and it worked as
expected. I can do more testing if necessary.


This is OK. Thanks.


Diego.


[PATCH] Improve i?86/x86_64 prologue_and_epilogue for leaf functions (PR target/59501)

2013-12-20 Thread Jakub Jelinek
Hi!

Honza recently changed the i?86 backend, so that it often doesn't
do -maccumulate-outgoing-args by default on x86_64.
Unfortunately, on some of the testcases included here this regressed
the generated code quite a bit.  As AVX vectors are used, the dynamic
realignment code needs to assume e.g. that some of them will need to be
spilled, and for -mno-accumulate-outgoing-args the code needs to set
need_drap early as well.  But when emitting the prologue/epilogue,
if need_drap is set, we don't perform the optimization for leaf functions
which have a zero size stack frame, thus we end up uselessly doing
dynamic stack realignment, setting up a DRAP that nothing uses and later on
restoring everything back.

This patch improves that: if the DRAP register isn't live at the start of
the entry bb successor and we aren't going to realign the stack, we don't
need DRAP at all.  And even if we do need the DRAP register, that can't be
the sole reason for doing stack realignment, since the prologue code is able
to set up DRAP even without dynamic stack realignment.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-12-20  Jakub Jelinek  ja...@redhat.com

PR target/59501
* config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
if !crtl->stack_realign_needed.
(ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
and stack_realign_needed will be false, clear drap_reg and need_drap.
Optimize leaf functions that don't need stack frame even if
crtl->need_drap.

* gcc.target/i386/pr59501-1.c: New test.
* gcc.target/i386/pr59501-1a.c: New test.
* gcc.target/i386/pr59501-2.c: New test.
* gcc.target/i386/pr59501-2a.c: New test.
* gcc.target/i386/pr59501-3.c: New test.
* gcc.target/i386/pr59501-3a.c: New test.
* gcc.target/i386/pr59501-4.c: New test.
* gcc.target/i386/pr59501-4a.c: New test.
* gcc.target/i386/pr59501-5.c: New test.
* gcc.target/i386/pr59501-6.c: New test.

--- gcc/config/i386/i386.c.jj   2013-12-19 13:35:23.0 +0100
+++ gcc/config/i386/i386.c  2013-12-20 11:44:14.389310804 +0100
@@ -9235,7 +9235,9 @@ ix86_save_reg (unsigned int regno, bool
}
 }
 
-  if (crtl->drap_reg && regno == REGNO (crtl->drap_reg))
+  if (crtl->drap_reg
+      && regno == REGNO (crtl->drap_reg)
+      && crtl->stack_realign_needed)
 return true;
 
   return (df_regs_ever_live_p (regno)
@@ -10473,12 +10475,23 @@ ix86_finalize_stack_realign_flags (void)
   return;
 }
 
+  /* If drap has been set, but it actually isn't live at the start
+     of the function and !stack_realign, there is no reason to set it up.  */
+  if (crtl->drap_reg && !stack_realign)
+    {
+      basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
+      if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
+       {
+         crtl->drap_reg = NULL_RTX;
+         crtl->need_drap = false;
+       }
+    }
+
   /* If the only reason for frame_pointer_needed is that we conservatively
  assumed stack realignment might be needed, but in the end nothing that
  needed the stack alignment had been spilled, clear frame_pointer_needed
  and say we don't need stack realignment.  */
   if (stack_realign
-      && !crtl->need_drap
       && frame_pointer_needed
       && crtl->is_leaf
       && flag_omit_frame_pointer
@@ -10516,6 +10529,18 @@ ix86_finalize_stack_realign_flags (void)
  }
}
 
+  /* If drap has been set, but it actually isn't live at the start
+     of the function, there is no reason to set it up.  */
+  if (crtl->drap_reg)
+   {
+     basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
+     if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
+       {
+         crtl->drap_reg = NULL_RTX;
+         crtl->need_drap = false;
+       }
+   }
+
   frame_pointer_needed = false;
   stack_realign = false;
   crtl->max_used_stack_slot_alignment = incoming_stack_boundary;
--- gcc/testsuite/gcc.target/i386/pr59501-2.c.jj  2013-12-20 12:02:08.754662741 +0100
+++ gcc/testsuite/gcc.target/i386/pr59501-2.c  2013-12-20 12:02:04.665668734 +0100
@@ -0,0 +1,5 @@
+/* PR target/59501 */
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */
+
+#include "pr59501-1.c"
--- gcc/testsuite/gcc.target/i386/pr59501-1.c.jj  2013-12-20 12:01:44.253781613 +0100
+++ gcc/testsuite/gcc.target/i386/pr59501-1.c  2013-12-20 12:12:26.715391613 +0100
@@ -0,0 +1,30 @@
+/* PR target/59501 */
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx -mno-accumulate-outgoing-args" } */
+
+#define CHECK_H "avx-check.h"
+#define TEST avx_test
+
+#include CHECK_H
+
+typedef double V __attribute__ ((vector_size (32)));
+
+__attribute__((noinline, noclone)) V
+foo (double *x, unsigned *y)
+{
+  V r = { x[y[0]], x[y[1]], x[y[2]], x[y[3]] };
+  return r;
+}
+
+static void
+TEST (void)
+{
+  double a[16];
+  

Re: [PATCH] Improve i?86/x86_64 prologue_and_epilogue for leaf functions (PR target/59501)

2013-12-20 Thread H.J. Lu
On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 Honza recently changed the i?86 backend, so that it often doesn't
 do -maccumulate-outgoing-args by default on x86_64.
 Unfortunately, on some of the testcases included here this regressed
 the generated code quite a bit.  As AVX vectors are used, the dynamic
 realignment code needs to assume e.g. that some of them will need to be
 spilled, and for -mno-accumulate-outgoing-args the code needs to set
 need_drap early as well.  But when emitting the prologue/epilogue,
 if need_drap is set, we don't perform the optimization for leaf functions
 which have a zero size stack frame, thus we end up uselessly doing
 dynamic stack realignment, setting up a DRAP that nothing uses and later on
 restoring everything back.

 This patch improves it: if the DRAP register isn't live at the start of
 the entry bb successor and we aren't going to realign the stack, we don't
 need DRAP at all.  And even if we need the DRAP register, that can't be the
 sole reason for doing stack realignment; the prologue code is able to set up
 DRAP even without dynamic stack realignment.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

 2013-12-20  Jakub Jelinek  ja...@redhat.com

 PR target/59501
 * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
 if !crtl->stack_realign_needed.
 (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
 and stack_realign_needed will be false, clear drap_reg and need_drap.
 Optimize leaf functions that don't need stack frame even if
 crtl->need_drap.

 * gcc.target/i386/pr59501-1.c: New test.
 * gcc.target/i386/pr59501-1a.c: New test.
 * gcc.target/i386/pr59501-2.c: New test.
 * gcc.target/i386/pr59501-2a.c: New test.
 * gcc.target/i386/pr59501-3.c: New test.
 * gcc.target/i386/pr59501-3a.c: New test.
 * gcc.target/i386/pr59501-4.c: New test.
 * gcc.target/i386/pr59501-4a.c: New test.
 * gcc.target/i386/pr59501-5.c: New test.
 * gcc.target/i386/pr59501-6.c: New test.

 --- gcc/config/i386/i386.c.jj   2013-12-19 13:35:23.0 +0100
 +++ gcc/config/i386/i386.c  2013-12-20 11:44:14.389310804 +0100
 @@ -9235,7 +9235,9 @@ ix86_save_reg (unsigned int regno, bool
 }
  }

 -  if (crtl->drap_reg && regno == REGNO (crtl->drap_reg))
 +  if (crtl->drap_reg
 +      && regno == REGNO (crtl->drap_reg)
 +      && crtl->stack_realign_needed)
  return true;

return (df_regs_ever_live_p (regno)
 @@ -10473,12 +10475,23 @@ ix86_finalize_stack_realign_flags (void)
return;
  }

 +  /* If drap has been set, but it actually isn't live at the start
 +     of the function and !stack_realign, there is no reason to set it up.  */
 +  if (crtl->drap_reg && !stack_realign)
 +    {
 +      basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
 +      if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
 +        {
 +          crtl->drap_reg = NULL_RTX;
 +          crtl->need_drap = false;
 +        }
 +    }
 +
/* If the only reason for frame_pointer_needed is that we conservatively
   assumed stack realignment might be needed, but in the end nothing that
   needed the stack alignment had been spilled, clear frame_pointer_needed
   and say we don't need stack realignment.  */
    if (stack_realign
 -      && !crtl->need_drap
        && frame_pointer_needed
        && crtl->is_leaf
        && flag_omit_frame_pointer
 @@ -10516,6 +10529,18 @@ ix86_finalize_stack_realign_flags (void)
   }
 }

 +      /* If drap has been set, but it actually isn't live at the start
 +         of the function, there is no reason to set it up.  */
 +      if (crtl->drap_reg)
 +        {
 +          basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
 +          if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
 +            {
 +              crtl->drap_reg = NULL_RTX;
 +              crtl->need_drap = false;
 +            }
 +        }
 +
        frame_pointer_needed = false;
        stack_realign = false;
        crtl->max_used_stack_slot_alignment = incoming_stack_boundary;

Looks good to me.  But I can't approve it.

Thanks.

-- 
H.J.
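
For readers following the thread, the decision the patch adds can be modelled in a few lines of plain C.  This is only a sketch under stated assumptions: struct frame_state and maybe_drop_drap are hypothetical stand-ins, not GCC code, and the liveness query is reduced to a single boolean instead of a real DF_LR_IN lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the few crtl fields the patch
   touches.  None of these names exist in GCC.  */
struct frame_state
{
  bool drap_reg_set;       /* models crtl->drap_reg != NULL_RTX */
  bool need_drap;          /* models crtl->need_drap */
  bool drap_live_at_entry; /* models the REGNO_REG_SET_P (DF_LR_IN ...) test */
  bool stack_realign;      /* models the stack_realign decision */
};

/* Drop the DRAP when nothing reads it at function entry and no stack
   realignment is planned.  Returns true if the DRAP was dropped.  */
static bool
maybe_drop_drap (struct frame_state *fs)
{
  assert (fs != NULL);
  if (fs->drap_reg_set && !fs->stack_realign && !fs->drap_live_at_entry)
    {
      fs->drap_reg_set = false;
      fs->need_drap = false;
      return true;
    }
  return false;
}
```

With a state where the DRAP is set but neither live nor needed for realignment, maybe_drop_drap clears both flags; if the register is live at entry, the state is left alone.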


Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jeff Law

On 12/20/13 01:17, Uros Bizjak wrote:

> Hello!
>
> > index 000..5375b61
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr53623.c
> > @@ -0,0 +1,25 @@
> > +/* { dg-do compile { target { x86_64-*-* } } } */
> > +/* { dg-options "-O2 -fdump-rtl-ree" } */
>
> Please use:
>
> /* { dg-do compile { target { ! ia32 } } } */

Done.

jeff


Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jeff Law

On 12/20/13 01:24, Jakub Jelinek wrote:


> Thanks for working on this, the only thing I'd worry about are
> HARD_REGNO_NREGS > 1 registers if the two hard regs might overlap.

The reg_set_between_p and reg_used_between_p calls, when you dig down
into them, eventually use reg_overlap_mentioned_p, which should do the
right thing in this regard.


I'll audit ree for problems of this nature.



> > +  /* If the extension's source/destination registers are not the same
> > +     then we need to change the original load to reference the destination
> > +     of the extension.  Then we need to emit a copy from that destination
> > +     to the original destination of the load.  */
> > +  rtx new_reg;
> > +  bool copy_needed
> > +    = REGNO (SET_DEST (PATTERN (cand->insn)))
> > +      != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0));
>
> Perhaps the right formatting here would be
>   bool copy_needed
>     = (REGNO (SET_DEST (PATTERN (cand->insn)))
>        != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0)));
> ? ()s for emacs, and aligning != under REGNO.

Agreed.  I was factoring out that test and didn't fix the formatting.

> > +  if (copy_needed)
> > +    new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (PATTERN (cand->insn))));
>
> Too long line.

Ugh.  80 exactly.  There are times I hate our conventions :(

jeff


[4.8, committed] Backport value-prof.c fix (PR c++/59255)

2013-12-20 Thread Jakub Jelinek
Hi!

I've backported this fix from the trunk to 4.8 branch and for the trunk
committed the new testcase.

2013-12-20  Jakub Jelinek  ja...@redhat.com

PR c++/59255
Backported from mainline
2013-08-19  Dehao Chen  de...@google.com

* value-prof.c (gimple_ic): Fix the bug of adding EH edge.

* g++.dg/tree-prof/pr59255.C: New test.

--- gcc/value-prof.c.jj 2013-03-16 08:14:33.0 +0100
+++ gcc/value-prof.c  2013-12-20 17:19:44.821152952 +0100
@@ -1270,8 +1270,7 @@ gimple_ic (gimple icall_stmt, struct cgr
 
   /* Build an EH edge for the direct call if necessary.  */
   lp_nr = lookup_stmt_eh_lp (icall_stmt);
-  if (lp_nr != 0
-      && stmt_could_throw_p (dcall_stmt))
+  if (lp_nr > 0 && stmt_could_throw_p (dcall_stmt))
 {
   edge e_eh, e;
   edge_iterator ei;
--- gcc/testsuite/g++.dg/tree-prof/pr59255.C.jj  2013-12-20 17:17:17.271928149 +0100
+++ gcc/testsuite/g++.dg/tree-prof/pr59255.C  2013-12-20 17:18:48.748446891 +0100
@@ -0,0 +1,29 @@
+// PR c++/59255
+// { dg-options "-O2 -std=c++11" }
+
+struct S
+{
+  __attribute__((noinline, noclone)) ~S () noexcept (true)
+  {
+if (fn)
+  fn (1);
+  }
+  void (*fn) (int);
+};
+
+__attribute__((noinline, noclone)) void
+foo (int x)
+{
+  if (x != 1)
+throw 1;
+}
+
+int
+main ()
+{
+  for (int i = 0; i < 100; i++)
+{
+  S s;
+  s.fn = foo;
+}
+}

Jakub


Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jakub Jelinek
On Fri, Dec 20, 2013 at 09:26:10AM -0700, Jeff Law wrote:
 Thanks for working on this, the only thing I'd worry about are
 HARD_REGNO_NREGS > 1 registers if the two hard regs might overlap.
 The reg_set_between_p and reg_used_between_p calls when you dig down
 into them eventually use reg_overlap_mentioned_p which should do the
 right thing in this regard.
 
 I'll audit ree for problems of this nature.

The two reg_*_between_p functions check only insns in between the two, but
not the insns themselves.  What I meant is whether it could be possible,
e.g. with HARD_REGNO_NREGS == 2 registers, say 1 and 2, that the first insn
loads into the low half of 1, then one insn sign extends 1 into {2,3},
then perhaps {2,3} is used, and finally 1 is zero extended into {1,2} and
that is used later on.  For little endian the original sequence would work,
while after the transformation we would sign extend the memory into {2,3}
and then copy that into {1,2}, thus overwriting the low half of {2,3} with
its high half.  Perhaps the RA will never allow this.

Jakub


Re: [RFC][gomp4] Offloading patches (3/3): Add invocation of target compiler

2013-12-20 Thread Michael V. Zolotukhin
 This patch seems to make rather too many assumptions about host and
 target compilers. Certainly code like this can't go into
 target-independent code like lto-wrapper.
That's true.  The point of this patch was to show what is needed to support
x86-MIC OpenMP offloading, as we currently see it.  We are ready to extend
existing code making it more versatile, but keeping this needed functionality.

 Also, I'm not sure you can
 assume you'll get ELF files out of the OpenACC target compiler; I'd very
 much prefer a solution that doesn't rely on objcopy.
Yep, that's an issue I suspected but wasn't sure of.

The idea was to prepare an image in the following steps:
  1. Compile with target compiler.
  2. Post-process it (in our case, call objcopy and perform partial linking).
  3. Pass it to the host linker as a usual object file with the image and its
size placed in the known symbols with defined names (e.g.
_omp_target-name_image).

I would like to keep step 3 unchanged, while the steps 1 and 2 could be easily
combined into a single one.

In that case we could say that output of a target compiler should be an object
file for host linker with several symbols defined: _omp_target-name_image,
_omp_target-name_size, etc.

For x86-MIC one would use gcc+objcopy for this, and for OpenACC offloading one
could use gcc plus other target-specific utilities (if any are needed).

Will this scheme work for OpenACC?

Michael
 Bernd


[PATCH][x86] march aliases

2013-12-20 Thread Ilya Tocar
  Perhaps we should add sandybridge, ivybridge and haswell aliases for
  corei7-avx, core-avx-i, core-avx2?  I mean, it is a nightmare to remember
  which one has the i7 in and which doesn't even for me.
 
 Yes please, I think this is a good idea.

I've added aliases for haswell, sandybridge, ivybridge, bonnell,
nehalem and silvermont.

BTW, I wonder whether, if we add a bunch of new names to the table, it isn't
the right time to also introduce macros for some common PTA_* flag combinations,

IMO the full list of PTA_* flags helps to quickly identify what is supported.
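
For comparison, the macro idea raised above could look roughly like the following.  This is a sketch, not what the patch does: the bit values, the PTA_CORE2_SET and PTA_NEHALEM_SET macros, struct alias and lookup_flags are all hypothetical names made up for illustration; only the flag lists for core2 and nehalem/corei7 are taken from the table in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values; the real PTA_* constants live in i386.c.  */
#define PTA_64BIT   (1UL << 0)
#define PTA_MMX     (1UL << 1)
#define PTA_SSE     (1UL << 2)
#define PTA_SSE2    (1UL << 3)
#define PTA_SSE3    (1UL << 4)
#define PTA_SSSE3   (1UL << 5)
#define PTA_CX16    (1UL << 6)
#define PTA_FXSR    (1UL << 7)
#define PTA_SSE4_1  (1UL << 8)
#define PTA_SSE4_2  (1UL << 9)
#define PTA_POPCNT  (1UL << 10)

/* Hypothetical combination macros, each building on the previous one.  */
#define PTA_CORE2_SET \
  (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
   | PTA_CX16 | PTA_FXSR)
#define PTA_NEHALEM_SET \
  (PTA_CORE2_SET | PTA_SSE4_1 | PTA_SSE4_2 | PTA_POPCNT)

struct alias
{
  const char *name;
  unsigned long flags;
};

/* Two names sharing one macro are guaranteed to stay in sync.  */
static const struct alias aliases[] = {
  { "core2", PTA_CORE2_SET },
  { "nehalem", PTA_NEHALEM_SET },
  { "corei7", PTA_NEHALEM_SET },
};

/* Return the flag set for NAME, or 0 if NAME is unknown.  */
static unsigned long
lookup_flags (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof aliases / sizeof aliases[0]; i++)
    if (strcmp (aliases[i].name, name) == 0)
      return aliases[i].flags;
  return 0;
}
```

The trade-off is the one stated in the thread: macros keep aliases consistent, while the spelled-out list shows at a glance what each entry supports.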

2013-12-20  Tocar Ilya  ilya.to...@intel.com 

* config/i386/i386.c (ix86_option_override_internal): Add
haswell, ivybridge, sandybridge, nehalem, bonnell, silvermont.
* doc/invoke.texi: Document them.
---
 gcc/config/i386/i386.c | 27 +++
 gcc/doc/invoke.texi| 32 
 2 files changed, 59 insertions(+)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1710e8c..fcf2afe 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3111,9 +3111,17 @@ ix86_option_override_internal (bool main_args_p,
   {"core2", PROCESSOR_CORE2, CPU_CORE2,
    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
    | PTA_SSSE3 | PTA_CX16 | PTA_FXSR},
+  {"nehalem", PROCESSOR_COREI7, CPU_COREI7,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
+   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
   {"corei7", PROCESSOR_COREI7, CPU_COREI7,
    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
    | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
+  {"sandybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
+   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL
+   | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
   {"corei7-avx", PROCESSOR_COREI7_AVX, CPU_COREI7,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
@@ -3124,6 +3132,11 @@ ix86_option_override_internal (bool main_args_p,
| PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
| PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
| PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
+  {"ivybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
+   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
+   | PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
   {"core-avx2", PROCESSOR_HASWELL, CPU_COREI7,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
@@ -3131,6 +3144,13 @@ ix86_option_override_internal (bool main_args_p,
| PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
| PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
| PTA_XSAVEOPT},
+  {"haswell", PROCESSOR_HASWELL, CPU_COREI7,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
+   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
+   | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
+   | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
+   | PTA_XSAVEOPT},
   {"broadwell", PROCESSOR_HASWELL, CPU_COREI7,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
@@ -3138,9 +3158,16 @@ ix86_option_override_internal (bool main_args_p,
| PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
| PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
| PTA_XSAVEOPT | PTA_ADX | PTA_PRFCHW | PTA_RDSEED},
+  {"bonnell", PROCESSOR_ATOM, CPU_ATOM,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+   | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
   {"atom", PROCESSOR_ATOM, CPU_ATOM,
    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
    | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
+  {"silvermont", PROCESSOR_SLM, CPU_SLM,
+   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
+   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
+   | PTA_PCLMUL | PTA_RDRND | PTA_MOVBE | PTA_FXSR},
   {"slm", PROCESSOR_SLM, CPU_SLM,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
| PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dcc1893..365ddbf 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14645,19 +14645,41 @@ SSE2 and SSE3 instruction set support.
 Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3
 instruction set 

Re: [PATCH][x86] march aliases

2013-12-20 Thread H.J. Lu
On Fri, Dec 20, 2013 at 8:47 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Perhaps we should add sandybridge, ivybridge and haswell aliases for
  corei7-avx, core-avx-i, core-avx2?  I mean, it is a nightmare to remember
  which one has the i7 in and which doesn't even for me.

 Yes please, I think this is a good idea.

 I've added aliases for haswell, sandybridge, ivybridge, bonnell,
 nehalem and silvermont.

BTW, I wonder if we add a bunch of new names to the table it isn't a right
time to also introduce macros for some common PTA_* flag combinations,

 IMO full list of PTA_* helps quickly identify what is supported.

 2013-12-20  Tocar Ilya  ilya.to...@intel.com

 * config/i386/i386.c (ix86_option_override_internal): Add
 haswell, ivybridge, sandybridge, nehalem, bonnell, silvermont.
 * doc/invoke.texi: Document them.
 ---
  gcc/config/i386/i386.c | 27 +++
  gcc/doc/invoke.texi| 32 
  2 files changed, 59 insertions(+)

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 1710e8c..fcf2afe 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -3111,9 +3111,17 @@ ix86_option_override_internal (bool main_args_p,
    {"core2", PROCESSOR_CORE2, CPU_CORE2,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
     | PTA_SSSE3 | PTA_CX16 | PTA_FXSR},
 +  {"nehalem", PROCESSOR_COREI7, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 +   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
    {"corei7", PROCESSOR_COREI7, CPU_COREI7,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
     | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
 +  {"sandybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL
 +   | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
    {"corei7-avx", PROCESSOR_COREI7_AVX, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 @@ -3124,6 +3132,11 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 | PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
 +  {"ivybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 +   | PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
    {"core-avx2", PROCESSOR_HASWELL, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 @@ -3131,6 +3144,13 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 | PTA_XSAVEOPT},
 +  {"haswell", PROCESSOR_HASWELL, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 +   | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 +   | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 +   | PTA_XSAVEOPT},
    {"broadwell", PROCESSOR_HASWELL, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 @@ -3138,9 +3158,16 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 | PTA_XSAVEOPT | PTA_ADX | PTA_PRFCHW | PTA_RDSEED},
 +  {"bonnell", PROCESSOR_ATOM, CPU_ATOM,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
    {"atom", PROCESSOR_ATOM, CPU_ATOM,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
     | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
 +  {"silvermont", PROCESSOR_SLM, CPU_SLM,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 +   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
 +   | PTA_PCLMUL | PTA_RDRND | PTA_MOVBE | PTA_FXSR},
    {"slm", PROCESSOR_SLM, CPU_SLM,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index dcc1893..365ddbf 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jeff Law

On 12/20/13 09:45, Jakub Jelinek wrote:

> On Fri, Dec 20, 2013 at 09:26:10AM -0700, Jeff Law wrote:
> > > Thanks for working on this, the only thing I'd worry about are
> > > HARD_REGNO_NREGS > 1 registers if the two hard regs might overlap.
> >
> > The reg_set_between_p and reg_used_between_p calls when you dig down
> > into them eventually use reg_overlap_mentioned_p which should do the
> > right thing in this regard.
> >
> > I'll audit ree for problems of this nature.
>
> The two reg_*_between_p functions check only insns in between the two, but
> not the insns themselves.  What I meant is if it e.g. couldn't be possible
> to have HARD_REGNO_NREGS == 2 registers say 1 and 2, where the first insn
> would load into the low half of 1, then one insn that say sign extends
> 1 into {2,3}, then perhaps {2,3} is used and finally 1 is zero extended
> into {1,2} and that is used later on.  For little endian this would work,
> while after the transformation which sign extends the memory into {2,3}
> and then copies that into {1,2} (thus overwriting content of low half of
> {2,3} with the high half of it).  Perhaps the RA will never allow this.

ISTM if we're presented with something like that (and I don't think
there's anything in RA which explicitly disallows such code), then what
we have to evaluate is whether or not the transformation preserves the
semantics.


So, incoming would look like this (assuming a 32 bit target):


r1:SI = mem:SI
r2:DI = sext:DI (r1:SI)
[ Use r2/r3 ]
r1:DI = zext:DI (r1:SI)

And that would be transformed into:

r2:DI = sext:DI (mem:SI)
r1:DI = r2:DI
[ Use r2/r3 ]
r1:DI = zext:DI (r1:SI)

Where r2 will have the wrong value in the use statements.  ISTM we can 
check for an overlap between the destination of the memory load and the 
destination of the first extension.  Right?


Is that the case you're worrying about?

jeff
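
The clobber in the transformed sequence can be reproduced with a toy register file.  This is only a sketch under stated assumptions: four 32-bit hard registers, a DI value in register N occupying N (low word) and N+1 (high word) as on a little-endian target, and a word-by-word copy.  hard_reg, set_di, get_di and the two use_r2_* helpers are hypothetical names, not GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file of four 32-bit hard registers.  */
static uint32_t hard_reg[4];

/* Store a 64-bit value into the register pair {regno, regno + 1}.  */
static void
set_di (int regno, uint64_t value)
{
  assert (regno >= 0 && regno + 1 < 4);
  hard_reg[regno] = (uint32_t) value;
  hard_reg[regno + 1] = (uint32_t) (value >> 32);
}

/* Read the 64-bit value held in the pair {regno, regno + 1}.  */
static uint64_t
get_di (int regno)
{
  assert (regno >= 0 && regno + 1 < 4);
  return ((uint64_t) hard_reg[regno + 1] << 32) | hard_reg[regno];
}

/* Original sequence: value of r2:DI as seen by the use.  */
static uint64_t
use_r2_original (int32_t mem)
{
  hard_reg[1] = (uint32_t) mem;          /* r1:SI = mem:SI */
  set_di (2, (uint64_t) (int64_t) mem);  /* r2:DI = sext:DI (r1:SI) */
  return get_di (2);                     /* [ Use r2/r3 ] */
}

/* Transformed sequence: the copy into {1,2} overwrites register 2.  */
static uint64_t
use_r2_transformed (int32_t mem)
{
  set_di (2, (uint64_t) (int64_t) mem);  /* r2:DI = sext:DI (mem:SI) */
  hard_reg[1] = hard_reg[2];             /* r1:DI = r2:DI, low word */
  hard_reg[2] = hard_reg[3];             /* r1:DI = r2:DI, high word */
  return get_di (2);                     /* [ Use r2/r3 ] */
}
```

For an input of 5 the original sequence reads 5 from r2, while the transformed one reads 0, because register 2 now holds the high word of the extended value.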


[RFC][gomp4] Offloading: Add device initialization and host-target function mapping

2013-12-20 Thread Ilya Verbin
Hi Jakub,

Could you please take a look at this patch for libgomp?

It adds a new function, GOMP_register_lib, which should be called from every
exec/lib with target regions (that was done in patch [1]).  This function
maintains the array of pointers to the target shared library descriptors.

The patch also adds target device initialization to GOMP_target and
GOMP_target_data.  First, it calls the device_init function from the plugin.
This function takes an array of target images as input and returns an array of
target-side addresses.  Currently it always uses the first target image from
the descriptor; this should be generalized later.  Then libgomp reads the tables
from the host-side exec/libs.  After that, it inserts the host-target address
mapping into the splay tree.

[1] http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html

Thanks,
-- Ilya
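
The registration step described above amounts to a growable array of descriptor pointers.  A minimal sketch of that idea follows; it is a variant, not the patch's code, in that it keeps the old array intact when realloc fails.  The names registered_libs, num_registered_libs and register_lib are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical registry; only the realloc-based growth mirrors the patch.  */
static const void **registered_libs;
static int num_registered_libs;

/* Append DESCR to the registry.  Returns false, leaving the registry
   unchanged, if memory cannot be allocated.  */
static bool
register_lib (const void *descr)
{
  assert (descr != NULL);
  /* Grow through a temporary so a failed realloc does not lose the
     existing array.  */
  const void **grown
    = realloc (registered_libs, (num_registered_libs + 1) * sizeof *grown);
  if (grown == NULL)
    return false;
  grown[num_registered_libs++] = descr;
  registered_libs = grown;
  return true;
}
```

A real implementation would also need to consider concurrent registration from several shared libraries, which this sketch ignores.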

---
 libgomp/libgomp.map |1 +
 libgomp/target.c|  154 ---
 2 files changed, 146 insertions(+), 9 deletions(-)

diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2b64d05..792047f 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -208,6 +208,7 @@ GOMP_3.0 {
 
 GOMP_4.0 {
   global:
+   GOMP_register_lib;
GOMP_barrier_cancel;
GOMP_cancel;
GOMP_cancellation_point;
diff --git a/libgomp/target.c b/libgomp/target.c
index d84a1fa..a37819a 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -84,6 +84,19 @@ struct splay_tree_key_s {
   bool copy_from;
 };
 
+enum library_descr {
+  DESCR_TABLE_START,
+  DESCR_TABLE_END,
+  DESCR_IMAGE_START,
+  DESCR_IMAGE_END
+};
+
+/* Array of pointers to target shared library descriptors.  */
+static void **libraries;
+
+/* Total number of target shared libraries.  */
+static int num_libraries;
+
 /* Array of descriptors of all available devices.  */
 static struct gomp_device_descr *devices;
 
@@ -117,11 +130,16 @@ struct gomp_device_descr
  TARGET construct.  */
   int id;
 
+  /* Set to true when device is initialized.  */
+  bool is_initialized;
+
   /* Plugin file handler.  */
   void *plugin_handle;
 
   /* Function handlers.  */
-  bool (*device_available_func) (void);
+  bool (*device_available_func) (int);
+  void (*device_init_func) (void **, int *, int, void ***, int *);
+  void (*device_run_func) (void *, uintptr_t);
 
   /* Splay tree containing information about mapped memory regions.  */
   struct splay_tree_s dev_splay_tree;
@@ -466,6 +484,89 @@ gomp_update (struct gomp_device_descr *devicep, size_t mapnum,
   gomp_mutex_unlock (&devicep->dev_env_lock);
 }
 
+void
+GOMP_register_lib (const void *openmp_target)
+{
+  libraries = realloc (libraries, (num_libraries + 1) * sizeof (void *));
+
+  if (libraries == NULL)
+return;
+
+  libraries[num_libraries] = (void *) openmp_target;
+
+  num_libraries++;
+}
+
+static void
+gomp_init_device (struct gomp_device_descr *devicep)
+{
+  void **target_images = malloc (num_libraries * sizeof (void *));
+  int *target_img_sizes = malloc (num_libraries * sizeof (int));
+  if (target_images == NULL || target_img_sizes == NULL)
+    gomp_fatal ("Can not allocate memory");
+
+  /* Collect target images from the library descriptors and calculate the total
+     size of host address table.  */
+  int i, host_table_size = 0;
+  for (i = 0; i < num_libraries; i++)
+{
+  void **lib = libraries[i];
+  void **host_table_start = lib[DESCR_TABLE_START];
+  void **host_table_end = lib[DESCR_TABLE_END];
+  /* FIXME: Select the proper target image.  */
+  target_images[i] = lib[DESCR_IMAGE_START];
+  target_img_sizes[i] = lib[DESCR_IMAGE_END] - lib[DESCR_IMAGE_START];
+  host_table_size += host_table_end - host_table_start;
+}
+
+  /* Initialize the target device and receive the address table from target.  */
+  void **target_table = NULL;
+  int target_table_size = 0;
+  devicep->device_init_func (target_images, target_img_sizes, num_libraries,
+                             &target_table, &target_table_size);
+  free (target_images);
+  free (target_img_sizes);
+
+  if (host_table_size != target_table_size)
+    gomp_fatal ("Can't map target objects");
+
+  /* Initialize the mapping data structure.  */
+  void **target_entry = target_table;
+  for (i = 0; i < num_libraries; i++)
+    {
+      void **lib = libraries[i];
+      void **host_table_start = lib[DESCR_TABLE_START];
+      void **host_table_end = lib[DESCR_TABLE_END];
+      void **host_entry;
+      for (host_entry = host_table_start; host_entry < host_table_end;
+           host_entry += 2, target_entry += 2)
+        {
+          struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt));
+          tgt->refcount = 1;
+          tgt->array = gomp_malloc (sizeof (*tgt->array));
+          tgt->tgt_start = (uintptr_t) *target_entry;
+          tgt->tgt_end = tgt->tgt_start + (uint64_t) *(target_entry+1);
+          tgt->to_free = NULL;
+          tgt->list_count = 0;
+          tgt->device_descr = devicep;
+ 

Re: RFA (cgraph): C++ 'structor decloning patch, Mark III

2013-12-20 Thread Jason Merrill

On 12/13/2013 10:32 AM, Jan Hubicka wrote:

> > On 12/13/2013 05:58 AM, Jan Hubicka wrote:
> > > Moreover when we turn comdat_local to false, we also need to recompute
> > > the function it is inlined into.
> >
> > I don't see why.  If function A calls function B, which calls
> > comdat-local function C, A can be inlined, so why do we need to
> > recompute anything for A after we inline C into B?
>
> The situation is where B is inlined into A and it calls comdat
> local C.  In this case both B and A have the flag set.  Now
> we inline C into B and we need to clean up the flag for A, too.
>
> This can probably happen when you call the wrapper ctor from the
> ctor itself?


I've added a testcase for that situation.  Recursive inlining ended up 
with an un-inlined call to the worker function from each of the 
wrappers; removing the special permission to inline a call to a 
calls_comdat_local function from the same comdat allows us to eliminate 
the worker function entirely, and avoids the need to clean up A.


Here's an updated patch that also addresses Martin's ipa-cp comment.

Jason

commit be1b04c77a420288e29c48c07e68c3ec87dd5b24
Author: Jason Merrill ja...@redhat.com
Date:   Thu Jan 12 14:04:42 2012 -0500

	PR c++/41090
	Add -fdeclone-ctor-dtor.
gcc/cp/
	* optimize.c (can_alias_cdtor, populate_clone_array): Split out
	from maybe_clone_body.
	(maybe_thunk_body): New function.
	(maybe_clone_body): Call it.
	* mangle.c (write_mangled_name): Remove code to suppress
	writing of mangled name for cloned constructor or destructor.
	(write_special_name_constructor): Handle decloned constructor.
	(write_special_name_destructor): Handle decloned destructor.
	* method.c (trivial_fn_p): Handle decloning.
	* semantics.c (expand_or_defer_fn_1): Clone after setting linkage.
gcc/c-family/
	* c.opt: Add -fdeclone-ctor-dtor.
	* c-opts.c (c_common_post_options): Default to on iff -Os.
gcc/
	* cgraph.h (struct cgraph_node): Add calls_comdat_local.
	(symtab_comdat_local_p, symtab_in_same_comdat_p): New.
	* cif-code.def: Add USES_COMDAT_LOCAL.
	* symtab.c (verify_symtab_base): Make sure we don't refer to a
	comdat-local symbol from outside its comdat.
	* cgraph.c (verify_cgraph_node): Likewise.
	* cgraphunit.c (mark_functions_to_output): Don't mark comdat-locals.
	* ipa.c (symtab_remove_unreachable_nodes): Likewise.
	(function_and_variable_visibility): Handle comdat-local fns.
	* ipa-cp.c (determine_versionability): Don't clone comdat-locals.
	* ipa-inline-analysis.c (compute_inline_parameters): Update
	calls_comdat_local.
	* ipa-inline-transform.c (inline_call): Likewise.
	(save_inline_function_body): Don't clear DECL_COMDAT_GROUP.
	* ipa-inline.c (can_inline_edge_p): Check calls_comdat_local.
	* lto-cgraph.c (input_overwrite_node): Read calls_comdat_local.
	(lto_output_node): Write it.
	* symtab.c (symtab_dissolve_same_comdat_group_list): Clear
	DECL_COMDAT_GROUP for comdat-locals.
include/
	* demangle.h (enum gnu_v3_ctor_kinds):
	Added literal gnu_v3_unified_ctor.
	(enum gnu_v3_ctor_kinds):
	Added literal gnu_v3_unified_dtor.
libiberty/
	* cp-demangle.c (cplus_demangle_fill_ctor,cplus_demangle_fill_dtor):
	Handle unified ctor/dtor.
	(d_ctor_dtor_name): Handle unified ctor/dtor.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index f368cab..3576f7d 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -899,6 +899,10 @@ c_common_post_options (const char **pfilename)
   if (warn_implicit_function_declaration == -1)
 warn_implicit_function_declaration = flag_isoc99;
 
+  /* Declone C++ 'structors if -Os.  */
+  if (flag_declone_ctor_dtor == -1)
+flag_declone_ctor_dtor = optimize_size;
+
   if (cxx_dialect >= cxx11)
 {
   /* If we're allowing C++0x constructs, don't warn about C++98
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index bfca1e0..d270f77 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -890,6 +890,10 @@ fdeduce-init-list
 C++ ObjC++ Var(flag_deduce_init_list) Init(0)
 -fdeduce-init-list	enable deduction of std::initializer_list for a template type parameter from a brace-enclosed initializer-list
 
+fdeclone-ctor-dtor
+C++ ObjC++ Var(flag_declone_ctor_dtor) Init(-1)
+Factor complex constructors and destructors to favor space over speed
+
 fdefault-inline
 C++ ObjC++ Ignore
 Does nothing.  Preserved for backward compatibility.
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 9501afa..ccd150c 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2666,10 +2666,18 @@ verify_cgraph_node (struct cgraph_node *node)
 	  error_found = true;
 	}
 }
+  bool check_comdat = symtab_comdat_local_p (node);
   for (e = node->callers; e; e = e->next_caller)
     {
       if (verify_edge_count_and_frequency (e))
 	error_found = true;
+      if (check_comdat
+	  && !symtab_in_same_comdat_p (e->caller, node))
+	{
+	  error (comdat-local function called by 

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jakub Jelinek
On Fri, Dec 20, 2013 at 10:17:10AM -0700, Jeff Law wrote:
 ISTM if we're presented with something like that (and I don't think
 there's anything in RA which explicitly disallows such code), then
 what we have to evaluate is whether or not the transformation
 preserves the semantics.
 
 So, incoming would look like this (assuming a 32 bit target):
 
 
 r1:SI = mem:SI
 r2:DI = sext:DI (r1:SI)
 [ Use r2/r3 ]
 r1:DI = zext:DI (r1:SI)
 
 And that would be transformed into:
 
 r2:DI = sext:DI (mem:SI)
 r1:DI = r2:DI
 [ Use r2/r3 ]
 r1:DI = zext:DI (r1:SI)
 
 Where r2 will have the wrong value in the use statements.  ISTM we
 can check for an overlap between the destination of the memory load
 and the destination of the first extension.  Right?
 
 Is that the case you're worrying about?

Yes.  So my suggestion actually was not correct for that:
   !reg_overlap_mentioned_p (dest, XEXP (src, 0))
because the first extension above has r1:SI and r2:DI which don't
overlap, only r1:DI and r2:DI overlap.  So it probably should be checked
in combine_reaching_defs instead where you have already both the registers
in the right modes available and can call reg_overlap_mentioned_p on them
directly.  One argument would be SET_DEST (def_insn) and one SET_DEST
(cand->insn), right?

Jakub
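
To make the overlap case above concrete, here is a toy model of multi-register hard regs on a 32-bit target. It is only a sketch and not GCC code: `struct hard_reg_ref`, `make_ref` and `refs_overlap` are made-up names, and the model assumes each hard register holds 4 bytes and that a wider value occupies consecutive register numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hard register reference: the first register number
   and how many consecutive hard registers the value occupies.  These are
   illustrative names, not GCC data structures.  */
struct hard_reg_ref
{
  unsigned regno;
  unsigned nregs;
};

/* Number of 4-byte hard registers needed for a SIZE_BYTES value
   (assumption: a 32-bit target where each register holds 4 bytes).  */
static unsigned
nregs_for_size (unsigned size_bytes)
{
  assert (size_bytes > 0);
  return (size_bytes + 3) / 4;
}

/* Build a reference to register REGNO viewed at SIZE_BYTES bytes.  */
static struct hard_reg_ref
make_ref (unsigned regno, unsigned size_bytes)
{
  struct hard_reg_ref r = { regno, nregs_for_size (size_bytes) };
  return r;
}

/* True if the half-open ranges [regno, regno + nregs) intersect.  */
static bool
refs_overlap (struct hard_reg_ref a, struct hard_reg_ref b)
{
  return a.regno < b.regno + b.nregs && b.regno < a.regno + a.nregs;
}
```

In this model `refs_overlap (make_ref (1, 4), make_ref (2, 8))` is false (r1:SI against r2:DI), while `refs_overlap (make_ref (1, 8), make_ref (2, 8))` is true (r1:DI against r2:DI), which is why the test has to be made in the widened mode.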


Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jeff Law

On 12/20/13 10:25, Jakub Jelinek wrote:
  So it probably should be checked

in combine_reaching_defs instead where you have already both the registers
in the right modes available and can call reg_overlap_mentioned_p on them
directly.  One argument would be SET_DEST (def_insn) and one SET_DEST
(cand->insn), right?
Certainly my preference is to have all the tests for this exception live 
in combine_reaching_defs.


We actually need to test in the widened mode, so we have to generate a 
new reg expressing SET_DEST (def_insn) in the widened mode.  The other 
argument for the reg_overlap_mentioned_p call is SET_DEST (cand->insn).


I find myself wondering if we should be using the widened mode register 
for the calls to reg_{used,set}_between_p calls.  I'll have to ponder 
that for a few minutes.


jeff




[RFA][PATCH][PR middle-end/59285] BARRIERS and merged blocks

2013-12-20 Thread Jeff Law


So here's an alternate approach to fixing 59285.  I still think 
attacking this in rtl_merge_blocks is better, but with nobody else 
chiming in to break the deadlock Steven and myself are in, I'll go with 
Steven's preferred solution (fix the callers in ifcvt.c).


If we were to return to a "fix rtl_merge_blocks" approach, I would 
revamp that patch to utilize the ideas in this one.  Namely that it's 
not just barriers between the merged blocks that are a problem.  In 
fact, that's a symptom of the problem.  Things have already gone wrong 
by that point.


Given blocks A & B that will be merged.  If A has > 1 successor and B 
has no successors, the combined block will always have at least 1 
successor.  However, the combined block will be followed by a BARRIER 
that must be removed.
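
The rule above can be sketched on a toy insn list. This is not the ifcvt.c code, only a model of the same check: `struct toy_insn`, `enum toy_kind` and `remove_trailing_barrier` are hypothetical names, and "deleting" the barrier is modelled by setting a flag rather than unlinking it from the stream.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy insn kinds: a real insn, an ordinary note, a basic-block note
   (which starts the next block), and a barrier.  Hypothetical names.  */
enum toy_kind { TOY_INSN, TOY_NOTE, TOY_BLOCK_NOTE, TOY_BARRIER };

struct toy_insn
{
  enum toy_kind kind;
  struct toy_insn *next;
  bool deleted;
};

/* LAST is the final insn of block B, which is about to be merged into
   the combined block.  If B has no successors and the combined block has
   more than one, skip any ordinary notes after LAST and mark a following
   barrier as deleted.  Returns true if a barrier was removed.  */
static bool
remove_trailing_barrier (struct toy_insn *last, unsigned b_succs,
                         unsigned combo_succs)
{
  if (b_succs != 0 || combo_succs <= 1)
    return false;

  struct toy_insn *end = last->next;
  while (end && end->kind == TOY_NOTE)
    end = end->next;

  if (end && end->kind == TOY_BARRIER)
    {
      end->deleted = true;
      return true;
    }
  return false;
}
```

The scan stops at a basic-block note, so a barrier belonging to some later block is never touched.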



Bootstrapped and regression tested on arm-unknown-linux-gnu.  OK for the 
trunk?



* ifcvt.c (merge_if_block): If we are merging a block with more than
one successor with a block with no successors, remove any BARRIER
after the second block.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index ac0276c..8a4e01b 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3152,6 +3152,20 @@ merge_if_block (struct ce_if_block * ce_info)
 
   if (then_bb)
 {
+  /* If THEN_BB has no successors, then there's a BARRIER after it.
+If COMBO_BB has more than one successor (THEN_BB), then that BARRIER
+is no longer needed, and in fact it is incorrect to leave it in
+the insn stream.  */
+  if (EDGE_COUNT (then_bb->succs) == 0
+  && EDGE_COUNT (combo_bb->succs) > 1)
+   {
+ rtx end = NEXT_INSN (BB_END (then_bb));
+ while (end && NOTE_P (end) && !NOTE_INSN_BASIC_BLOCK_P (end))
+   end = NEXT_INSN (end);
+
+ if (end && BARRIER_P (end))
+   delete_insn (end);
+   }
   merge_blocks (combo_bb, then_bb);
   num_true_changes++;
 }
@@ -3161,6 +3175,20 @@ merge_if_block (struct ce_if_block * ce_info)
  get their addresses taken.  */
   if (else_bb)
 {
+  /* If ELSE_BB has no successors, then there's a BARRIER after it.
+If COMBO_BB has more than one successor (ELSE_BB), then that BARRIER
+is no longer needed, and in fact it is incorrect to leave it in
+the insn stream.  */
+  if (EDGE_COUNT (else_bb->succs) == 0
+  && EDGE_COUNT (combo_bb->succs) > 1)
+   {
+ rtx end = NEXT_INSN (BB_END (else_bb));
+ while (end && NOTE_P (end) && !NOTE_INSN_BASIC_BLOCK_P (end))
+   end = NEXT_INSN (end);
+
+ if (end && BARRIER_P (end))
+   delete_insn (end);
+   }
   merge_blocks (combo_bb, else_bb);
   num_true_changes++;
 }


[ARM] add armv7ve support

2013-12-20 Thread Renlin Li

Hi all,

This patch adds armv7ve support to GCC. Armv7ve is basically the 
armv7-a architecture profile with the Virtualization Extensions. Additional 
test cases are also added.


With this patch, and to keep backward compatibility with old assemblers, 
the following asm header is generated when the -march=armv7ve option is 
given.

.arch armv7-a
.arch_extension virt
.arch_extension idiv
.arch_extension sec
.arch_extension mp

This is an amendment to a previous patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02365.html
No new __ARM_ARCH_7VE__ is defined. Instead, __ARM_ARCH_7A__ is defined 
with additional extensions (e.g. __ARM_ARCH_EXT_IDIV__) when arch is set 
to armv7ve.
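
As a rough illustration of the header selection described above, here is a standalone sketch. It is not the arm.c code: `format_arch_header` is a hypothetical helper that writes into a caller-supplied buffer instead of the assembler output file, and it compares the whole architecture name rather than a prefix.

```c
#include <stdio.h>
#include <string.h>

/* Write the .arch directives for ARCH_NAME into BUF (SIZE bytes).
   For armv7ve, fall back to armv7-a plus explicit extensions so that
   assemblers that do not know armv7ve still accept the output.
   Returns what snprintf returns.  Hypothetical helper, not GCC code.  */
static int
format_arch_header (char *buf, size_t size, const char *arch_name)
{
  if (strcmp (arch_name, "armv7ve") == 0)
    return snprintf (buf, size,
                     "\t.arch armv7-a\n"
                     "\t.arch_extension virt\n"
                     "\t.arch_extension idiv\n"
                     "\t.arch_extension sec\n"
                     "\t.arch_extension mp\n");

  return snprintf (buf, size, "\t.arch %s\n", arch_name);
}
```

Calling it with "armv7ve" produces the five-line header listed above; any other name produces a single `.arch` line.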



Okay for trunk?

Regards,
Renlin Li


gcc/ChangeLog:

2013-12-20  Renlin Li  renlin...@arm.com

* config.gcc:  Add armv7ve for --with-arch option.
* config/arm/arm-arches.def (ARM_ARCH): Add armv7ve arch.
* config/arm/arm.c (FL_FOR_ARCH7VE):  New.
(arm_file_start): Generate correct asm header for armv7ve.
* config/arm/bpabi.h:  Add multilib support for armv7ve.
* config/arm/driver-arm.c: Change the architectures of cortex-a7
and cortex-a15 to armv7ve.
* config/arm/t-aprofile: Add multilib support for armv7ve.
* doc/invoke.texi:  Document -march=armv7ve.

gcc/testsuite/ChangeLog:

2013-12-20  Renlin Li  renlin...@arm.com

* gcc.target/arm/ftest-armv7ve-arm.c: New.
* gcc.target/arm/ftest-armv7ve-thumb.c: New.
* lib/target-supports.exp: New armfunc, armflag and armdef for armv7ve.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8464d8f..34ae9c6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3459,7 +3459,7 @@ case ${target} in
 		 \
 		| armv[23456] | armv2a | armv3m | armv4t | armv5t \
 		| armv5te | armv6j |armv6k | armv6z | armv6zk | armv6-m \
-		| armv7 | armv7-a | armv7-r | armv7-m | armv8-a \
+		| armv7 | armv7-a | armv7ve | armv7-r | armv7-m | armv8-a \
 		| iwmmxt | ep9312)
 			# OK
 			;;
diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index fcf3401..c66bf8d 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -50,6 +50,7 @@ ARM_ARCH(armv6-m, cortexm1,	6M,			  FL_FOR_ARCH6M)
 ARM_ARCH(armv6s-m, cortexm1,	6M,			  FL_FOR_ARCH6M)
 ARM_ARCH(armv7,   cortexa8,	7,   FL_CO_PROC |	  FL_FOR_ARCH7)
 ARM_ARCH(armv7-a, cortexa8,	7A,  FL_CO_PROC |	  FL_FOR_ARCH7A)
+ARM_ARCH(armv7ve, cortexa8,	7A,  FL_CO_PROC |	  FL_FOR_ARCH7VE)
 ARM_ARCH(armv7-r, cortexr4,	7R,  FL_CO_PROC |	  FL_FOR_ARCH7R)
 ARM_ARCH(armv7-m, cortexm3,	7M,  FL_CO_PROC |	  FL_FOR_ARCH7M)
 ARM_ARCH(armv7e-m, cortexm4,  7EM, FL_CO_PROC |	  FL_FOR_ARCH7EM)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8fea2a6..1624a03 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -763,11 +763,11 @@ static int thumb_call_reg_needed;
 #define FL_FOR_ARCH6M	(FL_FOR_ARCH6 & ~FL_NOTM)
 #define FL_FOR_ARCH7	((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
 #define FL_FOR_ARCH7A	(FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
+#define FL_FOR_ARCH7VE	(FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV)
 #define FL_FOR_ARCH7R	(FL_FOR_ARCH7A | FL_THUMB_DIV)
 #define FL_FOR_ARCH7M	(FL_FOR_ARCH7 | FL_THUMB_DIV)
 #define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
-#define FL_FOR_ARCH8A	(FL_FOR_ARCH7 | FL_ARCH6K | FL_ARCH8 | FL_THUMB_DIV \
-			 | FL_ARM_DIV | FL_NOTM)
+#define FL_FOR_ARCH8A	(FL_FOR_ARCH7VE | FL_ARCH8)
 
 /* The bits in this mask specify which
instructions we are allowed to generate.  */
@@ -27526,7 +27526,18 @@ arm_file_start (void)
 {
   const char *fpu_name;
   if (arm_selected_arch)
-	asm_fprintf (asm_out_file, "\t.arch %s\n", arm_selected_arch->name);
+	/* Keep backward compatibility for assemblers
+	   which don't support armv7ve.  */
+	  if (strncmp (arm_selected_arch->name, "armv7ve", 7) == 0)
+	{
+	  asm_fprintf (asm_out_file, "\t.arch armv7-a\n");
+	  asm_fprintf (asm_out_file, "\t.arch_extension virt\n");
+	  asm_fprintf (asm_out_file, "\t.arch_extension idiv\n");
+	  asm_fprintf (asm_out_file, "\t.arch_extension sec\n");
+	  asm_fprintf (asm_out_file, "\t.arch_extension mp\n");
+	}
+	  else
+	asm_fprintf (asm_out_file, "\t.arch %s\n", arm_selected_arch->name);
   else if (strncmp (arm_selected_cpu->name, "generic", 7) == 0)
 	asm_fprintf (asm_out_file, "\t.arch %s\n", arm_selected_cpu->name + 8);
   else
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index 5cfaeb8..aa449b1 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -66,6 +66,7 @@
|mcpu=cortex-a57	\
|mcpu=cortex-a57.cortex-a53\
|mcpu=generic-armv7-a\
+   |march=armv7ve	\
|march=armv7-m|mcpu=cortex-m3\
|march=armv7e-m|mcpu=cortex-m4   \
|march=armv6-m|mcpu=cortex-m0\
@@ -83,6 +84,7 @@

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jeff Law

On 12/20/13 10:25, Jakub Jelinek wrote:

Yes.  So my suggestion actually was not correct for that:
!reg_overlap_mentioned_p (dest, XEXP (src, 0))
because the first extension above has r1:SI and r2:DI which don't
overlap, only r1:DI and r2:DI overlap.  So it probably should be checked
in combine_reaching_defs instead where you have already both the registers
in the right modes available and can call reg_overlap_mentioned_p on them
directly.  One argument would be SET_DEST (def_insn) and one SET_DEST
(cand->insn), right?

Here's the updated version.

1. Minor test tweak per Uros's suggestion.
2. Fix formatting
3. Add testing for two destinations overlapping per above.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Ok for 
the trunk?


jeff

* ree.c (combine_set_extension): Handle case where source
and destination registers in an extension insn are different.
(combine_reaching_defs): Allow source and destination
registers in extension to be different under limited
circumstances.
(add_removable_extension): Remove restriction that the
source and destination registers in the extension are the
same.
(find_and_remove_re): Emit a copy from the extension's
destination to its source after the defining insn if
the source and destination registers are different.


testsuite/

* gcc.target/i386/pr53623.c: New test.


diff --git a/gcc/ree.c b/gcc/ree.c
index 9938e98..873b6d4 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -282,8 +282,20 @@ static bool
 combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
-  rtx new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
   rtx new_set;
+  rtx cand_pat = PATTERN (cand->insn);
+
+  /* If the extension's source/destination registers are not the same
+ then we need to change the original load to reference the destination
+ of the extension.  Then we need to emit a copy from that destination
+ to the original destination of the load.  */
+  rtx new_reg;
+  bool copy_needed
+= (REGNO (SET_DEST (cand_pat)) != REGNO (XEXP (SET_SRC (cand_pat), 0)));
+  if (copy_needed)
+new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (cand_pat)));
+  else
+new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
   /* Merge constants by directly moving the constant into the register under
  some conditions.  Recall that RTL constants are sign-extended.  */
@@ -342,7 +354,8 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx 
*orig_set)
   if (dump_file)
 {
   fprintf (dump_file,
-  "Tentatively merged extension with definition:\n");
+  "Tentatively merged extension with definition %s:\n",
+  (copy_needed) ? "(copy needed)" : "");
   print_rtl_single (dump_file, curr_insn);
 }
   return true;
@@ -662,6 +675,53 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, 
ext_state *state)
   if (!outcome)
 return false;
 
+  /* If the destination operand of the extension is a different
+ register than the source operand, then additional restrictions
+ are needed.  */
+  if ((REGNO (SET_DEST (PATTERN (cand->insn)))
+   != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0))))
+{
+  /* In theory we could handle more than one reaching def, it
+just makes the code to update the insn stream more complex.  */
+  if (state->defs_list.length () != 1)
+   return false;
+
+  /* We require the candidate not already be modified.  This may
+be overly conservative.  */
+  if (state->modified[INSN_UID (cand->insn)].kind != EXT_MODIFIED_NONE)
+   return false;
+
+  /* There's only one reaching def.  */
+  rtx def_insn = state->defs_list[0];
+
+  /* The defining statement must not have been modified either.  */
+  if (state->modified[INSN_UID (def_insn)].kind != EXT_MODIFIED_NONE)
+   return false;
+
+  /* The defining statement and candidate insn must be in the same block.
+This is merely to keep the test for safety and updating the insn
+stream simple.  */
+  if (BLOCK_FOR_INSN (cand->insn) != BLOCK_FOR_INSN (def_insn))
+   return false;
+
+  /* If there is an overlap between the destination of DEF_INSN and
+CAND->insn, then this transformation is not safe.  Note we have
+to test in the widened mode.  */
+  rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))),
+REGNO (SET_DEST (PATTERN (def_insn))));
+  if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn))))
+   return false;
+
+  /* The destination register of the extension insn must not be
+used or set between the def_insn and cand->insn exclusive.  */
+  if (reg_used_between_p (SET_DEST (PATTERN (cand->insn)),
+ def_insn, cand->insn)

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2013-12-20 Thread Jakub Jelinek
On Fri, Dec 20, 2013 at 01:44:06PM -0700, Jeff Law wrote:
 @@ -342,7 +354,8 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx 
 *orig_set)
if (dump_file)
  {
fprintf (dump_file,
 -"Tentatively merged extension with definition:\n");
 +"Tentatively merged extension with definition %s:\n",
 +(copy_needed) ? "(copy needed)" : "");

Missed this, you don't need the parens around the first copy_needed.

Ok with that change, thanks.

Jakub


[linaro/gcc-4_8-branch] Backports from trunk and merge from gcc-4_8-branch

2013-12-20 Thread Christophe Lyon
We have committed several backports from trunk to linaro/gcc-4_8-branch:

r203799 as r205740 (fix testcases for ARM hardfloat targets)
r203327 as r205742 (Enhance phiopt to handle BIT_AND_EXPR)
r204737 as r205743 (Make AArch64 frame grow downwards)
r202872 as r205744 ([ARM][testsuite] Add effective target check for
arm conditional execution)
r197997 as r205746 (enable libjava on AArch64)
r203774 as r205747 (enable libatomic on AArch64)

We have also merged the gcc-4_8-branch into linaro/gcc-4_8-branch up to
revision 205577 as r205893.

Christophe.


[patch] powerpc64 FreeBSD support for boehm-gc

2013-12-20 Thread Andreas Tobler
Hi,

the below patch adds support for powerpc64 FreeBSD for the boehm-gc.
The diff is already available in boehm-gc trunk.
Ok for gcc trunk?

Thanks,
Andreas

2013-12-20  Andreas Tobler  andre...@gcc.gnu.ch

* include/private/gcconfig.h: Add FreeBSD powerpc64 defines.


Index: include/private/gcconfig.h
===
--- include/private/gcconfig.h  (revision 206155)
+++ include/private/gcconfig.h  (working copy)
@@ -849,7 +849,15 @@
 # define NO_PTHREAD_TRYLOCK
 #   endif
 #   ifdef FREEBSD
+#   if defined(__powerpc64__)
+#   define ALIGNMENT 8
+#   define CPP_WORDSZ 64
+#   ifndef HBLKSIZE
+#   define HBLKSIZE 4096
+#   endif
+#   else
 #   define ALIGNMENT 4
+#   endif
 #   define OS_TYPE FREEBSD
 #   ifndef GC_FREEBSD_THREADS
 #   define MPROTECT_VDB


Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-12-20 Thread Rong Xu
Here are the results using our internal benchmarks, which are a mix of
multi-threaded and single-threaded programs.
These were collected about a month ago, but I did not get time to send
them due to an unexpected trip.

cmpxchg gives the worst performance due to the memory barriers it incurs.
I'll send a patch that supports conditional_1 and unconditional_1.

- result -

base: original_coverage
(1): using_conditional_1 -- using branch (my original implementation)
(2): using_unconditional_1 -- write straight 1
(3): using_cmpxchg -- using cmpxchg write 1
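
For readers who have not seen the earlier patch, the three update strategies can be sketched roughly as follows. This only shows the shape of each update, not the instrumentation GCC emits: the function names are made up, the counter is assumed to be a `long`, and the plain (non-atomic) variants are formally racy if several threads run them at once.

```c
#include <stdatomic.h>

/* (1) conditional_1: test first, store only if the counter is still 0.
   Once the counter is set, later executions only read it.  */
static void
cover_conditional (long *counter)
{
  if (*counter == 0)
    *counter = 1;
}

/* (2) unconditional_1: always store 1, with no load and no branch.  */
static void
cover_unconditional (long *counter)
{
  *counter = 1;
}

/* (3) cmpxchg: atomically replace 0 with 1.  This is the only variant
   that is a genuine atomic read-modify-write.  */
static void
cover_cmpxchg (atomic_long *counter)
{
  long expected = 0;
  atomic_compare_exchange_strong (counter, &expected, 1L);
}
```

Variants (1) and (3) leave a counter that already holds a non-zero value untouched; variant (2) overwrites it with 1.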

Values are performance ratios where 100.0 equals the performance of
O2. Larger numbers are faster.
-- means the test failed due to running too slowly.

arch: westmere
  Benchmark       Base        (1)        (2)        (3)
-------------------------------------------------------
benchmark_1       26.4   +176.62%    +17.20%         --
benchmark_2         --     [78.4]     [12.3]         --
benchmark_3       86.3     +6.15%    +10.52%    -61.28%
benchmark_4       88.4     +6.59%    +14.26%    -68.76%
benchmark_5       89.6     +6.26%    +13.00%    -68.74%
benchmark_6       76.7    +22.28%    +29.15%    -75.31%
benchmark_7       89.0     -0.62%     +3.36%    -71.37%
benchmark_8       84.5     -1.45%     +5.27%    -74.04%
benchmark_9       81.3    +10.64%    +13.32%    -72.82%
benchmark_10      59.1    +44.71%    +14.77%    -73.24%
benchmark_11      90.3     -1.74%     +4.22%    -61.95%
benchmark_12      98.9     +0.07%     +0.48%     -6.37%
benchmark_13      74.0     -4.69%     +4.35%    -77.02%
benchmark_14      21.4   +309.92%    +63.41%    -35.82%
benchmark_15      21.4   +282.33%    +58.15%    -57.98%
benchmark_16      85.1     -7.71%     +1.65%    -60.72%
benchmark_17      81.7     +2.47%     +8.20%    -72.08%
benchmark_18      83.7     +1.59%     +3.83%    -69.33%
geometric mean            +30.30%    +14.41%    -65.66% (incomplete)

arch: sandybridge
  Benchmark   Base(1)   (2)  (3)
-
benchmark_1 --[70.1]   [26.1]   --
benchmark_2 --[79.1]   --   --
benchmark_3   84.3   +10.82%  +15.84%  -68.98%
benchmark_4   88.5   +10.28%  +11.35%  -75.10%
benchmark_5   89.4   +10.46%  +11.40%  -74.41%
benchmark_6   65.5   +38.52%  +44.46%  -77.97%
benchmark_7   87.7-0.16%   +1.74%  -76.19%
benchmark_8   89.6-4.52%   +6.29%  -78.10%
benchmark_9   79.9   +13.43%  +19.44%  -75.99%
benchmark_10  52.6   +61.53%   +8.23%  -78.41%
benchmark_11  89.9-1.40%   +3.37%  -68.16%
benchmark_12  99.0+1.51%   +0.63%  -10.37%
benchmark_13  74.3-6.75%   +3.89%  -81.84%
benchmark_14  21.8  +295.76%  +19.48%  -51.58%
benchmark_15  23.5  +257.20%  +29.33%  -83.53%
benchmark_16  84.4   -10.04%   +2.39%  -68.25%
benchmark_17  81.6+0.60%   +8.82%  -78.02%
benchmark_18  87.4-1.14%   +9.69%  -75.88%
geometric mean   +25.64%  +11.76%  -72.96% (incomplete)

arch: clovertown
  Benchmark Base   (1)   (2)(3)
--
benchmark_1 -- [83.4]-- --
benchmark_2 -- [82.3]-- --
benchmark_3   86.2 +7.58%   +13.10%-81.74%
benchmark_4   89.4 +5.69%   +11.70%-82.97%
benchmark_5   92.8 +4.67%+7.48%-80.02%
benchmark_6   78.1+13.28%   +22.21%-86.92%
benchmark_7   96.8 +0.25%+5.44%-84.94%
benchmark_8   89.1 +0.66%+3.60%-85.89%
benchmark_9   86.4 +8.42%+9.95%-82.30%
benchmark_10  59.7+44.95%   +21.79% --
benchmark_11  91.2 -0.29%+4.35%-76.05%
benchmark_12  99.0 +0.31%-0.05%-25.19%
benchmark_14   8.2  +1011.27%  +104.15% +5.56%
benchmark_15  11.7   +669.25%  +108.54%-29.83%
benchmark_16  85.7 -7.51%+4.43% --
benchmark_17  87.7 +2.84%+7.45% --
benchmark_18  87.9 +1.59%+3.82%-81.11%
geometric mean +37.89%   +17.54%  -74.47% (incomplete)

arch: istanbul

 Benchmark   Base(1)   (2)   (3)
--
benchmark_1 --[73.2]   -- --
benchmark_2 --[82.9]   -- --
benchmark_3   86.1+4.56%  +11.68%-61.04%
benchmark_4   92.0+3.47%   +4.63%-64.84%
benchmark_5   91.9+4.18%   +4.90%-64.77%
benchmark_6   73.6   +23.36%  +27.13%-72.64%
benchmark_7   93.6-3.57%   +4.76%-68.54%
benchmark_8   88.9-3.01%   +2.87%

Re: [PATCH][x86] march aliases

2013-12-20 Thread H.J. Lu
On Fri, Dec 20, 2013 at 8:55 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Fri, Dec 20, 2013 at 8:47 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Perhaps we should add sandybridge, ivybridge and haswell aliases for
  corei7-avx, core-avx-i, core-avx2?  I mean, it is a nightmare to remember
  which one has the i7 in and which doesn't even for me.

 Yes please, I think this is a good idea.

 I've added aliases for haswell, sandybridge, ivybridge, bonnell,
 nehalem and silvermont.
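
 The aliasing idea can be pictured as several names resolving to one processor entry. The sketch below is a toy lookup table, not the i386.c table: the `toy_` names are invented, the PTA_* flag sets are left out entirely, and the pairing of `core-avx-i` with the same processor as `ivybridge` is taken from the discussion above rather than from the patch hunks.

```c
#include <stddef.h>
#include <string.h>

/* Toy processor identifiers; hypothetical stand-ins for PROCESSOR_*.  */
enum toy_processor
{
  TOY_COREI7, TOY_COREI7_AVX, TOY_HASWELL, TOY_ATOM, TOY_SLM, TOY_UNKNOWN
};

struct toy_alias
{
  const char *name;
  enum toy_processor processor;
};

/* Each microarchitecture name sits next to the older -march name it
   aliases.  Feature flags are deliberately omitted from this model.  */
static const struct toy_alias toy_aliases[] = {
  { "nehalem", TOY_COREI7 },         { "corei7", TOY_COREI7 },
  { "sandybridge", TOY_COREI7_AVX }, { "corei7-avx", TOY_COREI7_AVX },
  { "ivybridge", TOY_COREI7_AVX },   { "core-avx-i", TOY_COREI7_AVX },
  { "haswell", TOY_HASWELL },        { "core-avx2", TOY_HASWELL },
  { "bonnell", TOY_ATOM },           { "atom", TOY_ATOM },
  { "silvermont", TOY_SLM },         { "slm", TOY_SLM },
};

/* Return the processor for NAME, or TOY_UNKNOWN if it is not listed.  */
static enum toy_processor
toy_lookup (const char *name)
{
  for (size_t i = 0; i < sizeof toy_aliases / sizeof toy_aliases[0]; i++)
    if (strcmp (toy_aliases[i].name, name) == 0)
      return toy_aliases[i].processor;
  return TOY_UNKNOWN;
}
```

 With such a table, `toy_lookup ("haswell")` and `toy_lookup ("core-avx2")` give the same result, which is all an alias needs to do.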

BTW, I wonder if we add a bunch of new names to the table it isn't a right
time to also introduce macros for some common PTA_* flag combinations,

 IMO full list of PTA_* helps quickly identify what is supported.

 2013-12-20  Tocar Ilya  ilya.to...@intel.com

 * config/i386/i386.c (ix86_option_override_internal): Add
 haswell, ivybridge, sandybridge, nehalem, bonnell, silvermont.
 * doc/invoke.texi: Document them.
 ---
  gcc/config/i386/i386.c | 27 +++
  gcc/doc/invoke.texi| 32 
  2 files changed, 59 insertions(+)

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 1710e8c..fcf2afe 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -3111,9 +3111,17 @@ ix86_option_override_internal (bool main_args_p,
{"core2", PROCESSOR_CORE2, CPU_CORE2,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_CX16 | PTA_FXSR},
 +  {"nehalem", PROCESSOR_COREI7, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 +   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
{"corei7", PROCESSOR_COREI7, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR},
 +  {"sandybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL
 +   | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
{"corei7-avx", PROCESSOR_COREI7_AVX, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 @@ -3124,6 +3132,11 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 | PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
 +  {"ivybridge", PROCESSOR_COREI7_AVX, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 +   | PTA_RDRND | PTA_F16C | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
{"core-avx2", PROCESSOR_HASWELL, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 @@ -3131,6 +3144,13 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 | PTA_XSAVEOPT},
 +  {"haswell", PROCESSOR_HASWELL, CPU_COREI7,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 +   | PTA_CX16 | PTA_POPCNT | PTA_AES | PTA_PCLMUL | PTA_FSGSBASE
 +   | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 +   | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 +   | PTA_XSAVEOPT},
{"broadwell", PROCESSOR_HASWELL, CPU_COREI7,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX | PTA_AVX2
 @@ -3138,9 +3158,16 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_RDRND | PTA_F16C | PTA_BMI | PTA_BMI2 | PTA_LZCNT
 | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE | PTA_FXSR | PTA_XSAVE
 | PTA_XSAVEOPT | PTA_ADX | PTA_PRFCHW | PTA_RDSEED},
 +  {"bonnell", PROCESSOR_ATOM, CPU_ATOM,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 +   | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
{"atom", PROCESSOR_ATOM, CPU_ATOM,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR},
 +  {"silvermont", PROCESSOR_SLM, CPU_SLM,
 +   PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 +   | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
 +   | PTA_PCLMUL | PTA_RDRND | PTA_MOVBE | PTA_FXSR},
{"slm", PROCESSOR_SLM, CPU_SLM,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3
 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_AES
 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index 

Re: [PING] RE: [PATCH] Vectorization for store with negative step

2013-12-20 Thread H.J. Lu
On Fri, Dec 20, 2013 at 2:09 AM, Bingfeng Mei b...@broadcom.com wrote:
 OK to commit?

 Thanks,
 Bingfeng
 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
 Behalf Of Bingfeng Mei
 Sent: 18 December 2013 16:25
 To: Jakub Jelinek
 Cc: Richard Biener; gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH] Vectorization for store with negative step

 Hi, Jakub,
 Sorry for all the formatting issues. Haven't submitted a patch for a while :-).
 Please find the updated patch.


It caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

-- 
H.J.


Re: [C++ PATCH] Don't ICE on TYPE_BINFO (PR c++/59111)

2013-12-20 Thread Jason Merrill

On 12/20/2013 09:29 AM, Marek Polacek wrote:

We ICEd on invalid testcases with auto, because lookup_conversions
got template_type_parm as a parameter and the TYPE_BINFO didn't like
it.  Fixed by checking for RECORD_OR_UNION_TYPE_P first.


Use CLASS_TYPE_P instead.  OK with that change.

Jason




Re: [PATCH] Convert more passes to new dump framework

2013-12-20 Thread Sharad Singhai
Committed documentation as r206161. Sorry about the delay.

Thanks,
Sharad

On Thu, Nov 28, 2013 at 10:03 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 10:18:05AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
  On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
  On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com 
  wrote:
   On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
   Hi,
  
   On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
   This patch ports messages to the new dump framework,
  
   It would be great if this new framework was documented somewhere.  I lost
   track of what was agreed it would be and from the uses in the
   vectorizer I was never quite sure how to utilize it in other passes.
  
   Sharad, can you put the documentation in GCC wiki.
 
  Sure. I had user documentation in form of gcc info. But I will add
  more developer details to a GCC wiki.
 
 
  I have built trunk gccint.info yesterday but could not find any string
  "dump_enabled_p" there, for example.  And when I quickly searched just
  for the string "dump", I did not find anything that looked like
  dumping infrastructure either.  OTOH, I agree that file would be the
  best place for the documentation.
 
  Or did I just miss it?  What section is it in then?

 Actually, the user-facing documentation is in doc/invoke.texi.
 However, that doesn't describe dump_enabled_p. Do you think
 gccint.info would be a good place? I can add documentation there
 instead of creating a GCC wiki.


 please do not forget about this, otherwise few people will use your
 framework.

 Thanks,

 Martin