Re: [PATCH 01/13] recog: Increased max number of alternatives - v2

2015-05-18 Thread Richard Biener
On Mon, May 18, 2015 at 3:41 PM, Andreas Krebbel
kreb...@linux.vnet.ibm.com wrote:
 The new version also changes the type for the alternative_mask to unsigned 
 HOST_WIDE_INT.

 Bootstrapped without regressions on x86-64.

 Ok to apply?

Please use uint64_t instead.

Richard.

 Bye,

 -Andreas-

 gcc/
 * recog.h: Increase MAX_RECOG_ALTERNATIVES.
 ---
  gcc/recog.h |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/gcc/recog.h b/gcc/recog.h
 index 463c748..07700bd 100644
 --- a/gcc/recog.h
 +++ b/gcc/recog.h
 @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
  /* Random number that should be large enough for all purposes.  Also define
 a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
 bit giving an invalid value that can be used to mean uninitialized.  */
 -#define MAX_RECOG_ALTERNATIVES 30
 -typedef unsigned int alternative_mask;
 +#define MAX_RECOG_ALTERNATIVES 35
 +typedef unsigned HOST_WIDE_INT alternative_mask;

  /* A mask of all alternatives.  */
  #define ALL_ALTERNATIVES ((alternative_mask) -1)



[PATCH][AArch64][PR 66136] rewrite geniterators.sh in awk

2015-05-18 Thread Szabolcs Nagy
Rewrote the generator script in awk, to avoid dealing with
sed portability issues.

gcc/Changelog:

2015-05-18  Szabolcs Nagy  szabolcs.n...@arm.com

PR target/66136
* config/aarch64/geniterators.sh: Rewrite in awk.
diff --git a/gcc/config/aarch64/geniterators.sh b/gcc/config/aarch64/geniterators.sh
index f908e89..5a51d29 100644
--- a/gcc/config/aarch64/geniterators.sh
+++ b/gcc/config/aarch64/geniterators.sh
@@ -22,25 +22,52 @@
 # Generate aarch64-builtin-iterators.h, a file containing a series of
 # BUILTIN_ITERATOR macros, which expand to VARN Macros covering the
 # same set of modes as the iterator in iterators.md
-
-echo /* -*- buffer-read-only: t -*- */
-echo /* Generated automatically by geniterators.sh from iterators.md.  */
-echo #ifndef GCC_AARCH64_ITERATORS_H
-echo #define GCC_AARCH64_ITERATORS_H
-
-# Strip newlines, create records marked ITERATOR, and strip junk (anything
-# which does not have a matching brace because it contains characters we
+#
+# Find the ITERATOR definitions (may span several lines), skip the ones
+# which does not have a simple format because it contains characters we
 # don't want to or can't handle (e.g P, PTR iterators change depending on
 # Pmode and ptr_mode).
-export LC_ALL=C
-cat $1 | tr \n   \
-   | sed 's/(define_mode_iterator \([A-Za-z0-9_]*\) \([]\[A-Z0-9 \t]*\)/\n#define BUILTIN_\1(T, N, MAP) \\ \2\n/g' \
-   | grep '#define [A-Z0-9_(), \\]* \[[A-Z0-9[:space:]]*]' \
-   | sed 's/\t//g' \
-   | sed 's/  */ /g' \
-   | sed 's/ \[\([A-Z0-9 ]*\)]/\n\1/' \
-   | awk ' BEGIN { FS =   ; OFS = , } \
-	   /#/ { print } \
-   ! /#/ { $1 = $1 ; printf   VAR%d (T, N, MAP, %s)\n, NF, tolower($0) }'
-
-echo #endif /* GCC_AARCH64_ITERATORS_H  */
+LC_ALL=C awk '
+BEGIN {
+	print /* -*- buffer-read-only: t -*- */
+	print /* Generated automatically by geniterators.sh from iterators.md.  */
+	print #ifndef GCC_AARCH64_ITERATORS_H
+	print #define GCC_AARCH64_ITERATORS_H
+}
+
+{
+	sub(/;.*/, )
+}
+
+iterdef {
+	s = s   $0
+}
+
+!iterdef  /\(define_mode_iterator/ {
+	iterdef = 1
+	s = $0
+	sub(/.*\(define_mode_iterator/, , s)
+}
+
+iterdef  s ~ /\)/ {
+	iterdef = 0
+
+	gsub(/[ \t]+/,  , s)
+	sub(/ *\).*/, , s)
+	sub(/^ /, , s)
+	if (s !~ /^[A-Za-z0-9_]+ \[[A-Z0-9 ]*\]$/)
+		next
+	sub(/\[ */, , s)
+	sub(/ *\]/, , s)
+
+	n = split(s, a)
+	printf #define BUILTIN_ a[1] (T, N, MAP) \\\n
+	printf   VAR (n-1)  (T, N, MAP
+	for (i = 2; i = n; i++)
+		printf ,   tolower(a[i])
+	printf )\n
+}
+
+END {
+	print #endif /* GCC_AARCH64_ITERATORS_H  */
+}' $1


[patch,gomp4] remove gwv test case

2015-05-18 Thread Cesar Philippidis
This patch removes the libgomp.oacc-c/gwv.c. test case. The original
idea behind this test was to allow us to create our own parallel loops
using by using the gang, worker and vector threads directly by using the
built-in GOACC_ thread functions. This test was supposed to ensure that
those functions work.

Now that we're working on a proper execution model, exposing those
functions to the end user doesn't make much sense. That's at least true
for workers and vectors because they start off in worker-single and
vector-single mode, respectively. Then there's the other issue of
variable broadcasting. As a result, this test case isn't necessary going
forward. Those thread functions are already being exercised by omp-low
for each acc loop anyway.

Cesar
2015-05-18  Cesar Philippidis  ce...@codesourcery.com

	libgomp/
	* testsuite/libgomp.oacc-c/gwv.c: Delete.


diff --git a/libgomp/testsuite/libgomp.oacc-c/gwv.c b/libgomp/testsuite/libgomp.oacc-c/gwv.c
deleted file mode 100644
index f8f1169..000
--- a/libgomp/testsuite/libgomp.oacc-c/gwv.c
+++ /dev/null
@@ -1,34 +0,0 @@
-#include stdlib.h
-#include openacc.h
-
-extern int __builtin_GOACC_ntid (int);
-extern int __builtin_GOACC_tid (int);
-extern int __builtin_GOACC_nctaid (int);
-extern int __builtin_GOACC_ctaid (int);
-
-int main ()
-{
-  int gangs = -1, vectors = -1;
-  int GANGS = 100, VECTORS = 64;
-
-#pragma acc parallel num_gangs (GANGS) vector_length (VECTORS) copyout (gangs, vectors)
-  {
-gangs = __builtin_GOACC_nctaid (0);
-vectors = __builtin_GOACC_ntid (0);
-  }
-
-  if (acc_get_device_type () == acc_device_nvidia)
-{
-  if (gangs != GANGS)
-	abort ();
-   if (vectors != VECTORS)
- 	abort ();
-}
-  else
-{
-  if (gangs != 0)
-	abort ();
-   if (vectors != 0)
-	abort ();
-}
-}


RE: [PATCH, MIPS]: Fix internal compiler error: in check_bool_attrs, at recog.c:2218 for micromips attribute

2015-05-18 Thread Maciej W. Rozycki
On Mon, 18 May 2015, Robert Suchanek wrote:

  Just to check, the reason we don't see this in the current testsuite is that
  there is no test with both MIPS and microMIPS functions?
 
 Correct.  The testsuite either tests microMIPS or MIPS but not both in 
 the same TU.

 We could add -mflip-micromips complementing -mflip-mips16 and use that 
for testing too.  Chances are it'd reveal further issues.  Looking at how 
-mflip-mips16 has been implemented it does not appear to me adding 
-mflip-micromips would be a lot of effort.

  Maciej


Re: [patch,gomp4] remove gwv test case

2015-05-18 Thread Jakub Jelinek
On Mon, May 18, 2015 at 07:57:22AM -0700, Cesar Philippidis wrote:
 This patch removes the libgomp.oacc-c/gwv.c. test case. The original
 idea behind this test was to allow us to create our own parallel loops
 using by using the gang, worker and vector threads directly by using the
 built-in GOACC_ thread functions. This test was supposed to ensure that
 those functions work.
 
 Now that we're working on a proper execution model, exposing those
 functions to the end user doesn't make much sense. That's at least true
 for workers and vectors because they start off in worker-single and
 vector-single mode, respectively. Then there's the other issue of
 variable broadcasting. As a result, this test case isn't necessary going
 forward. Those thread functions are already being exercised by omp-low
 for each acc loop anyway.
 
 Cesar

 2015-05-18  Cesar Philippidis  ce...@codesourcery.com
 
   libgomp/
   * testsuite/libgomp.oacc-c/gwv.c: Delete.

Ok with me.

Jakub


Re: Check canonical types in verify_type

2015-05-18 Thread Jan Hubicka
  +  if (!comp_type_attributes (t1, t2))
  +   return false;
  
  Because I think the TBAA does not care about attribute lists.  I suppose 
  this
  is kind-of harmless because of:
 
 I think that was about attributes like 'packed', but those should have
 their effects applied at stor-layout time.  So yes, I think this can be
 dropped.

It does test the attribute list for functions only.  I think packed should
be detected from memory layout, so I will test patch dropping this.
 
  +  /* For canonical type comparisons we do not want to build SCCs
  +so we cannot compare pointed-to types.  But we can, for now,
  +require the same pointed-to type kind and match what
  +useless_type_conversion_p would do.  */
  +  if (POINTER_TYPE_P (t1))
  +   {
  + if (TYPE_ADDR_SPACE (TREE_TYPE (t1))
  + != TYPE_ADDR_SPACE (TREE_TYPE (t2)))
  +   return false;
  +
  + if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2)))
  +   return false;
  +   }
  
  Is the reason for not making differences between differnt pointer types
  really just lazyness to not deal with SCCs?  For that it is easy to add
  set of visited type pairs, just like odr_types_equivalent_p does.
 
 Well - TBAA doesn't care about pointed-to types.  See get_alias_set
 
   else if (POINTER_TYPE_P (t)
 t != ptr_type_node)
 set = get_alias_set (ptr_type_node);

Hmm, I see, interesting hack.  For the first part of comment, I see that
qualifiers needs to be ignored, but I do not see why we put
short * and int * pointers to same class.

The complete/incomplete type fun with LTO can be solved for C++ but indeed I
see why pointer to incomplete type needs to be considered compatible with every
other pointer to structure.  Can't this be dealt with by adding correct edges
into the aliasing dag?
 
 so the above would at most matter for pointer-type fields of aggregates.
 And no, we don't want to deal with SCCs - I doubt it would bring any 
 benefit and the issue here is that I am worried about SCCs being present
 dependent on the order of streaming (well, the code pre-dates SCC
 streaming).
 
  Because we now stream in SCC order anyway, most of time we won't even
  need to populate it.  Shall I implement this?
 
 You can certainly try - but I think for cross-language interoperability
 we want to treat
 
   struct X { void *p; }
   struct Y { void *p; }
 
 and
 
   struct X { Y *p; }
   struct Y { X *q; } 
 
 as compatible so I don't think we can use SCCs to partition alias sets
 further (that is, we _do_ have to treat pointers conservatively).  The
 current code is as far as we can get IMHO (well, even the current code
 is too strict in making struct { void *p } and struct { int *p } not
 have the same alias-set.

I will give it a try.  Do you have some examples where the above matters
for cross-language TBAA?
 
  Other issue I run into is that for Ada bootstrap we have variadic type whose
  canonical types are having different temporary set as size.  I think this is
  valid and perhaps gimple_canonical_types_compatible_p should consider
  variadic arrays to be compatible with any array of compatible type?
 
 Those are all local types and thus the strict equality compare should be
 ok?  Not sure if we can do in C sth like
 
 void foo (int n)
 {
   struct { int i; int a[n]; } a;
   struct { int i; int a[n]; } b;
 }
 
 and if we'd have to assign the same alias-sets to 'a' and 'b'.

No idea here either. I wonder if the types are intedned to be TBAA compatible
across two calls of the function.  In that case we may introduce multiple copies
of body by early inlining already that may be a problem.
 
  I am not quite convinced we get variadic types right at LTO time, because
  they bypass canonical type calculation anyway and their canonical type
  is set by TYPE_MAIN_VARIANT in lto_read_body_or_constructor which I think
  is not safe.  I will look for a testcase.
 
 That is because if they are streamed locally they do not enter type
 merging, but they still go via gimple_register_canonical_type, so I'm not
 sure where you see they always get their main variant as canonical type.

I tought these are handled by:
  /* And fixup types we streamed locally.  */
{
  struct streamer_tree_cache_d *cache = data_in-reader_cache;
  unsigned len = cache-nodes.length ();
  unsigned i;
  for (i = len; i--  from;)
{
  tree t = streamer_tree_cache_get_tree (cache, i);
  if (t == NULL_TREE)
continue;

  if (TYPE_P (t))
{
  gcc_assert (TYPE_CANONICAL (t) == NULL_TREE);
  TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t);
  if (TYPE_MAIN_VARIANT (t) != t)
{
  gcc_assert (TYPE_NEXT_VARIANT (t) == NULL_TREE);
  TYPE_NEXT_VARIANT (t)
= TYPE_NEXT_VARIANT (TYPE_MAIN_VARIANT (t));
 

Re: [RFA] More type narrowing in match.pd V2

2015-05-18 Thread Marc Glisse

On Mon, 18 May 2015, Richard Biener wrote:


On Thu, May 14, 2015 at 4:28 PM, Jeff Law l...@redhat.com wrote:

On 05/14/2015 08:04 AM, Marc Glisse wrote:


On Fri, 1 May 2015, Jeff Law wrote:


Slight refactoring of the condition by using types_match as suggested
by Richi.  I also applied the new types_match to 2 other patterns in
match.pd where it seemed clearly appropriate.



I would like to propose this small tweak (regtested ok). If we had a
different type for trees and types, this would be overloading the
function. We already do this in a few places, and I find the resulting
shorter code more readable.

2015-05-14  Marc Glisse  marc.gli...@inria.fr

 * generic-match-head.c (types_match): Handle non-types.
 * gimple-match-head.c (types_match): Likewise.
 * match.pd: Remove unnecessary TREE_TYPE for types_match.


Update the comment for types_match and this is fine.


No - it breaks genmatch autodetection of what trees escape.  It special-cases
TREE_TYPE ().  See capture_info::walk_c_expr:


Oh, I didn't know that.


 /* Give up for C exprs mentioning captures not inside TREE_TYPE ().  */
 unsigned p_depth = 0;

so the patch will pessimize GENERIC folding.

Please do not apply.


I think it was already applied. Feel free to revert it, or tell me and 
I'll do it next week-end (from your next message I don't really know if it 
should be reverted or not).


--
Marc Glisse


Re: [RFA] More type narrowing in match.pd V2

2015-05-18 Thread Richard Biener
On Mon, May 18, 2015 at 2:34 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Mon, 18 May 2015, Richard Biener wrote:

 On Thu, May 14, 2015 at 4:28 PM, Jeff Law l...@redhat.com wrote:

 On 05/14/2015 08:04 AM, Marc Glisse wrote:


 On Fri, 1 May 2015, Jeff Law wrote:

 Slight refactoring of the condition by using types_match as suggested
 by Richi.  I also applied the new types_match to 2 other patterns in
 match.pd where it seemed clearly appropriate.



 I would like to propose this small tweak (regtested ok). If we had a
 different type for trees and types, this would be overloading the
 function. We already do this in a few places, and I find the resulting
 shorter code more readable.

 2015-05-14  Marc Glisse  marc.gli...@inria.fr

  * generic-match-head.c (types_match): Handle non-types.
  * gimple-match-head.c (types_match): Likewise.
  * match.pd: Remove unnecessary TREE_TYPE for types_match.


 Update the comment for types_match and this is fine.


 No - it breaks genmatch autodetection of what trees escape.  It
 special-cases
 TREE_TYPE ().  See capture_info::walk_c_expr:


 Oh, I didn't know that.

  /* Give up for C exprs mentioning captures not inside TREE_TYPE ().  */
  unsigned p_depth = 0;

 so the patch will pessimize GENERIC folding.

 Please do not apply.


 I think it was already applied. Feel free to revert it, or tell me and I'll
 do it next week-end (from your next message I don't really know if it should
 be reverted or not).

It's always good to compare generated code (gimple/generic-match.c in the
build directory) for such changes.  Only the obvious differences should appear.
If they do you can leave the patch in.

Richard.

 --
 Marc Glisse


Re: [patch 10/10] debug-early merge: compiler proper

2015-05-18 Thread Jan Hubicka
 On Fri, May 8, 2015 at 2:40 AM, Aldy Hernandez al...@redhat.com wrote:
  As seen on TV.
 
 +/* FIRST_TIME is set to TRUE for the first time we are called for a
 +   translation unit from finalize_compilation_unit() or false
 +   otherwise.  */
 +
  static void
 -analyze_functions (void)
 +analyze_functions (bool first_time)
  {
 ...
 +  if (first_time)
 +{
 +  symtab_node *snode;
 +  FOR_EACH_SYMBOL (snode)
 +   check_global_declaration (snode-decl);
 +}
 +
 
 I think it is better to split analyze_functions (why does it have it's own 
 copy
 of unreachable node removal?) into analysis and unused-symbol removal
 and have the
 check_global_declaration call be in finalize_compilation_unit directly.  
 Honza?

It is trying to avoid analyzing functions that are never used. Analyzing
include gimplification and other stuff that is not for free.
remove_unreachable_nodes works only on fully built cgraph.

Main reason why analyze_functions is intended to be called multiple times was
--combine support.  After each end of C source file you called
symbol_table::finalize_compilation_unit which did unreachable code removal and
saved some memory.

I am not sure symbol table is useful for debug info this way: we insert only
function definitions into symbol table and external declarations are inserted
lazilly.  That is why we still have the (convoluted) check_global_declarations
that walks the vectors provided by frontends.
 
 @@ -1113,6 +1131,19 @@ analyze_functions (void)
 {
   if (symtab-dump_file)
 fprintf (symtab-dump_file,  %s, node-name ());
 +
 + /* See if the debugger can use anything before the DECL
 +passes away.  Perhaps it can notice a DECL that is now a
 +constant and can tag the early DIE with an appropriate
 +attribute.
 +
 +Otherwise, this is the last chance the debug_hooks have
 +at looking at optimized away DECLs, since
 +late_global_decl will subsequently be called from the
 +contents of the now pruned symbol table.  */
 + if (!decl_function_context (node-decl))
 +   (*debug_hooks-late_global_decl) (node-decl);
 +
   node-remove ();
 
 so this applies to VAR_DECLs only - shouldn't this be in the
 varpool_node::remove function then?  You can even register/deregister
 a hook for this in finalize_compilation_unit.  That would IMHO be better.

There is varpool_node::remove_initializer that is intended to throw away the
constructor.  I suppose late_global_decl can be called from there?
 @@ -1334,6 +1334,9 @@ struct GTY(()) tree_block {
tree abstract_origin;
tree fragment_origin;
tree fragment_chain;
 +
 +  /* Pointer to the DWARF lexical block.  */
 +  struct die_struct *die;
  };
 
  struct GTY(()) tree_type_common {
 
 Ick - do we need this?  dwarf2out.c has a hashtable to map blocks to
 DIEs (which you don't remove in turn).

Note that types also have tree_type_symtab that in turn contains die pointer.

Honza
 
 Jason - did you intentionally not yet approve the dwarf2out.c
 changes (like, are you expecting somebody else
 to do a review)?
 
 Thanks,
 Richard.


Re: [match-and-simplify] fix incorrect code-gen in 'for' pattern

2015-05-18 Thread Prathamesh Kulkarni
On 18 May 2015 at 14:12, Richard Biener rguent...@suse.de wrote:
 On Sat, 16 May 2015, Prathamesh Kulkarni wrote:

 Hi,
 genmatch generates incorrect code for following (artificial) pattern:

 (for op (plus)
   op2 (op)
   (simplify
 (op @x @y)
 (op2 @x @y)

 generated gimple code: http://pastebin.com/h1uau9qB
 'op' is not replaced in the generated code on line 33:
 *res_code = op;

 I think it would be a better idea to make op2 iterate over same set
 of operators (op2-substitutes = op-substitutes).
 I have attached patch for the same.
 Bootstrap + testing in progress on x86_64-unknown-linux-gnu.
 OK for trunk after bootstrap+testing completes ?

 Hmm, but then the example could as well just use 'op'.  I think we
 should instead reject this.

 Consider

   (for op (plus minus)
 (for op2 (op)
   (simplify ...

 where it is not clear what would be desired.  Simple replacement
 of 'op's value would again just mean you could have used 'op' in
 the first place.  Doing what you propose would get you

   (for op (plus minus)
 (for op2 (plus minus)
   (simplify ...

 thus a different iteration.

 I wonder if we really need is_oper_list flag in user_id ?
 We can determine if user_id is an operator list
 if user_id::substitutes is not empty ?

 After your change yes.

 That will lose the ability to distinguish between user-defined operator
 list and list-iterator in for like op/op2, but I suppose we (so far) don't
 need to distinguish between them ?

 Well, your change would simply make each list-iterator a (temporary)
 user-defined operator list as well as the current iterator element
 (dependent on context - see the nested for example).  I think that
 adds to confusion.

 So - can you instead reject this use?
Well my intention was to have support for walking operator list in reverse.
For eg:
(for bitop (bit_and bit_ior)
  rbitop (bit_ior bit_and)
   ...)
Could be replaced by sth like:
(for bitop (bit_and bit_ior)
  rbitop (~bitop))
   ...)

where ~bitop would indicate walking (bit_and bit_ior) in reverse.
Would that be a good idea ? For symmetry, I thought
(for op (list)
  op2 (op))
should be supported too.


Thanks,
Prathamesh



 Thanks,
 Richard.


[PATCH][Testsuite] Disable tests with dg-require-fork for simulated targets

2015-05-18 Thread Alan Lawrence
Simulators such as qemu report the presence of fork (it's in glibc) but 
generally do not support synchronization primitives between threads, so any 
tests using fork are unreliable. This patch disables the subset of such tests 
that identify themselves using dg-require-fork.


At present, such tests are limited to (a) gcc.dg/torture/ftrapv-1.c. and (b) 
some tests in the 27_io section of the libstdc++ testsuite, listed below. 
Further patches can add dg-require-fork to the many other tests that call fork().


Cross-tested on aarch64-none-linux-gnu using qemu, with these tests becoming 
UNSUPPORTED:


(gcc)
gcc.dg/torture/ftrapv-1.c

(libstdc++)
27_io/basic_filebuf/close/char/4879.cc
27_io/basic_filebuf/close/char/9964.cc
27_io/basic_filebuf/seekoff/char/26777.cc
27_io/basic_filebuf/showmanyc/char/9533-1.cc
27_io/basic_filebuf/underflow/char/10097.cc
27_io/objects/char/7.cc
27_io/objects/char/9661-1.cc
27_io/objects/wchar_t/7.cc
27_io/objects/wchar_t/9661-1.cc

Thoughts? Is this patch OK for trunk?

Cheers, Alan

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_fork_available): Return 0 if running
on simulator.
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f632d00..0ffd35c 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1781,6 +1781,10 @@ proc check_function_available { function } {
 # Returns true iff fork is available on the target system.
 
 proc check_fork_available {} {
+if { [check_effective_target_simulator] } {
+   # Most simulators do not properly support multithreading
+   return 0
+}
 return [check_function_available fork]
 }
 


Re: [PATCH][calls.c] Remove #ifdef checks on STACK_GROWS_DOWNWARD

2015-05-18 Thread Jeff Law

On 05/18/2015 03:22 AM, Kyrill Tkachov wrote:

Hi all,

As per https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01166.html
I saw that STACK_GROWS_DOWNWARD is being checked for ifdef in a number
of places
unrelated to the patch in the PR, so decided to remove the ifdef checks
for STACK_GROWS_DOWNWARD
in a separate patch. It's fairly mechanical.

Bootstrapped on x86_64 and aarch64 and tested on arm.
I believe these patches are considered obvious, so I'll commit this in a
day, unless someone objects,
before proceeding with the changes in
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01166.html

2015-05-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * calls.c: Always define STACK_GROWS_DOWNWARD as 0 or 1.
 (mem_overlaps_already_clobbered_arg_p): Rewrite ifdef
 STACK_GROWS_DOWNWARD as normal if.
 (expand_call): Likewise.

Thanks for taking care of this...

jeff


RE: [PATCH, MIPS]: Fix internal compiler error: in check_bool_attrs, at recog.c:2218 for micromips attribute

2015-05-18 Thread Robert Suchanek
Hi Matthew,

  This patch fixes an internal compiler error when micromips/nomicromips
  attributes are used.
 
  The problem here was that the cached boolean attributes for the current
  target did not agree with the uncached attributes throwing an assertion
  error.
 
  It appears that saving and restoring the state for micromips was
  missing, just like there is for mips16. OK to apply?
 
 Just to check, the reason we don't see this in the current testsuite is that
 there is no test with both MIPS and microMIPS functions?

Correct.  The testsuite either tests microMIPS or MIPS but not both in the same 
TU.

  gcc/
  * config/mips/mips.c (micromips_globals): New variable.
  (mips_set_compression_mode): Save and reinitialize target-dependent
  state for microMIPS.
 
  gcc/testsuite
  * gcc.target/mips/umips-attr.c: New test.
 
 OK.

Committed as r223294.

Regards,
Robert


Re: [PATCH] [AArch32] Additional bics patterns.

2015-05-18 Thread Alex Velenko

Committed r223295.

Alex.



Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread H.J. Lu
On Mon, May 18, 2015 at 6:13 AM, Michael Matz m...@suse.de wrote:
 Hi,

 On Mon, 18 May 2015, H.J. Lu wrote:

 To avoid indirect branch to internal functions, I am proposing to add a
 new relocation, R_X86_64_RELAX_GOTPCREL, to x86-64 psABI:

 1. When branching to an external function, foo, compiler may generate
   call/jmp *foo@GOTRELAX(%rip)
which generates R_X86_64_RELAX_GOTPCREL relocation, instead of
   call/jmp foo[@PLT]
 2. When function foo is locally defined, linker converts
   call/jmp *foo@GOTRELAX(%rip)
to
   nop call/jmp foo

 For the jmp case the nop can also be added after it, to not even disturb

Yes, we should convert it to

nop call foo/jmp foo nop

I implemented it on users/hjl/relax branch in binutils git repo.

 the insn decoder.  For calls as well of course, but there it might be
 better to have it before the call.


I think a nop prefix is better on call.  We won't  mandate nop call foo
in psABI and linker is free to use either a nop prefix or a nop suffix.

Should we move forward with it?

Thanks.

-- 
H.J.


Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread Michael Matz
Hi,

On Mon, 18 May 2015, H.J. Lu wrote:

 To avoid indirect branch to internal functions, I am proposing to add a
 new relocation, R_X86_64_RELAX_GOTPCREL, to x86-64 psABI:
 
 1. When branching to an external function, foo, compiler may generate
   call/jmp *foo@GOTRELAX(%rip)
which generates R_X86_64_RELAX_GOTPCREL relocation, instead of
   call/jmp foo[@PLT]
 2. When function foo is locally defined, linker converts
   call/jmp *foo@GOTRELAX(%rip)
to
   nop call/jmp foo

For the jmp case the nop can also be added after it, to not even disturb 
the insn decoder.  For calls as well of course, but there it might be 
better to have it before the call.


Ciao,
Michael.


Re: [PING][PR65637] Fix ssa-handling code in expand_omp_for_static_chunk

2015-05-18 Thread Tom de Vries

On 18-05-15 14:53, Tom de Vries wrote:

On 15-04-15 15:10, Tom de Vries wrote:

Hi,

This patch series fixes PR65637.

Currently, ssa-handling code in expand_omp_for_static_chunk is dead and not
exercised by testing.

Ssa-handling code in omp-low.c is only triggered by pass_parallelize_loops, and
that pass doesn't specify a chunk size on the GIMPLE_OMP_FOR it constructs, so
that only exercises the expand_omp_for_static_nochunk path.

Using the attached trigger patch, we excercise the ssa-handling code in
expand_omp_for_static_chunk. The following patch series fixes the problems in
the ssa-handling code that we encounter.

1. Fix gcc_assert in expand_omp_for_static_chunk
2. Fix inner loop phi in expand_omp_for_static_chunk
3. Handle 2 preds for fin_bb in expand_omp_for_static_chunk

The patch series has been bootstrapped and reg-tested on x86_64 together with
attached trigger patch.

I'll post the patches from the patch series individually, in response to this
email.



Ping for the three patches.



Committed to gomp-4_0-branch.


Thanks,
- Tom


0001-Set-chunk_size-to-one-for-parloops-parallel.patch


Set chunk_size to one for parloops parallel

---
  gcc/tree-parloops.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 62a6444..862c420 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1719,6 +1719,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn,
tree data,
type = TREE_TYPE (cvar);
t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+  OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t) = integer_one_node;

for_stmt = gimple_build_omp_for (NULL, GF_OMP_FOR_KIND_FOR, t, 1, NULL);
gimple_set_location (for_stmt, loc);
-- 1.9.1







RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread H.J. Lu
To avoid indirect branch to internal functions, I am proposing to add a
new relocation, R_X86_64_RELAX_GOTPCREL, to x86-64 psABI:

1. When branching to an external function, foo, compiler may generate
call/jmp *foo@GOTRELAX(%rip)
   which generates R_X86_64_RELAX_GOTPCREL relocation, instead of
call/jmp foo[@PLT]
2. When function foo is locally defined, linker converts
call/jmp *foo@GOTRELAX(%rip)
   to
nop call/jmp foo
3. Otherwise, linker treats R_X86_64_RELAX_GOTPCREL the same way as
   R_X86_64_GOTPCREL.

For i386 psABI, we add R_386_RELAX_GOT32:

1. When branching to an external function, foo, in non-PIC mode,
   compiler may generate
call/jmp *foo@GOTRELAX
   which generates R_386_RELAX_GOT32 relocation, instead of
call/jmp foo
   and in PIC mode
call/jmp *foo@GOTRELAX(%reg)
   which generates R_386_RELAX_GOT32 relocation and REG holds the address
   of GOT, instead of
call/jmp foo@PLT
2. When function foo is locally defined, linker converts
call/jmp *foo@GOTRELAX[(%reg)]
   to
nop call/jmp foo
3. Otherwise, 
   a. In PIC mode, linker treats R_386_RELAX_GOT32 the same way as
  R_386_GOT32 and call/jmp *foo@GOTRELAX is unsupported.
   b. In no-PIC mode, linker computes its relocation value as relocation
  value of R_386_GOT32 plus the address of GOT and converts
call/jmp *foo@GOTRELAX(%reg)
  to
call/jmp *foo@GOTRELAX
  if needed.

This new relocation effectively turns off lazy binding on function, foo.

For i386, compiler is free to choose any register to hold the address of
GOT and there is no need to make EBX a fixed register when branching to
an external function in PIC mode.

With this new relocation, only a one-byte NOP prefix overhead is added
when function, foo, which compiler determines is external, turns out to
be local at link-time, because of -Bsymbolic or a definition in another
input object file which compiler has no knowledge of.

The new -fno-plt GCC option can use R_X86_64_RELAX_GOTPCREL and
R_386_RELAX_GOT32 relocations if linker supports them to avoid indirect
branch to internal functions.


H.J.


[PING][PR65637] Fix ssa-handling code in expand_omp_for_static_chunk

2015-05-18 Thread Tom de Vries

On 15-04-15 15:10, Tom de Vries wrote:

Hi,

This patch series fixes PR65637.

Currently, ssa-handling code in expand_omp_for_static_chunk is dead and not
exercised by testing.

Ssa-handling code in omp-low.c is only triggered by pass_parallelize_loops, and
that pass doesn't specify a chunk size on the GIMPLE_OMP_FOR it constructs, so
that only exercises the expand_omp_for_static_nochunk path.

Using the attached trigger patch, we excercise the ssa-handling code in
expand_omp_for_static_chunk. The following patch series fixes the problems in
the ssa-handling code that we encounter.

1. Fix gcc_assert in expand_omp_for_static_chunk
2. Fix inner loop phi in expand_omp_for_static_chunk
3. Handle 2 preds for fin_bb in expand_omp_for_static_chunk

The patch series has been bootstrapped and reg-tested on x86_64 together with
attached trigger patch.

I'll post the patches from the patch series individually, in response to this
email.



Ping for the three patches.

Thanks,
- Tom


0001-Set-chunk_size-to-one-for-parloops-parallel.patch


Set chunk_size to one for parloops parallel

---
  gcc/tree-parloops.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 62a6444..862c420 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1719,6 +1719,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, 
tree data,
type = TREE_TYPE (cvar);
t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+  OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t) = integer_one_node;

for_stmt = gimple_build_omp_for (NULL, GF_OMP_FOR_KIND_FOR, t, 1, NULL);
gimple_set_location (for_stmt, loc);
-- 1.9.1





[PATCH 01/13] recog: Increased max number of alternatives - v2

2015-05-18 Thread Andreas Krebbel
The new version also changes the type for the alternative_mask to unsigned 
HOST_WIDE_INT.

Bootstrapped without regressions on x86-64.

Ok to apply?

Bye,

-Andreas-

gcc/
* recog.h: Increase MAX_RECOG_ALTERNATIVES.
---
 gcc/recog.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 463c748..07700bd 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
bit giving an invalid value that can be used to mean uninitialized.  */
-#define MAX_RECOG_ALTERNATIVES 30
-typedef unsigned int alternative_mask;
+#define MAX_RECOG_ALTERNATIVES 35
+typedef unsigned HOST_WIDE_INT alternative_mask;

 /* A mask of all alternatives.  */
 #define ALL_ALTERNATIVES ((alternative_mask) -1)



Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-05-18 Thread Ramana Radhakrishnan
On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès l...@gnu.org wrote:
 As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65711.

 Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)


OK for trunk. This is also ok for all release branches if no
objections in 24 hours.

Sorry about the delayed review.

regards
Ramana

 Tested on armhf-linux-gnu (armv7.)

 gcc/
 2015-04-23  Ludovic Courtès  l...@gnu.org

 PR 65711
 * config/arm/linux-elf.h (LINUX_TARGET_LINK_SPEC): Move
 '-dynamic-linker' within %{!shared: ...}.



Re: [PATCH diagnostics/fortran] Handle two locations for the same diagnostic. Convert all gfc_warning_1 and gfc_notify_std_1 calls

2015-05-18 Thread Dodji Seketeli
Manuel López-Ibáñez lopeziba...@gmail.com writes:

 On 15 May 2015 at 10:39, Dodji Seketeli do...@redhat.com wrote:
 Manuel López-Ibáñez lopeziba...@gmail.com writes:
 -/* Expand the location of this diagnostic. Use this function for 
 consistency. */
 +/* Return the location associated to this diagnostic. WHICH specifies

 Here, I think only the 'W' (in WHICH) should be uppercase.

 I'm following the convention that parameter names are uppercase in the
 description of functions.

Oh, okay then.  My bad.  Sorry.

  /* The type of a text to be formatted according a format specification
 along with a list of things.  */
  struct text_info
  {
 +public:

 As this is a struct, the 'public' here is not necessary, as the members
 are public by default.

 I have a very poor memory for such details ;), since we are using
 'private:' already, does it really hurt to be explicit and use
 'public:' here?

It doesn't hurt, per se.  But I think it's a very common style to avoid
specifying the 'public' in this case, so I'd rather just remove it, yes.

[...]

OK to commit with this change then.

Cheers,

-- 
Dodji


Re: [PATCH] [AArch32] Additional bics patterns.

2015-05-18 Thread Kyrill Tkachov

Hi Alex,

On 15/05/15 18:53, Alex Velenko wrote:

On 01/05/15 10:28, Kyrill Tkachov wrote:

Can you please confirm that bootstraps with both arm and thumb pass?
That is, configured with --with-mode=arm and --with-mode=thumb


Hi Kyrill,

Bootstrapped on arm-none-gnueabihf with arm and thumb mode.

Following patch requires bics shift operand on Thumb2 to be
const int, as bics shifted by register is not supported by
Thumb2.

Is patch ok?

gcc

2015-05-15  Alex Velenko  alex.vele...@arm.com

 * config/arm/arm.md (andsi_not_shiftsi_si_scc): New pattern.
 * (andsi_not_shiftsi_si_scc_no_reuse): New pattern.


No need for the '*', just:

* config/arm/arm.md (andsi_not_shiftsi_si_scc): New pattern.
(andsi_not_shiftsi_si_scc_no_reuse): Likewise.


will do.
Ok for trunk with that ChangeLog nit.

Thanks,
Kyrill



gcc/testsuite

2015-05-15  Alex Velenko alex.vele...@arm.com

 * gcc.target/arm/bics_1.c : New testcase.
 * gcc.target/arm/bics_2.c : New testcase.
 * gcc.target/arm/bics_3.c : New testcase.
 * gcc.target/arm/bics_4.c : New testcase.
---
gcc/config/arm/arm.md | 49 ++
gcc/testsuite/gcc.target/arm/bics_1.c | 54
+
gcc/testsuite/gcc.target/arm/bics_2.c | 57
+++
gcc/testsuite/gcc.target/arm/bics_3.c | 41 +
gcc/testsuite/gcc.target/arm/bics_4.c | 49 ++
5 files changed, 250 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/arm/bics_1.c
create mode 100644 gcc/testsuite/gcc.target/arm/bics_2.c
create mode 100644 gcc/testsuite/gcc.target/arm/bics_3.c
create mode 100644 gcc/testsuite/gcc.target/arm/bics_4.c

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..26d3ad2 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2768,6 +2768,55 @@
(const_string logic_shift_reg)))]
)

+;; Shifted bics pattern used to set up CC status register and not reusing
+;; bics output.  Pattern restricts Thumb2 shift operand as bics for Thumb2
+;; does not support shift by register.
+(define_insn andsi_not_shiftsi_si_scc_no_reuse
+  [(set (reg:CC_NOOV CC_REGNUM)
+   (compare:CC_NOOV
+   (and:SI (not:SI (match_operator:SI 0 shift_operator
+   [(match_operand:SI 1 s_register_operand r)
+(match_operand:SI 2 arm_rhs_operand rM)]))
+   (match_operand:SI 3 s_register_operand r))
+   (const_int 0)))
+   (clobber (match_scratch:SI 4 =3Dr))]
+  TARGET_ARM || (TARGET_THUMB2  CONST_INT_P (operands[2]))
+  bic%.%?\\t%4, %3, %1%S0
+  [(set_attr predicable yes)
+   (set_attr predicable_short_it no)
+   (set_attr conds set)
+   (set_attr shift 1)
+   (set (attr type) (if_then_else (match_operand 2
const_int_operand )
+ (const_string logic_shift_imm)
+ (const_string logic_shift_reg)))]
+)
+
+;; Same as andsi_not_shiftsi_si_scc_no_reuse, but the bics result is also
+;; getting reused later.
+(define_insn andsi_not_shiftsi_si_scc
+  [(parallel [(set (reg:CC_NOOV CC_REGNUM)
+   (compare:CC_NOOV
+   (and:SI (not:SI (match_operator:SI 0 shift_operator
+   [(match_operand:SI 1 s_register_operand r)
+(match_operand:SI 2 arm_rhs_operand rM)]))
+   (match_operand:SI 3 s_register_operand r))
+   (const_int 0)))
+   (set (match_operand:SI 4 s_register_operand =3Dr)
+(and:SI (not:SI (match_op_dup 0
+[(match_dup 1)
+ (match_dup 2)]))
+(match_dup 3)))])]
+  TARGET_ARM || (TARGET_THUMB2  CONST_INT_P (operands[2]))
+  bic%.%?\\t%4, %3, %1%S0
+  [(set_attr predicable yes)
+   (set_attr predicable_short_it no)
+   (set_attr conds set)
+   (set_attr shift 1)
+   (set (attr type) (if_then_else (match_operand 2
const_int_operand )
+ (const_string logic_shift_imm)
+ (const_string logic_shift_reg)))]
+)
+
(define_insn *andsi_notsi_si_compare0
  [(set (reg:CC_NOOV CC_REGNUM)
  (compare:CC_NOOV
diff --git a/gcc/testsuite/gcc.target/arm/bics_1.c
b/gcc/testsuite/gcc.target/arm/bics_1.c
new file mode 100644
index 000..173eb89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/bics_1.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options -O2 --save-temps -fno-inline } */
+/* { dg-require-effective-target arm32 } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d =3D a  ~b;
+
+  /* { dg-final { scan-assembler-times bics\tr\[0-9\]+, r\[0-9\]+,
r\[0-9\]+ 2 } } */
+  if (d =3D=3D 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  int d =3D a  ~(b  3);
+
+  /* { dg-final { scan-assembler-times bics\tr\[0-9\]+, r\[0-9\]+,
r\[0-9\]+, .sl \#3 1 } } 

RE: [PATCH] [gcc, combine] Backport to GCC 5.0 branch PR46164: Don't combine the insns if a volatile register is contained.

2015-05-18 Thread Hale Wang

 -Original Message-
 From: Segher Boessenkool [mailto:seg...@kernel.crashing.org]
 Sent: Thursday, May 14, 2015 9:29 PM
 To: Hale Wang
 Cc: l...@redhat.com; GCC Patches; Richard Sandiford; 'Terry Guo'
 Subject: Re: [PATCH] [gcc, combine] Backport to GCC 5.0 branch PR46164:
 Don't combine the insns if a volatile register is contained.
 
 On Thu, May 14, 2015 at 01:56:54PM +0800, Hale Wang wrote:
gcc/ChangeLog:
2015-04-22 Hale Wang hale.w...@arm.com
Terry Guo  terry@arm.com
   
   PR rtl-optimization/64818
   * combine.c (can_combine_p): Don't combine user-specified
register
   if
   it is in an asm input.
   
gcc/testsuite/ChangeLog:
2015-04-22 Hale Wang hale.w...@arm.com
Terry Guo  terry@arm.com
   
   PR rtl-optimization/64818
   * gcc.target/arm/pr64818.c: New.
 
  This patch applies cleanly on GCC 5.0 branch. Bootstrap and regression
test
 are OK for X86_64.
 
  Can we backport this patch to GCC 5.0 branch?
 
 It should be perfectly safe, and it's a pretty nasty bug.  But it is
technically not
 a regression (or is it?), so I'll defer to the release managers.
 

Yes, I agree it is not a regression.
 
 Segher





Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value to output_constant_pool_2

2015-05-18 Thread Richard Biener
On Fri, 15 May 2015, Jeff Law wrote:

 On 05/15/2015 04:37 AM, Dharmakan Rohit Arul Raj wrote:
  
   -Original Message-
   From: Jeff Law [mailto:l...@redhat.com]
   Sent: Friday, May 15, 2015 10:30 AM
Just to summarize: By default in GCC v4.7.x, all the constants are put
into '.rodata.str1.4' section. In GCC v4.8.x from r192719 onwards, one
of the move instruction of the string constant .LC0 is getting
spilled. The reload pass, for any constants that aren't allowed and
can't be reloaded in to registers tries to change them into memory
references. Then while emitting that string constant to asm code
(A:varasm.c: output_constant_pool_1), it explicitly passes the
alignment as 1 which prevents the generation of fix-up table entries
in  'B: rs6000.c:rs6000_assemble_integer' because the data is
considered unaligned now.

The bug seems to have gone latent with an unrelated trunk commit
r204695 [* tree-ssa-loop-ivopts.c (force_expr_to_var_cost): Refactor
the code. Handle type conversion.]. This commit chooses different
spill candidates hence all the string constants are being put in to
'.rodata.str1.4´section.

The check I had in the test case is that if there is a
'.data.rel.ro.local', then there should be '.fixup' section generated.

Please let me know if you need any other details.
   Thanks.  Even though I wasn't able to trigger the bug with the testcase
   from
   65018, I went ahead and committed this patch to the trunk.  It can't hurt
   and
   it's the right thing to do.
   
   Thanks for your patience,
   
  
  Jeff, Thanks for checking-in the changes.
  Can we apply the patch to GCC v4.8 and GCC v4.9 branch as well?
 That's up to the branch maintainers.  I'd think it's safe, but it's ultimately
 their call.

If it's a regression or a wrong-code issue then it's ok to backport given
the patch is safe, of course.

Richard.

 jeff
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)

Re: Check canonical types in verify_type

2015-05-18 Thread Richard Biener
On Mon, 18 May 2015, Jan Hubicka wrote:

 Hi,
 this patch adds basic checking of TYPE_CANOINCAL. It checkes TYPE_CANONICAL
 forms a tree and it moves lto's gimple_canonical_types_compatible_p back to
 middle-end and uses it to check that TYPE_CANONICAL is compatible and thus 
 none
 of the FEs produce types sharing TYPE_CANONICAL that would be considered
 different by LTO's TBAA.
 
 I added trust_type_canonical argument and changed:
 
 -  /* If the types have been previously registered and found equal
 - they still are.  */
 -  if (TYPE_CANONICAL (t1)
 -   TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2))
 -return true;
 
 to:
 
 +  /* If the types have been previously registered and found equal
 + they still are.  */
 +  if (TYPE_CANONICAL (t1)  TYPE_CANONICAL (t2)
 +   trust_type_canonical)
 +return TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2);
 
 (notice that I blocked the recursion when TYPE_CANONICAL is known to be
 different).
 
 The patch originally triggered an ICE in testsuite becuase C++ FE builds
 FUNCTION_TYPE type variant with different attributes but same TYPE_CANONICAL.
 I think C++ FE is correct and we may want to drop:
 
 +  if (!comp_type_attributes (t1, t2))
 + return false;
 
 Because I think the TBAA does not care about attribute lists.  I suppose this
 is kind-of harmless because of:

I think that was about attributes like 'packed', but those should have
their effects applied at stor-layout time.  So yes, I think this can be
dropped.

 +  /* For canonical type comparisons we do not want to build SCCs
 +  so we cannot compare pointed-to types.  But we can, for now,
 +  require the same pointed-to type kind and match what
 +  useless_type_conversion_p would do.  */
 +  if (POINTER_TYPE_P (t1))
 + {
 +   if (TYPE_ADDR_SPACE (TREE_TYPE (t1))
 +   != TYPE_ADDR_SPACE (TREE_TYPE (t2)))
 + return false;
 +
 +   if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2)))
 + return false;
 + }
 
 Is the reason for not making differences between differnt pointer types
 really just lazyness to not deal with SCCs?  For that it is easy to add
 set of visited type pairs, just like odr_types_equivalent_p does.

Well - TBAA doesn't care about pointed-to types.  See get_alias_set

  else if (POINTER_TYPE_P (t)
t != ptr_type_node)
set = get_alias_set (ptr_type_node);

so the above would at most matter for pointer-type fields of aggregates.
And no, we don't want to deal with SCCs - I doubt it would bring any 
benefit and the issue here is that I am worried about SCCs being present
dependent on the order of streaming (well, the code pre-dates SCC
streaming).

 Because we now stream in SCC order anyway, most of time we won't even
 need to populate it.  Shall I implement this?

You can certainly try - but I think for cross-language interoperability
we want to treat

  struct X { void *p; }
  struct Y { void *p; }

and

  struct X { Y *p; }
  struct Y { X *q; } 

as compatible so I don't think we can use SCCs to partition alias sets
further (that is, we _do_ have to treat pointers conservatively).  The
current code is as far as we can get IMHO (well, even the current code
is too strict in making struct { void *p } and struct { int *p } not
have the same alias-set.

 Other issue I run into is that for Ada bootstrap we have variadic type whose
 canonical types are having different temporary set as size.  I think this is
 valid and perhaps gimple_canonical_types_compatible_p should consider
 variadic arrays to be compatible with any array of compatible type?

Those are all local types and thus the strict equality compare should be
ok?  Not sure if we can do in C sth like

void foo (int n)
{
  struct { int i; int a[n]; } a;
  struct { int i; int a[n]; } b;
}

and if we'd have to assign the same alias-sets to 'a' and 'b'.

 I am not quite convinced we get variadic types right at LTO time, because
 they bypass canonical type calculation anyway and their canonical type
 is set by TYPE_MAIN_VARIANT in lto_read_body_or_constructor which I think
 is not safe.  I will look for a testcase.

That is because if they are streamed locally they do not enter type
merging, but they still go via gimple_register_canonical_type, so I'm not
sure where you see they always get their main variant as canonical type.

 I also added check that TYPE_CANONICAL agrees for type variants.  I think it
 should because few times in the middle-end uses TYPE_MAIN_VARIANT and seem to
 expect this to happen:

Yes.  Otherwise we'd run in circles for code that does

 type = TYPE_CANONICAL (TYPE_MAIN_VARIANT (type));

because the result might not be a main variant.  That's something to
verify (both TYPE_CANONICAL and TYPE_MAIN_VARIANT should form a tree). 

 static inline int 
   
 same_type_for_tbaa (tree type1, tree type2)   
   
 {   

Re: [PATCH, 5/5] check_GNU_style.sh: Fix tab size in 80 characters check

2015-05-18 Thread Tom de Vries

On 12-05-15 17:16, Jeff Law wrote:

[PATCH 5/5] check_GNU_style.sh: Fix tab size in 80 characters check

2015-05-11  Tom de Vriest...@codesourcery.com

* check_GNU_style.sh (col): Fix tab size.

OK.


Hi Jeff,

I.

I noticed a performance degradation due to this patch:
...
$ cat gcc/tree-ssa-tail-merge.c | awk '{printf +%s\n, $0}' | time -p 
./contrib/check_GNU_style.sh -

  ...
real 4.10
user 0.71
sys 6.77
...

Before this patch, the 'real' time was roughly a factor 80 smaller:
...
real 0.05
user 0.02
sys 0.03
...

This degradation is due to the fact that the patch does the 80 chars check 
line-by-line, and invokes processes for each new line.



II.

Attached follow-up patch rewrites the 80 chars check to handle a file at a time 
rather than a line at a time, and gets performance back to normal:

...
real 0.07
user 0.03
sys 0.05
...

As a bonus, the bit longer than 80 chars is now printed in red, similar to how 
the other checks show the output.


OK for trunk?

Thanks,
- Tom
check_GNU_style.sh: Don't do 80 char check line by line

2015-05-18  Tom de Vries  t...@codesourcery.com

	* check_GNU_style.sh: Add temp files tmp2 and tmp3.
	(cat_with_prefix): New function, using global variable prefix.
	(col): Make prefix a global variable. Rewrite to process file at a time
	rather than line at a time.  Print part longer than 80 chars in red.
---
 contrib/check_GNU_style.sh | 70 ++
 1 file changed, 52 insertions(+), 18 deletions(-)

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index ab59b1e..033a2c9 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -65,10 +65,12 @@ fi
 
 inp=check_GNU_style.inp
 tmp=check_GNU_style.tmp
+tmp2=check_GNU_style.2.tmp
+tmp3=check_GNU_style.3.tmp
 
 # Remove $tmp on exit and various signals.
-trap rm -f $inp $tmp $stdin_tmp 0
-trap rm -f $inp $tmp $stdin_tmp; exit 1 1 2 3 5 9 13 15
+trap rm -f $inp $tmp $tmp2 $tmp3 $stdin_tmp 0
+trap rm -f $inp $tmp $tmp2 $tmp3 $stdin_tmp; exit 1 1 2 3 5 9 13 15
 
 if [ $nfiles -eq 1 ]; then
 # There's no need for the file prefix if we're dealing only with one file.
@@ -80,6 +82,17 @@ grep $format '^+' $files \
 | grep -v ':+++' \
  $inp
 
+cat_with_prefix ()
+{
+local f=$1
+
+if [ $prefix =  ]; then
+	cat $f
+else
+	awk {printf %s%s\n, $prefix, \$0} $f
+fi
+}
+
 # Grep
 g (){
 local msg=$1
@@ -134,10 +147,11 @@ vg (){
 
 col (){
 local msg=$1
+
 local first=true
 local f
 for f in $files; do
-	local prefix=
+	prefix=
 	if [ $nfiles -ne 1 ]; then
 	prefix=$f:
 	fi
@@ -148,22 +162,42 @@ col (){
 	| grep -v ':+++' \
 	 $tmp
 
-	cat $tmp | while IFS= read -r line; do
-	local longline
-	# Filter out the line number prefix and the patch line modifier '+'
-	# to obtain the bare line, before we use expand.
-	longline=$(echo $line \
-		| sed 's/^[0-9]*:+//' \
-		| expand \
-		| awk '{ if (length($0)  80) print $0}')
-	if [ $longline !=  ]; then
-		if $first; then
-		printf \n$msg\n
-		first=false
-		fi
-		echo $prefix$line
+	# Keep only line number prefix and patch modifier '+'.
+	cat $tmp \
+	| sed 's/\(^[0-9][0-9]*:+\).*/\1/' \
+	 $tmp2
+
+	# Remove line number prefix and patch modifier '+'.
+	# Expand tabs to spaces according to tab positions.
+	# Keep long lines, make short lines empty.  Print the part past 80 chars
+	# in red.
+	cat $tmp \
+	| sed 's/^[0-9]*:+//' \
+	| expand \
+	| awk '{ \
+		 if (length($0)  80) \
+		   printf %s\033[1;31m%s\033[0m\n, \
+			  substr($0,1,80), \
+			  substr($0,81); \
+		 else \
+		   print  \
+		   }' \
+	 $tmp3
+
+	# Combine prefix back with long lines.
+	# Filter out empty lines.
+	local found=false
+	paste -d '' $tmp2 $tmp3 \
+	| grep -v '^[0-9][0-9]*:+$' \
+	 $tmp  found=true
+
+	if $found; then
+	if $first; then
+		printf \n$msg\n
+		first=false
 	fi
-	done
+	cat_with_prefix $tmp
+	fi
 done
 }
 
-- 
1.9.1



RE: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64

2015-05-18 Thread David Sherwood
Hi Jeff,

Thanks for the suggestion. I did a bootstrap x86_64 build before and after my
patch and looked for differences in the last stage object files and there were
plenty of them. I chose a nice simple function (check_callers) from
ipa-inline-analysis.c and reduced it to a small test case. Hope this is ok.

Testing done:

 *  aarch64 built, make check no regressions
 *  aarch64_be built, make check no regressions
 *  x86_64 built, make check no regressions

ChangeLog:
2015-05-15  David Sherwood  david.sherw...@arm.com

* loop-invariant.c (create_new_invariant): Don't calculate address cost
if mode is not a scalar integer.
 (get_inv_cost): Increase computational cost for unused invariants.
* testsuite/gcc.dg/loop-invariant.c: New testcase.

Regards,
David.

 -Original Message-
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: 11 May 2015 20:27
 To: David Sherwood; gcc-patches@gcc.gnu.org
 Subject: Re: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop 
 invariant motion
discovered by
 spec2k6 on aarch64
 
 On 05/11/2015 06:02 AM, David Sherwood wrote:
  Hi,
 
  This patch fixes a couple of issues I saw during the compilation of 
  shell_lam.f
  for 410.bwaves test in spec2006:
 
  * create_new_invariant: We shouldn't bother attempting to calculate the 
  address
  cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P 
  check
  to eliminate the most obvious cases, such as vector modes and so on.
  * get_inv_cost: Change the code so that we don't treat something as an 
  address
  if it is never actually used as an address, i.e. we didn't use it at all. 
  This
  occurs due to the way we process the innermost loop first - invariants get
  pulled out of inner most loop and then get processed during next innermost 
  loop.
  Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
  shows what currently happens:
 
  4427 *ending processing of loop 27 **
 
  4442 (const_vector:V2DF [
  4443 (const_double:DF 4.0e+0 [0x0.8p+3])
   (const_double:DF 4.0e+0 [0x0.8p+3])
  4445 ])
  4446
  4447 Hot cost: 8 (final)
  4448 (set (reg:V2DF 3179)
  4449 (const_vector:V2DF [
  4450 (const_double:DF 4.0e+0 [0x0.8p+3])
  4451 (const_double:DF 4.0e+0 [0x0.8p+3])
  4452 ]))
  4453
  4454 Hot cost: 8 (final)
  4455 Set in insn 2764 is invariant (1), cost 8, depends on
 
  4471 Decided to move invariant 1 -- gain 8
 
  move_invariant_reg moves invariant (the const_vector above) into 3490 and
  replaces use of 3179 with 3490, but conservatively keeps the definition of 
  3179,
  which gets processed in outer loop, but is never used.
 
  Debug for outer loop now follows:
 
  4497 *starting processing of loop 26 **
 
  6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
  6380 (const_vector:V2DF [
  6381 (const_double:DF 4.0e+0 [0x0.8p+3])
  6382 (const_double:DF 4.0e+0 [0x0.8p+3])
  6383 ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
  6384  (nil))
  6385 ;;   UD chains for insn luid 88 uid 2766
 
  6604 ;;   UD chains for insn luid 21 uid 3836
  6605 ;;  reg 3490 { d419(bb 110 insn 2764) }
  6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
  6607 (reg:V2DF 3490)) shell_lam.f:287 -1
  6608  (nil))
  ^^^ additional insn generated by move_invariant_reg from loop 27
 
  6836 ;;   UD chains for insn luid 40 uid 2790
  6837 ;;  reg 3161 { d272(bb 111 insn 2746) }
  6838 ;;  reg 3201 { d305(bb 111 insn 2788) }
  6839 ;;  reg 3490 { d419(bb 110 insn 2764) }
  6840 ;;   eq_note reg 3161 { d272(bb 111 insn 2746) }
  6841 ;;   eq_note reg 3201 { d305(bb 111 insn 2788) }
  6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
  6843 (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, 
  index: ivtmp.679_3214,
  offset: 0B] ])
  6844 (neg:V2DF (reg:V2DF 3490))
  6845 (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 
  1219 {fnmav2df4}
  6846  (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
  6847 (expr_list:REG_DEAD (reg:V2DF 3179) ...
  ^^^ 3179 marked as DEAD
 
  8366 *ending processing of loop 26 **
 
  8472 (const_vector:V2DF [
  8473 (const_double:DF 4.0e+0 [0x0.8p+3])
  8474 (const_double:DF 4.0e+0 [0x0.8p+3])
  8475 ])
  8476
  8477 Hot cost: 8 (final)
  8478 (set (reg:V2DF 3490)
  8479 (const_vector:V2DF [
  8480 (const_double:DF 4.0e+0 [0x0.8p+3])
  8481 (const_double:DF 4.0e+0 [0x0.8p+3])
  8482 ]))
  8483
  8484 Hot cost: 8 (final)
  8485 Set in insn 2764 is invariant (12), cost 8, depends on
 
  8505 (set (reg:V2DF 3179)
 

Re: check_GNU_style.sh: Don't cat empty file

2015-05-18 Thread Andreas Schwab
Tom de Vries tom_devr...@mentor.com writes:

 diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
 index 728c11a..ab59b1e 100755
 --- a/contrib/check_GNU_style.sh
 +++ b/contrib/check_GNU_style.sh
 @@ -84,10 +84,16 @@ grep $format '^+' $files \
  g (){
  local msg=$1
  local arg=$2
 +
 +local found=false
  cat $inp \
   | egrep --color=always -- $arg \
 -  $tmp  printf \n$msg\n
 -cat $tmp
 +  $tmp  found=true
 +
 +if $found; then

   cat $inp \
| egrep --color=always -- $arg \
 $tmp  {
  printf \n$msg\n
  cat $tmp
}

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: [match-and-simplify] fix incorrect code-gen in 'for' pattern

2015-05-18 Thread Richard Biener
On Sat, 16 May 2015, Prathamesh Kulkarni wrote:

 Hi,
 genmatch generates incorrect code for following (artificial) pattern:
 
 (for op (plus)
   op2 (op)
   (simplify
 (op @x @y)
 (op2 @x @y)
 
 generated gimple code: http://pastebin.com/h1uau9qB
 'op' is not replaced in the generated code on line 33:
 *res_code = op;
 
 I think it would be a better idea to make op2 iterate over same set
 of operators (op2-substitutes = op-substitutes).
 I have attached patch for the same.
 Bootstrap + testing in progress on x86_64-unknown-linux-gnu.
 OK for trunk after bootstrap+testing completes ?

Hmm, but then the example could as well just use 'op'.  I think we
should instead reject this.

Consider

  (for op (plus minus)
(for op2 (op)
  (simplify ...

where it is not clear what would be desired.  Simple replacement
of 'op's value would again just mean you could have used 'op' in
the first place.  Doing what you propose would get you

  (for op (plus minus)
(for op2 (plus minus)
  (simplify ...

thus a different iteration.

 I wonder if we really need is_oper_list flag in user_id ?
 We can determine if user_id is an operator list
 if user_id::substitutes is not empty ?

After your change yes.

 That will lose the ability to distinguish between user-defined operator
 list and list-iterator in for like op/op2, but I suppose we (so far) don't
 need to distinguish between them ?

Well, your change would simply make each list-iterator a (temporary)
user-defined operator list as well as the current iterator element
(dependent on context - see the nested for example).  I think that
adds to confusion.

So - can you instead reject this use?

Thanks,
Richard.


[patch, gcc 5 regression] re-enable biarch for powerpc-linux-gnu

2015-05-18 Thread Sandra Loosemore
We've found that configuring a powerpc-linux-gnu cross toolchain with 
--enable-targets=all no longer enables -m64 support in GCC 5, due to the 
patch for PR target/65286.  (We want to build with a -m64 multilib, in 
particular.)


The attached patch seems to fix the breakage, but I'm not sure that it 
might not break some other configuration.  If this isn't the right fix, 
can one of the target experts suggest a better one?


Here's a link to the discussion of the patch that caused the breakage:

https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00321.html

-Sandra

2015-05-18  Sandra Loosemore  san...@codesourcery.com

	gcc/
	* config.gcc [powerpc*-*-linux*]: Allow --enable-targets=all
	to build a biarch toolchain again.
Index: gcc/config.gcc
===
--- gcc/config.gcc	(revision 449971)
+++ gcc/config.gcc	(working copy)
@@ -2376,6 +2379,7 @@ powerpc*-*-linux*)
 	maybe_biarch=${cpu_is_64bit}
 	case ${enable_targets} in
 	*powerpc64*) maybe_biarch=yes ;;
+	all) maybe_biarch=yes ;;
 	esac
 	case ${target} in
 	powerpc64*-*-linux*spe* | powerpc64*-*-linux*paired*)


C++ PATCH to print_tree TRAIT_EXPR

2015-05-18 Thread Jason Merrill
I noticed that debug_tree wasn't printing the types or code of a 
TRAIT_EXPR, so this fixes that.  I'm not bothering to turn the kind into 
an enum name at the moment.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 6baa86e88580ee3e4f5a322fc530427169da2227
Author: Jason Merrill ja...@redhat.com
Date:   Fri May 15 15:27:51 2015 -0400

	* ptree.c (cxx_print_xnode): Handle TRAIT_EXPR.

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 2d0b584..fd71bb4 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -271,6 +271,13 @@ cxx_print_xnode (FILE *file, tree node, int indent)
   print_node (file, pattern, DEFERRED_NOEXCEPT_PATTERN (node), indent+4);
   print_node (file, args, DEFERRED_NOEXCEPT_ARGS (node), indent+4);
   break;
+case TRAIT_EXPR:
+  indent_to (file, indent+4);
+  fprintf (file, kind %d, TRAIT_EXPR_KIND (node));
+  print_node (file, type 1, TRAIT_EXPR_TYPE1 (node), indent+4);
+  if (TRAIT_EXPR_TYPE2 (node))
+	print_node (file, type 2, TRAIT_EXPR_TYPE2 (node), indent+4);
+  break;
 case LAMBDA_EXPR:
   cxx_print_lambda_node (file, node, indent);
   break;


[PATCH] Refactoring: use std::swap instead of manual swaps

2015-05-18 Thread Mikhail Maltsev
Hi all.
The attached patch replaces the following typical sequence of
assignments, matching the pattern:

[type] temp_var = value1;
value1 = value2;
value2 = temp_var;

with

std::swap (value1, value2);

The goal is to improve code readability (at least a little).

Pattern was searched in GCC tree using a script, but all modifications
have been done manually to make sure that removing assignment to
temp_var does not modify the behavior of the code. Where possible,
temporary variables were removed or moved to blocks with narrower scope.
I also removed a couple of pre-C++ implementations of swap function and
one implementation of std::reverse.

Bootstrapped/regtested on x86_64 linux (multilib). Test for broken
builds currently running. I will also run test on a couple of
simulators (though currently there is a problem with build:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66181).

OK for trunk after complete test?

-- 
Regards,
Mikhail Maltsev
gcc/c/ChangeLog:

2015-05-18  Mikhail Maltsev  malts...@gmail.com

	* c-typeck.c (build_array_ref): Use std::swap instead of explicit
	swaps

gcc/ChangeLog:

2015-05-18  Mikhail Maltsev  malts...@gmail.com

	* bb-reorder.c (fix_up_fall_thru_edges): Use std::swap instead of
	explicit swaps.
	* dojump.c (do_compare_rtx_and_jump): Likewise.
	* expmed.c (emit_store_flag_1): Likewise.
	* fibonacci_heap.h (fibonacci_heap::union_with): Likewise.
	* final.c (sprint_ul): Use std::reverse for reversing a string.
	* fold-const.c (extract_muldiv_1): Use std::swap.
	* genmodes.c (emit_mode_int_n): Likewise.
	* ifcvt.c (dead_or_predicable): Likewise.
	* ira-build.c (ira_merge_live_ranges): Likewise.
	(swap_allocno_copy_ends_if_necessary): Likewise.
	* ira.c (ira_setup_alts): Likewise.
	* loop-iv.c (iv_analyze_expr): Likewise.
	(implies_p): Likewise.
	(canon_condition): Likewise.
	* lra-constraints.c (swap_operands): Likewise.
	* lra-lives.c (lra_merge_live_ranges): Likewise.
	* omega.c (swap): Remove.
	(bswap): Remove.
	(omega_unprotect_1): Use std::swap.
	(omega_solve_geq): Likewise.
	* optabs.c (expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	(emit_conditional_move): Likewise.
	(emit_conditional_add): Likewise.
	* postreload.c (reload_cse_simplify_operands): Likewise.
	* reg-stack.c (emit_swap_insn): Likewise.
	(swap_to_top): Likewise.
	(compare_for_stack_reg): Likewise.
	(subst_asm_stack_regs): Likewise.
	* reload.c (find_reloads): Likewise.
	* reload1.c (gen_reload_chain_without_interm_reg_p): Likewise.
	* sel-sched.c (invoke_reorder_hooks): Likewise.
	(create_block_for_bookkeeping): Likewise.
	* tree-data-ref.c (lambda_matrix_row_exchange): Remove.
	(lambda_matrix_right_hermite): Use std::swap.
	* tree-ssa-coalesce.c (sort_coalesce_list): Likewise.
	* tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
	* tree-ssa-loop-ivopts.c (iv_ca_delta_reverse): Likewise.
	* tree-ssa-math-opts.c (is_widening_mult_p): Likewise.
	* tree-ssa-phiopt.c (hoist_adjacent_loads): Likewise.
	* tree-ssa-reassoc.c (linearize_expr_tree): Likewise.
	* tree-ssa-threadedge.c (simplify_control_stmt_condition): Likewise.
	* tree-vrp.c (compare_ranges): Likewise.
	* var-tracking.c (add_with_sets): Likewise.
	(vt_find_locations): Likewise.


gcc/cp/ChangeLog:

2015-05-18  Mikhail Maltsev  malts...@gmail.com

	* typeck.c (composite_pointer_type): Use std::swap instead of explicit
	swaps.

gcc/fortran/ChangeLog:

2015-05-18  Mikhail Maltsev  malts...@gmail.com

	* interface.c (compare_actual_formal): Use std::swap instead of
	explicit swaps.
	* trans-array.c (gfc_trans_scalarized_loop_end): Likewise.
	* trans-intrinsic.c (walk_inline_intrinsic_transpose): Likewise.

gcc/c-family/ChangeLog:

2015-05-18  Mikhail Maltsev  malts...@gmail.com

	* c-common.c (shorten_compare): Use std::swap instead of explicit swaps.

diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index c134712..a97c2cd 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -1812,7 +1812,6 @@ fix_up_fall_thru_edges (void)
   edge succ2;
   edge fall_thru;
   edge cond_jump = NULL;
-  edge e;
   bool cond_jump_crosses;
   int invert_worked;
   rtx_insn *old_jump;
@@ -1901,9 +1900,7 @@ fix_up_fall_thru_edges (void)
 			  fall_thru-flags = ~EDGE_FALLTHRU;
 			  cond_jump-flags |= EDGE_FALLTHRU;
 			  update_br_prob_note (cur_bb);
-			  e = fall_thru;
-			  fall_thru = cond_jump;
-			  cond_jump = e;
+			  std::swap (fall_thru, cond_jump);
 			  cond_jump-flags |= EDGE_CROSSING;
 			  fall_thru-flags = ~EDGE_CROSSING;
 			}
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 8c7fdd2..ae5564e 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4372,20 +4372,12 @@ shorten_compare (location_t loc, tree *op0_ptr, tree *op1_ptr,
!integer_zerop (primop1)  !real_zerop (primop1)
!fixed_zerop (primop1))
 {
-  tree tem = primop0;
-  int temi = unsignedp0;
-  primop0 = primop1;
-  primop1 = tem;
-  tem = op0;
-  op0 = op1;
-  op1 = tem;
+  std::swap 

Re: [PATCH 02/13] optabs: Fix vec_perm - V16QI middle end lowering.

2015-05-18 Thread Richard Henderson
On 05/11/2015 06:23 AM, Andreas Krebbel wrote:
 @@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, 
 rtx sel, rtx target)
  {
/* Multiply each element by its byte size.  */
machine_mode selmode = GET_MODE (sel);
 +  /* We cannot re-use SEL as a temp operand since it might by in
 +  read-only storage.  */
 +  rtx sel_reg = gen_reg_rtx (selmode);
 +
if (u == 2)
 - sel = expand_simple_binop (selmode, PLUS, sel, sel,
 -sel, 0, OPTAB_DIRECT);
 + sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
 +sel_reg, 0, OPTAB_DIRECT);
else

You needn't allocate sel_reg explicitly; expand_simple_binop will do that for
you if the TARGET parameter is NULL.

Thus this patch should be an 8 character change on those two calls.


r~


C++ PATCH for core DR 1391

2015-05-18 Thread Jason Merrill
The proposed resolution of core issue 1391 is problematic, as I 
discovered when trying to implement it, but this variant seems like a 
strict improvement: rather than consider non-dependent conversions 
during deduction, wait until we've tried to deduce all arguments first.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit cd4859011b50c864c2d027ce443f241301243ef5
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 13 19:32:15 2015 -0400

	DR 1391
	* pt.c (type_unification_real): Check convertibility here.
	(unify_one_argument): Not here.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7871474..2cd36c9 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16678,7 +16678,7 @@ uses_deducible_template_parms (tree type)
 
 static int
 unify_one_argument (tree tparms, tree targs, tree parm, tree arg,
-		int subr, unification_kind_t strict, int flags,
+		int subr, unification_kind_t strict,
 		bool explain_p)
 {
   tree arg_expr = NULL_TREE;
@@ -16695,16 +16695,10 @@ unify_one_argument (tree tparms, tree targs, tree parm, tree arg,
  argument to convert it to the type of the corresponding function
  parameter if the parameter type contains no template-parameters that
  participate in template argument deduction.  */
-  if (TYPE_P (parm)  !uses_template_parms (parm))
-/* For function parameters that contain no template-parameters at all,
-   we have historically checked for convertibility in order to shortcut
-   consideration of this candidate.  */
-return check_non_deducible_conversion (parm, arg, strict, flags,
-	   explain_p);
-  else if (strict == DEDUCE_CALL
-	TYPE_P (parm)  !uses_deducible_template_parms (parm))
-/* For function parameters with only non-deducible template parameters,
-   just return.  */
+  if (strict != DEDUCE_EXACT
+   TYPE_P (parm)  !uses_deducible_template_parms (parm))
+/* For function parameters with no deducible template parameters,
+   just return.  We'll check non-dependent conversions later.  */
 return unify_success (explain_p);
 
   switch (strict)
@@ -16843,7 +16837,7 @@ type_unification_real (tree tparms,
   ++ia;
 
   if (unify_one_argument (tparms, targs, parm, arg, subr, strict,
-			  flags, explain_p))
+			  explain_p))
 	return 1;
 }
 
@@ -16925,8 +16919,11 @@ type_unification_real (tree tparms,
 	 this parameter can be deduced.  */
 	  if (TREE_CODE (tparm) == PARM_DECL
 	   uses_template_parms (TREE_TYPE (tparm))
-	   !saw_undeduced++)
-	goto again;
+	   saw_undeduced  2)
+	{
+	  saw_undeduced = 1;
+	  continue;
+	}
 
 	  /* Core issue #226 (C++0x) [temp.deduct]:
 
@@ -16937,32 +16934,9 @@ type_unification_real (tree tparms,
 	 be NULL_TREE or ERROR_MARK_NODE, so we do not need
 	 to explicitly check cxx_dialect here.  */
 	  if (TREE_PURPOSE (TREE_VEC_ELT (tparms, i)))
-	{
-	  tree parm = TREE_VALUE (TREE_VEC_ELT (tparms, i));
-	  tree arg = TREE_PURPOSE (TREE_VEC_ELT (tparms, i));
-	  reopen_deferring_access_checks (*checks);
-	  location_t save_loc = input_location;
-	  if (DECL_P (parm))
-		input_location = DECL_SOURCE_LOCATION (parm);
-	  arg = tsubst_template_arg (arg, targs, complain, NULL_TREE);
-	  arg = convert_template_argument (parm, arg, targs, complain,
-	   i, NULL_TREE);
-	  input_location = save_loc;
-	  *checks = get_deferred_access_checks ();
-	  pop_deferring_access_checks ();
-	  if (arg == error_mark_node)
-		return 1;
-	  else
-		{
-		  TREE_VEC_ELT (targs, i) = arg;
-		  /* The position of the first default template argument,
-		 is also the number of non-defaulted arguments in TARGS.
-		 Record that.  */
-		  if (!NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs))
-		SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs, i);
-		  continue;
-		}
-	}
+	/* OK, there is a default argument.  Wait until after the
+	   conversion check to do substitution.  */
+	continue;
 
 	  /* If the type parameter is a parameter pack, then it will
 	 be deduced to an empty parameter pack.  */
@@ -16987,6 +16961,84 @@ type_unification_real (tree tparms,
 
 	  return unify_parameter_deduction_failure (explain_p, tparm);
 	}
+
+  /* DR 1391: All parameters have args, now check non-dependent parms for
+	 convertibility.  */
+  if (saw_undeduced  2)
+	for (ia = 0, parms = xparms, args = xargs, nargs = xnargs;
+	 parms  parms != void_list_node  ia  nargs; )
+	  {
+	parm = TREE_VALUE (parms);
+
+	if (TREE_CODE (parm) == TYPE_PACK_EXPANSION
+		 (!TREE_CHAIN (parms)
+		|| TREE_CHAIN (parms) == void_list_node))
+	  /* For a function parameter pack that occurs at the end of the
+		 parameter-declaration-list, the type A of each remaining
+		 argument of the call is compared with the type P of the
+		 declarator-id of the function parameter pack.  */
+	  break;
+
+	parms = TREE_CHAIN (parms);
+
+	if (TREE_CODE (parm) == 

Re: [PATCH] FreeBSD add functionality to build PIE executables.

2015-05-18 Thread Jeff Law

On 05/17/2015 02:42 PM, Andreas Tobler wrote:

Ping?!

Thanks,
Andreas

On 11.05.15 22:34, Andreas Tobler wrote:

All,

this patch adds the ability to build PIE executables for FreeBSD. The
core is since a longer time in the code base of FreeBSD itself and is
working fine.

This patch makes it available for all FreeBSD targets.
Tested on x86_64-*-freebsd11.0 and armv6/hf-*-freebsd11.0,
i386-*-freebsd11.0 is progress.

In the same turn I removed the STARTFILE/ENDFILE_SPEC from
config/i386/freebsd.h and use the ones from config/freebsd-spec.h.

Here the results before the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01267.html

and with the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01324.html

Is this ok for trunk and for 5.1X?

Thanks,
Andreas

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

* config/freebsd-spec.h (FBSD_STARTFILE_SPEC): Add the bits to build
pie executables.
(FBSD_ENDFILE_SPEC): Likewise.
* config/i386/freebsd.h (STARTFILE_SPEC): Remove and use the one from
config/freebsd-spec.h.
(ENDFILE_SPEC): Likewise.

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

* lib/target-supports.exp (check_effective_target_pie): Add
*-*-freebsd*
to the familiy of pie capable targets.
OK for the trunk, please install.  Release managers have final say for 
the release branches.


jeff


Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread Michael Matz
Hi,

On Mon, 18 May 2015, H.J. Lu wrote:

 Yes, we should convert it to
 
 nop call foo/jmp foo nop
 
 I implemented it on users/hjl/relax branch in binutils git repo.
 
  the insn decoder.  For calls as well of course, but there it might be
  better to have it before the call.
 
 
 I think a nop prefix is better on call.  We won't  mandate nop call foo
 in psABI and linker is free to use either a nop prefix or a nop suffix.
 
 Should we move forward with it?

I think I have only nit-picking left: why call the whole thing relax?  To 
me relax implies some length-changing transformation (like jump target 
relaxing, emitting shorter jumps when in range), but perhaps that's just 
me.  OTOH I can't think of a better name right now.

Then one remark: there's a small interaction between this scheme and 
taking the address of a function.  I _think_ it's all taken care of, but 
just want to make sure it is: the relax scheme must only apply to the 
.got.plt slot, not to the normal .got slot (which must continue to hold 
the final function address), and with the recent sharing you have 
implemented (when both are needed) it must be ensured that also an 
existing RELAX_GOTPCREL reloc doesn't overwrite that .got slot with the 
.plt entry address.


Ciao,
Michael.


Re: RFC: Add a new relocation to x86-64/i386 psABIs

2015-05-18 Thread H.J. Lu
On Mon, May 18, 2015 at 8:52 AM, Michael Matz m...@suse.de wrote:
 Hi,

 On Mon, 18 May 2015, H.J. Lu wrote:

 Yes, we should convert it to

 nop call foo/jmp foo nop

 I implemented it on users/hjl/relax branch in binutils git repo.

  the insn decoder.  For calls as well of course, but there it might be
  better to have it before the call.
 

 I think a nop prefix is better on call.  We won't  mandate nop call foo
 in psABI and linker is free to use either a nop prefix or a nop suffix.

 Should we move forward with it?

 I think I have only nit-picking left: why call the whole thing relax?  To
 me relax implies some length-changing transformation (like jump target
 relaxing, emitting shorter jumps when in range), but perhaps that's just
 me.  OTOH I can't think of a better name right now.

Me neither.  Linker manual has

'--relax'
'--no-relax'
 An option with machine dependent effects.  This option is only
 supported on a few targets.  *Note 'ld' and the H8/300: H8/300.
 *Note 'ld' and the Intel 960 family: i960.  *Note 'ld' and Xtensa
 Processors: Xtensa.  *Note 'ld' and the 68HC11 and 68HC12:
 M68HC11/68HC12.  *Note 'ld' and the Altera Nios II: Nios II.  *Note
 'ld' and PowerPC 32-bit ELF Support: PowerPC ELF32.

 On some platforms the '--relax' option performs target specific,
 global optimizations that become possible when the linker resolves
 addressing in the program, such as relaxing address modes,
 synthesizing new instructions, selecting shorter version of current
 instructions, and combining constant values.

 On some platforms these link time global optimizations may make
 symbolic debugging of the resulting executable impossible.  This is
 known to be the case for the Matsushita MN10200 and MN10300 family
 of processors.

In this sense, calling this scheme relax is not entirely inaccurate.

 Then one remark: there's a small interaction between this scheme and
 taking the address of a function.  I _think_ it's all taken care of, but

I believe so.

 just want to make sure it is: the relax scheme must only apply to the
 .got.plt slot, not to the normal .got slot (which must continue to hold
 the final function address), and with the recent sharing you have
 implemented (when both are needed) it must be ensured that also an
 existing RELAX_GOTPCREL reloc doesn't overwrite that .got slot with the
 .plt entry address.

Since .got and .got.plt slots serve the same purpose, ld already
combines them into one single .got slot with .plt entry pointing to
the .got slot.  That is if you take the address of the function and
branch to it, linker will arrange .plt entry to do an indirect branch
via its got slot.

-- 
H.J.


Re: [PATCH] Fix memory orders description in atomic ops built-ins docs.

2015-05-18 Thread Matthew Wahab

Hello,

On 15/05/15 17:22, Torvald Riegel wrote:

This patch improves the documentation of the built-ins for atomic
operations.


The memory model to memory order change does improve things but I think that
the patch has some problems. As it is now, it makes some of the descriptions
quite difficult to understand and seems to assume more familiarity with details
of the C++11 specification then might be expected.

Generally, the memory order descriptions seem to be targeted towards language
designers but don't provide for anybody trying to understand how to implement or
to use the built-ins. Adding a less formal, programmers view to some of the
descriptions would help. That implies the descriptions would be more than just
illustrative, but I'd suggest that would be appropriate for the GCC manual.

I'm also not sure that the use of C++11 terms in the some of the
descriptions. In particular, using happens-before seems wrong because
happens-before isn't described anywhere in the GCC manual and because it has a
specific meaning in the C++11 specification that doesn't apply to the GCC
built-ins (which C++11 doesn't know about).

Some more comments below.

Regards,
Matthew


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 6004681..5b2ded8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8853,19 +8853,19 @@ are not prevented from being speculated to before the 
barrier.

 [...]  If the data type size maps to one
-of the integral sizes that may have lock free support, the generic
-version uses the lock free built-in function.  Otherwise an
+of the integral sizes that may support lock-freedom, the generic
+version uses the lock-free built-in function.  Otherwise an
 external call is left to be resolved at run time.

=
This is a slightly awkward sentence. Maybe it could be replaced with something
on the lines of The generic function uses the lock-free built-in function when
the data-type size makes that possible, otherwise an external call is left to be
resolved at run-time.
=

-The memory models integrate both barriers to code motion as well as
-synchronization requirements with other threads.  They are listed here
-in approximately ascending order of strength.
+An atomic operation can both constrain code motion by the compiler and
+be mapped to a hardware instruction for synchronization between threads
+(e.g., a fence).  [...]

=
This is a little unclear (and inaccurate, aarch64 can use two instructions
for fences). I also thought that atomic operations constrain code motion by the
hardware. Maybe break the link with the compiler and hardware: An atomic
operation can both constrain code motion and act as a synchronization point
between threads.
=

 @table  @code
 @item __ATOMIC_RELAXED
-No barriers or synchronization.
+Implies no inter-thread ordering constraints.


It may be useful to be explicit that there are no restrctions on code motion.


 @item __ATOMIC_CONSUME
-Data dependency only for both barrier and synchronization with another
-thread.
+This is currently implemented using the stronger @code{__ATOMIC_ACQUIRE}
+memory order because of a deficiency in C++11's semantics for
+@code{memory_order_consume}.

=
It would be useful to have a description of what the __ATOMIC_CONSUME was
meant to do, as well as the fact that it currently just maps to
__ATOMIC_ACQUIRE. (Or maybe just drop it from the documentation until it's
fixed.)
=

 @item __ATOMIC_ACQUIRE
-Barrier to hoisting of code and synchronizes with release (or stronger)
-semantic stores from another thread.
+Creates an inter-thread happens-before constraint from the release (or
+stronger) semantic store to this acquire load.  Can prevent hoisting
+of code to before the operation.

=
As noted before, it's not clear what the inter-thread happens-before
means in this context.

Here and elsewhere:
Can prevent motion of code is ambiguous: it doesn't say under what
conditions code would or wouldn't be prevented from moving.
=

-Note that the scope of a C++11 memory model depends on whether or not
-the function being called is a @emph{fence} (such as
-@samp{__atomic_thread_fence}).  In a fence, all memory accesses are
-subject to the restrictions of the memory model.  When the function is
-an operation on a location, the restrictions apply only to those
-memory accesses that could affect or that could depend on the
-location.
+Note that in the C++11 memory model, @emph{fences} (e.g.,
+@samp{__atomic_thread_fence}) take effect in combination with other
+atomic operations on specific memory locations (e.g., atomic loads);
+operations on specific memory locations do not necessarily affect other
+operations in the same way.


Its very unclear what this paragraph is saying. It seems to suggest that fences
only work in combination with other operations. But that doesn't seem right
since a __atomic_thread_fence (with appropriate memory order) can be dropped
into any piece of code and will act in the 

C++ PATCH for bugs in strip_typedefs*

2015-05-18 Thread Jason Merrill
While looking at a bug report caused by failure to strip all typedefs, I 
came across a few latent bugs in the strip_typedefs functions.


First, if we're going to return a TREE_LIST unchanged we need to copy 
the pointer before we change the input variable.


Second, if we have a decltype with no typedefs in the argument, we still 
need to handle a typedef to that decltype.


Third, when we strip the operands of a TRAIT_EXPR we want to update the 
operands of the copy, not the original.


Tested x86_64-pc-linux-gnu, applying to trunk and 5.
commit a5bfb34605c5ea629db6ff6db76cdaed0f675670
Author: Jason Merrill ja...@redhat.com
Date:   Fri May 15 15:28:54 2015 -0400

	* tree.c (strip_typedefs_expr) [TRAIT_EXPR]: Fix typo.
	(strip_typedefs) [DECLTYPE_TYPE]: Fix typedef of decltype.
	[TREE_LIST]: Fix no-change case.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index ec9be8c..eebb415 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1265,6 +1265,7 @@ strip_typedefs (tree t, bool *remove_attributes)
 {
   bool changed = false;
   vectree,va_gc *vec = make_tree_vector ();
+  tree r = t;
   for (; t; t = TREE_CHAIN (t))
 	{
 	  gcc_assert (!TREE_PURPOSE (t));
@@ -1273,7 +1274,6 @@ strip_typedefs (tree t, bool *remove_attributes)
 	changed = true;
 	  vec_safe_push (vec, elt);
 	}
-  tree r = t;
   if (changed)
 	r = build_tree_list_vec (vec);
   release_tree_vector (vec);
@@ -1411,7 +1411,7 @@ strip_typedefs (tree t, bool *remove_attributes)
   result = strip_typedefs_expr (DECLTYPE_TYPE_EXPR (t),
 remove_attributes);
   if (result == DECLTYPE_TYPE_EXPR (t))
-	return t;
+	result = NULL_TREE;
   else
 	result = (finish_decltype_type
 		  (result,
@@ -1496,8 +1496,8 @@ strip_typedefs_expr (tree t, bool *remove_attributes)
 	 type2 == TRAIT_EXPR_TYPE2 (t))
 	  return t;
 	r = copy_node (t);
-	TRAIT_EXPR_TYPE1 (t) = type1;
-	TRAIT_EXPR_TYPE2 (t) = type2;
+	TRAIT_EXPR_TYPE1 (r) = type1;
+	TRAIT_EXPR_TYPE2 (r) = type2;
 	return r;
   }
 


Re: [PATCH, 5/5] check_GNU_style.sh: Fix tab size in 80 characters check

2015-05-18 Thread Jeff Law

On 05/18/2015 02:07 AM, Tom de Vries wrote:

On 12-05-15 17:16, Jeff Law wrote:

[PATCH 5/5] check_GNU_style.sh: Fix tab size in 80 characters check

2015-05-11  Tom de Vriest...@codesourcery.com

* check_GNU_style.sh (col): Fix tab size.

OK.


Hi Jeff,

I.

I noticed a performance degradation due to this patch:
...
$ cat gcc/tree-ssa-tail-merge.c | awk '{printf +%s\n, $0}' | time -p
./contrib/check_GNU_style.sh -
   ...
real 4.10
user 0.71
sys 6.77
...

Before this patch, the 'real' time was roughly a factor 80 smaller:
...
real 0.05
user 0.02
sys 0.03
...

This degradation is due to the fact that the patch does the 80 chars
check line-by-line, and invokes processes for each new line.


II.

Attached follow-up patch rewrites the 80 chars check to handle a file at
a time rather than a line at a time, and gets performance back to normal:
...
real 0.07
user 0.03
sys 0.05
...

As a bonus, the bit longer than 80 chars is now printed in red, similar
to how the other checks show the output.

OK for trunk?

Yes, this is fine.

jeff


[PATCH, alpha]: Fix PR57032, ICE with TARGET_LRA_P

2015-05-18 Thread Uros Bizjak
Hello!

Attached patch is the first small step towards LRA enablement on
alpha. As suggester by Richard, the patch rewrites Q constraint as
define_memory_constraint, so reload can push AND address to a
register.

I will close the referred PR, as it is not a regression and open a new
one to track LRA enablement progress.

2015-05-18  Uros Bizjak  ubiz...@gmail.com
Richard Henderson  r...@redhat.com

PR target/57032
* config/alpha/constraints.md (Q): Rewrite as define_memory_constraint.
Check for a memory location that is not a reference (using an AND)
to an unaligned location here.
* config/alpha/predicates.md (normal_memory_operand): Remove.

Tested on alpha-linux-gnu and committed to mainline SVN.

Uros.
Index: config/alpha/constraints.md
===
--- config/alpha/constraints.md (revision 223268)
+++ config/alpha/constraints.md (working copy)
@@ -91,9 +91,13 @@
(match_test op == CONST0_RTX (mode
 
 ;; Extra constraints.
-(define_constraint Q
+
+;; A memory location that is not a reference
+;; (using an AND) to an unaligned location.
+(define_memory_constraint Q
   @internal A normal_memory_operand
-  (match_operand 0 normal_memory_operand))
+  (and (match_code mem)
+   (not (match_test GET_CODE (XEXP (op, 0)) == AND
 
 (define_constraint R
   @internal A direct_call_operand
Index: config/alpha/predicates.md
===
--- config/alpha/predicates.md  (revision 223268)
+++ config/alpha/predicates.md  (working copy)
@@ -525,14 +525,6 @@
   return false;
 })
 
-;; Return 1 is OP is a memory location that is not a reference
-;; (using an AND) to an unaligned location.  Take into account
-;; what reload will do.
-(define_special_predicate normal_memory_operand
-  (ior (match_test op = resolve_reload_operand (op), 0)
-   (and (match_code mem)
-   (match_test GET_CODE (XEXP (op, 0)) != AND
-
 ;; Returns 1 if OP is not an eliminable register.
 ;;
 ;; This exists to cure a pathological failure in the s8addq (et al) patterns,


[C++ Patch[ PR 66130

2015-05-18 Thread Paolo Carlini

Hi,

Manuel did most of the work for this rather simple issue filed by Tom: 
essentially, invalid_nonstatic_memfn_p gets a location_t parameter which 
is used to pass the location of the place where the use of the nonstatic 
member function is indeed invalid. Besides that, while working on the 
bug we noticed that we must be careful with exprs which aren't DECLs.


Tested x86_64-linux.

Thanks,
Paolo.


2015-05-18  Manuel López-Ibáñez  m...@gcc.gnu.org
Paolo Carlini  paolo.carl...@oracle.com

PR c++/66130
* typeck.c (invalid_nonstatic_memfn_p): Add location_t parameter and
use it in the diagnostic.
(decay_conversion): Adjust call.
* semantics.c (finish_decltype_type): Likewise.
* call.c (resolve_args, build_new_op_1,
perform_implicit_conversion_flags): Adjust calls.
* cvt.c (ocp_convert, convert_to_void): Likewise.
* cp-tree.h (invalid_nonstatic_memfn_p): Update declaration.

2015-05-18  Manuel López-Ibáñez  m...@gcc.gnu.org
Paolo Carlini  paolo.carl...@oracle.com

PR c++/66130
* g++.dg/other/pr66130.C: New.
* g++.dg/cpp0x/pr66130.C: Likewise.
Index: cp/call.c
===
--- cp/call.c   (revision 223295)
+++ cp/call.c   (working copy)
@@ -3941,7 +3941,7 @@ resolve_args (vectree, va_gc *args, tsubst_flags
error (invalid use of void expression);
  return NULL;
}
-  else if (invalid_nonstatic_memfn_p (arg, complain))
+  else if (invalid_nonstatic_memfn_p (input_location, arg, complain))
return NULL;
 }
   return args;
@@ -5542,9 +5542,9 @@ build_new_op_1 (location_t loc, enum tree_code cod
/* If one of the arguments of the operator represents
   an invalid use of member function pointer, try to report
   a meaningful error ...  */
-   if (invalid_nonstatic_memfn_p (arg1, tf_error)
-   || invalid_nonstatic_memfn_p (arg2, tf_error)
-   || invalid_nonstatic_memfn_p (arg3, tf_error))
+ if (invalid_nonstatic_memfn_p (loc, arg1, tf_error)
+   || invalid_nonstatic_memfn_p (loc, arg2, tf_error)
+   || invalid_nonstatic_memfn_p (loc, arg3, tf_error))
  /* We displayed the error message.  */;
else
  {
@@ -9409,7 +9409,7 @@ perform_implicit_conversion_flags (tree type, tree
 Call instantiate_type to get good error messages.  */
  if (TREE_TYPE (expr) == unknown_type_node)
instantiate_type (type, expr, complain);
- else if (invalid_nonstatic_memfn_p (expr, complain))
+ else if (invalid_nonstatic_memfn_p (loc, expr, complain))
/* We gave an error.  */;
  else
error_at (loc, could not convert %qE from %qT to %qT, expr,
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 223295)
+++ cp/cp-tree.h(working copy)
@@ -6282,7 +6282,8 @@ extern tree build_address (tree);
 extern tree build_nop  (tree, tree);
 extern tree non_reference  (tree);
 extern tree lookup_anon_field  (tree, tree);
-extern bool invalid_nonstatic_memfn_p  (tree, tsubst_flags_t);
+extern bool invalid_nonstatic_memfn_p  (location_t, tree,
+tsubst_flags_t);
 extern tree convert_member_func_to_ptr (tree, tree, tsubst_flags_t);
 extern tree convert_ptrmem (tree, tree, bool, bool,
 tsubst_flags_t);
Index: cp/cvt.c
===
--- cp/cvt.c(revision 223295)
+++ cp/cvt.c(working copy)
@@ -902,7 +902,7 @@ ocp_convert (tree type, tree expr, int convtype, i
 {
   /* If the conversion failed and expr was an invalid use of pointer to
 member function, try to report a meaningful error.  */
-  if (invalid_nonstatic_memfn_p (expr, complain))
+  if (invalid_nonstatic_memfn_p (loc, expr, complain))
/* We displayed the error message.  */;
   else
error_at (loc, conversion from %qT to non-scalar type %qT requested,
@@ -960,7 +960,7 @@ convert_to_void (tree expr, impl_conv_void implici
 
   if (!TREE_TYPE (expr))
 return expr;
-  if (invalid_nonstatic_memfn_p (expr, complain))
+  if (invalid_nonstatic_memfn_p (loc, expr, complain))
 return error_mark_node;
   if (TREE_CODE (expr) == PSEUDO_DTOR_EXPR)
 {
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 223295)
+++ cp/semantics.c  (working copy)
@@ -7242,7 +7242,7 @@ finish_decltype_type (tree expr, bool id_expressio
 
   expr = 

Re: [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack

2015-05-18 Thread David Edelsohn
On Sun, May 17, 2015 at 10:54 PM, Alan Modra amo...@gmail.com wrote:
 This patch changes rs6000_stack_info to keep save areas offsets even
 when not used.  I need lr_save_offset valid for split-stack, and it
 seemed reasonable to treat the other offsets the same.  Not zeroing
 the offsets requires just one change in code that uses them, the
 use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
 counting the debug_stack_info changes.

 * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
 when not saving registers.
 (debug_stack_info): Adjust to omit printing unused offsets,
 as before.
 (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
 expression.

I think that the vrsave_save_offset change may break saving of
callee-saved VRs.  See PR 55276.  Additional points for converting the
testcase into one that can be included in the GCC testsuite.

Thanks, David


[0/9] Record number of hard registers in a REG

2015-05-18 Thread Richard Sandiford
While looking at a profile of gcc, I noticed one thing fairly high
up the list was a loop iterating over all the registers in a REG,
apparently due to the delay in computing the index for hard_regno_nregs
and then loading the value (which would often be an L1 cache miss).

When we were adding CONST_WIDE_INT, the general opinion seemed to be
that we should lay out rtxes for LP64 hosts rather than try to have two
alternative layouts, one optimised for ILP32 and one for LP64.  We therefore
unconditionally filled the 32-bit hole (on LP64) between the rtx header and
the main union with extra data.  That area is already used by REGs to store
ORIGINAL_REGNO, but on LP64 hosts there's another hole in the REGNO
field itself.  This series takes that idea a step further and uses the
hole to store the number of registers in a REG.

This still leaves 24 redundant bits that could be used for other things
in future.  That's actually enough to store a SUBREG of a REG (8 bits
for the inner mode, 16 for the offset), but having a single rtx for that
would probably cause too many problems.

The series sped up an --enable-checking=release gcc by just over 0.5%
for various tests on my box.  Not a big saving, but hopefully the
patches also count as a clean-up.

As a follow-on, I'd like to add a FOR_EACH_* macro that iterates over
all the registers in a REG.  These loops always execute at least once,
and rarely more than once, and it would be good to model that in the
iterator so that all use sites benefit.

Each patch in the series was individually bootstrapped  regression-tested
on x86_64-linux-gnu.

Thanks,
Richard



[1/9] Make more use of END_REGNO

2015-05-18 Thread Richard Sandiford
This patch just handles some obvious cases where END_REGNO could be used.


gcc/
* cfgcleanup.c (mentions_nonequal_regs): Use END_REGNO.
* dse.c (note_add_store): Likewise.
* ira-lives.c (mark_hard_reg_dead): Likewise.
* loop-invariant.c (mark_reg_store): Likewise.
(mark_reg_death): Likewise.
* postreload.c (reload_combine): Likewise.
(reload_combine_note_store): Likewise.
(reload_combine_note_use): Likewise.
* recog.c (peep2_reg_dead_p): Likewise.

Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c2015-05-18 08:19:38.719489341 +0100
+++ gcc/cfgcleanup.c2015-05-18 08:19:52.0 +0100
@@ -267,26 +267,20 @@ mark_effect (rtx exp, regset nonequal)
 mentions_nonequal_regs (const_rtx x, regset nonequal)
 {
   subrtx_iterator::array_type array;
   FOR_EACH_SUBRTX (iter, array, x, NONCONST)
 {
   const_rtx x = *iter;
   if (REG_P (x))
{
- unsigned int regno = REGNO (x);
- if (REGNO_REG_SET_P (nonequal, regno))
-   return true;
- if (regno  FIRST_PSEUDO_REGISTER)
-   {
- int n = hard_regno_nregs[regno][GET_MODE (x)];
- while (--n  0)
-   if (REGNO_REG_SET_P (nonequal, regno + n))
- return true;
-   }
+ unsigned int end_regno = END_REGNO (x);
+ for (unsigned int regno = REGNO (x); regno  end_regno; ++regno)
+   if (REGNO_REG_SET_P (nonequal, regno))
+ return true;
}
 }
   return false;
 }
 
 /* Attempt to prove that the basic block B will have no side effects and
always continues in the same edge if reached via E.  Return the edge
if exist, NULL otherwise.  */
Index: gcc/dse.c
===
--- gcc/dse.c   2015-05-18 08:19:38.719489341 +0100
+++ gcc/dse.c   2015-05-18 08:19:52.0 +0100
@@ -850,17 +850,16 @@ typedef struct
 /* Callback for emit_inc_dec_insn_before via note_stores.
Check if a register is clobbered which is live afterwards.  */
 
 static void
 note_add_store (rtx loc, const_rtx expr ATTRIBUTE_UNUSED, void *data)
 {
   rtx_insn *insn;
   note_add_store_info *info = (note_add_store_info *) data;
-  int r, n;
 
   if (!REG_P (loc))
 return;
 
   /* If this register is referenced by the current or an earlier insn,
  that's OK.  E.g. this applies to the register that is being incremented
  with this addition.  */
   for (insn = info-first;
@@ -869,25 +868,24 @@ note_add_store (rtx loc, const_rtx expr
 if (reg_referenced_p (loc, PATTERN (insn)))
   return;
 
   /* If we come here, we have a clobber of a register that's only OK
  if that register is not live.  If we don't have liveness information
  available, fail now.  */
   if (!info-fixed_regs_live)
 {
-  info-failure =  true;
+  info-failure = true;
   return;
 }
   /* Now check if this is a live fixed register.  */
-  r = REGNO (loc);
-  n = hard_regno_nregs[r][GET_MODE (loc)];
-  while (--n =  0)
-if (REGNO_REG_SET_P (info-fixed_regs_live, r+n))
-  info-failure =  true;
+  unsigned int end_regno = END_REGNO (loc);
+  for (unsigned int regno = REGNO (loc); regno  end_regno; ++regno)
+if (REGNO_REG_SET_P (info-fixed_regs_live, regno))
+  info-failure = true;
 }
 
 /* Callback for for_each_inc_dec that emits an INSN that sets DEST to
SRC + SRCOFF before insn ARG.  */
 
 static int
 emit_inc_dec_insn_before (rtx mem ATTRIBUTE_UNUSED,
  rtx op ATTRIBUTE_UNUSED,
Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c 2015-05-18 08:19:38.719489341 +0100
+++ gcc/ira-lives.c 2015-05-18 08:19:52.0 +0100
@@ -473,17 +473,17 @@ mark_pseudo_regno_subword_dead (int regn
register.  */
 static void
 mark_hard_reg_dead (rtx reg)
 {
   int regno = REGNO (reg);
 
   if (! TEST_HARD_REG_BIT (ira_no_alloc_regs, regno))
 {
-  int last = regno + hard_regno_nregs[regno][GET_MODE (reg)];
+  int last = END_REGNO (reg);
   enum reg_class aclass, pclass;
 
   while (regno  last)
{
  if (TEST_HARD_REG_BIT (hard_regs_live, regno))
{
  aclass = ira_hard_regno_allocno_class[regno];
  pclass = ira_pressure_class_translate[aclass];
Index: gcc/loop-invariant.c
===
--- gcc/loop-invariant.c2015-05-18 08:19:38.719489341 +0100
+++ gcc/loop-invariant.c2015-05-18 08:19:38.719489341 +0100
@@ -1802,68 +1802,44 @@ mark_regno_death (int regno)
   change_pressure (regno, false);
 }
 
 /* Mark setting register REG.  */
 static void
 mark_reg_store (rtx reg, const_rtx setter ATTRIBUTE_UNUSED,
void *data ATTRIBUTE_UNUSED)
 {
-  int regno;
-
   if (GET_CODE (reg) == SUBREG)
 reg = 

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-18 Thread Alexander Monakov
On Fri, 15 May 2015, Jan Hubicka wrote:
   With -fno-plt, we don't have to reject even direct calls as sibcall
   candidates.
  
   This patch depends on '-fplt' flag that is introduced in another patch.
  
   This patch requires that with -fno-plt all sibcall candidates go through
   prepare_call_address that transforms the call to a GOT lookup.
  
   OK?
 * config/i386/i386.c (ix86_function_ok_for_sibcall): Check 
   flag_plt.
  
   diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
   index f29e053..b734350 100644
   --- a/gcc/config/i386/i386.c
   +++ b/gcc/config/i386/i386.c
   @@ -5448,12 +5448,13 @@ ix86_function_ok_for_sibcall (tree decl, tree 
   exp)
  /* If we are generating position-independent code, we cannot sibcall
 optimize any indirect call, or a direct call to a global function,
 as the PLT requires %ebx be live. (Darwin does not have a PLT.)  */
  if (!TARGET_MACHO
   !TARGET_64BIT
   flag_pic
   +   flag_plt
   (decl  !targetm.binds_local_p (decl)))
return false;
  
  /* If we need to align the outgoing stack, then sibcalling would
 unalign the stack, which may break the called function.  */
  if (ix86_minimum_incoming_stack_boundary (true)
  
  
  I think it should be done via psABI change similar to
  
  https://groups.google.com/forum/#!topic/x86-64-abi/n8GYMpqvBxI
  
  which I have implemented on users/hjl/relax branch in binutils.
 
 OK, I am trying to understand how relax branch works and what difference it 
 makes.
 As I underestand it, the main purpose is to be able to make relaxed call of
 
call function
 
 that will, in 64bit mode, either result to RIP relative call with extra NOP 
 just
 before the instruction if FUNCTION binds within the DSO or to indirect call 
 through
 GOT bypassing the PLT.  This saves overhead of PLT and increase every such 
 call
 by extra NOP for no-LTO builds and even in LTO when the symbol is defined but
 interposable.  This is actually really nice trick.
 
 Now this is about 32bit mode where explicit GOT pointer register is needed
 (how this work with large code model on x86-64?). It is needed by PLT, but I 
 suppose
 to implement the same relaxation for 32bit it would need to use EBX to lookup 
 the
 GOT pointer, too, so the check above would still be valid.
 
 The patches makes sense to be given that we support -fno-plt now.

After this message the discussion diverged in the direction of H.J.Lu's
proposed relaxation scheme involving new type of relocations.

I'm not clear if my patch is actually approved.  I'd like to point out that it
doesn't clash with H.J.Lu's work.  It improves codegen by allowing sibcalls in
more circumstances.

Alexander


C++ PATCH for typedef stripping with variable templates

2015-05-18 Thread Jason Merrill
At the Lenexa C++ meeting Walter pointed me at a variable template 
testcase that was breaking, which turned out to be because we were 
failing to strip aliases consistently in variable template arguments, 
leading to mismatches between, in the first testcase, Vfloat and Vfl.


Tested x86_64-pc-linux-gnu, applying to trunk and concepts branch.
commit 87ca1bec448edb74cd531d5b2691a9d58ba17326
Author: Jason Merrill ja...@redhat.com
Date:   Sat May 9 15:50:40 2015 -0500

	* pt.c (tsubst_decl) [VAR_DECL]: Call coerce_innermost_template_parms.
	(determine_specialization): Call coerce_template_parms.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2cd36c9..2166f5f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1919,7 +1919,13 @@ determine_specialization (tree template_id,
 ++header_count;
 
   if (variable_template_p (fns))
-templates = tree_cons (explicit_targs, fns, templates);
+{
+  tree parms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (fns));
+  targs = coerce_template_parms (parms, explicit_targs, fns,
+ tf_warning_or_error,
+ /*req_all*/true, /*use_defarg*/true);
+  templates = tree_cons (targs, fns, templates);
+}
   else for (; fns; fns = OVL_NEXT (fns))
 {
   tree fn = OVL_CURRENT (fns);
@@ -11265,6 +11271,11 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 		tmpl = DECL_TI_TEMPLATE (t);
 		gen_tmpl = most_general_template (tmpl);
 		argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl);
+		if (argvec != error_mark_node)
+		  argvec = (coerce_innermost_template_parms
+			(DECL_TEMPLATE_PARMS (gen_tmpl),
+			 argvec, t, complain,
+			 /*all*/true, /*defarg*/true));
 		if (argvec == error_mark_node)
 		  RETURN (error_mark_node);
 		hash = hash_tmpl_and_args (gen_tmpl, argvec);
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ25.C b/gcc/testsuite/g++.dg/cpp1y/var-templ25.C
new file mode 100644
index 000..8253eac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ25.C
@@ -0,0 +1,8 @@
+// { dg-do compile { target c++14 } }
+
+using fl = float;
+
+templateclass T const int V = 0;
+template const int Vfl = 42;
+
+static_assert(Vfloat == 42, );
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ26.C b/gcc/testsuite/g++.dg/cpp1y/var-templ26.C
new file mode 100644
index 000..9ac0777
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ26.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++14 } }
+
+template class T const int V = 0;
+template  const int Vchar = 42;
+
+template class T
+struct A
+{
+  using N = T;
+};
+
+#define SA(X) static_assert((X),#X)
+template class T
+struct B
+{
+  SA(Vtypename AT::N == 42);
+};
+
+Bchar b;
commit c1e703a6f795ea63609098607b4484e4b516af41
Author: Jason Merrill ja...@redhat.com
Date:   Wed May 13 15:57:00 2015 -0400

	* pt.c (retrieve_specialization): Make sure our arguments have
	gone through strip_typedefs.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2166f5f..60f3958 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1058,6 +1058,14 @@ retrieve_specialization (tree tmpl, tree args, hashval_t hash)
 		  ? TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
 		  : template_class_depth (DECL_CONTEXT (tmpl;
 
+#ifdef ENABLE_CHECKING
+  /* We should have gone through coerce_template_parms by now.  */
+  ++processing_template_decl;
+  if (!any_dependent_template_arguments_p (args))
+gcc_assert (strip_typedefs_expr (args, NULL) == args);
+  --processing_template_decl;
+#endif
+
   if (optimize_specialization_lookup_p (tmpl))
 {
   tree class_template;


[7/9] Record the number of registers in a REG

2015-05-18 Thread Richard Sandiford
This is the main patch, to record REG_NREGS in the REG itself.
The END_REGNO/END_HARD_REGNO distinction goes away in the next patch.


gcc/
* rtl.h (reg_info): Add an nregs field.
(REG_NREGS): Use it.
(SET_REGNO_RAW): Delete.
(set_regno_raw): New function.
* regs.h (END_HARD_REGNO): Make equivalent to END_REGNO.
(END_REGNO): Redefine in terms of REG_NREGS.
* read-rtl.c (read_rtx_code): Call set_regno_raw instead of
SET_REGNO_RAW.
* emit-rtl.c (set_mode_and_regno): Likewise.
* df-scan.c (df_ref_change_reg_with_loc): Use set_mode_and_regno
instead of SET_REGNO_RAW.

Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-18 08:34:38.532523211 +0100
+++ gcc/rtl.h   2015-05-18 08:36:23.407245048 +0100
@@ -210,7 +210,9 @@ struct GTY(()) reg_info {
   /* The value of REGNO.  */
   unsigned int regno;
 
-  unsigned int unused : 32;
+  /* The value of REG_NREGS.  */
+  unsigned int nregs : 8;
+  unsigned int unused : 24;
 
   /* The value of REG_ATTRS.  */
   reg_attrs *attrs;
@@ -1712,15 +1714,11 @@ #define LABEL_REF_LABEL(LABREF) XCEXP (L
be used on RHS.  Use SET_REGNO to change the value.  */
 #define REGNO(RTX) (rhs_regno(RTX))
 #define SET_REGNO(RTX, N) (df_ref_change_reg_with_loc (RTX, N))
-#define SET_REGNO_RAW(RTX, N) (REG_CHECK (RTX)-regno = N)
 
 /* Return the number of consecutive registers in a REG.  This is always
1 for pseudo registers and is determined by HARD_REGNO_NREGS for
hard registers.  */
-#define REG_NREGS(RTX) \
-  (REGNO (RTX)  FIRST_PSEUDO_REGISTER \
-   ? (unsigned int) hard_regno_nregs[REGNO (RTX)][GET_MODE (RTX)] \
-   : 1)
+#define REG_NREGS(RTX) (REG_CHECK (RTX)-nregs)
 
 /* ORIGINAL_REGNO holds the number the register originally had; for a
pseudo register turned into a hard reg this will hold the old pseudo
@@ -1735,6 +1733,15 @@ rhs_regno (const_rtx x)
   return REG_CHECK (x)-regno;
 }
 
+/* Change the REGNO and REG_NREGS of REG X to the specified values,
+   bypassing the df machinery.  */
+static inline void
+set_regno_raw (rtx x, unsigned int regno, unsigned int nregs)
+{
+  reg_info *reg = REG_CHECK (x);
+  reg-regno = regno;
+  reg-nregs = nregs;
+}
 
 /* 1 if RTX is a reg or parallel that is the current function's return
value.  */
Index: gcc/regs.h
===
--- gcc/regs.h  2015-05-18 08:34:38.532523211 +0100
+++ gcc/regs.h  2015-05-18 08:36:11.0 +0100
@@ -288,11 +288,11 @@ end_hard_regno (machine_mode mode, unsig
 
 /* Likewise for hard register X.  */
 
-#define END_HARD_REGNO(X) end_hard_regno (GET_MODE (X), REGNO (X))
+#define END_HARD_REGNO(X) END_REGNO (X)
 
 /* Likewise for hard or pseudo register X.  */
 
-#define END_REGNO(X) (HARD_REGISTER_P (X) ? END_HARD_REGNO (X) : REGNO (X) + 1)
+#define END_REGNO(X) (REGNO (X) + REG_NREGS (X))
 
 /* Add to REGS all the registers required to store a value of mode MODE
in register REGNO.  */
Index: gcc/read-rtl.c
===
--- gcc/read-rtl.c  2015-05-18 08:34:38.532523211 +0100
+++ gcc/read-rtl.c  2015-05-18 08:34:38.532523211 +0100
@@ -1349,7 +1349,7 @@ read_rtx_code (const char *code_name)
   case 'r':
read_name (name);
validate_const_int (name.string);
-   SET_REGNO_RAW (return_rtx, atoi (name.string));
+   set_regno_raw (return_rtx, atoi (name.string), 1);
REG_ATTRS (return_rtx) = NULL;
break;
 
Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2015-05-18 08:34:38.532523211 +0100
+++ gcc/emit-rtl.c  2015-05-18 08:34:38.532523211 +0100
@@ -435,8 +435,11 @@ gen_blockage (void)
 void
 set_mode_and_regno (rtx x, machine_mode mode, unsigned int regno)
 {
+  unsigned int nregs = (HARD_REGISTER_NUM_P (regno)
+   ? hard_regno_nregs[regno][mode]
+   : 1);
   PUT_MODE_RAW (x, mode);
-  SET_REGNO_RAW (x, regno);
+  set_regno_raw (x, regno, nregs);
 }
 
 /* Generate a new REG rtx.  Make sure ORIGINAL_REGNO is set properly, and
Index: gcc/df-scan.c
===
--- gcc/df-scan.c   2015-05-18 08:34:38.532523211 +0100
+++ gcc/df-scan.c   2015-05-18 08:36:11.0 +0100
@@ -1930,7 +1930,7 @@ df_ref_change_reg_with_loc (rtx loc, uns
DF_REG_EQ_USE_GET (new_regno),
new_regno, loc);
 }
-  SET_REGNO_RAW (loc, new_regno);
+  set_mode_and_regno (loc, GET_MODE (loc), new_regno);
 }
 
 



[6/9] Pass REG changes through a new function

2015-05-18 Thread Richard Sandiford
This patch adds a new function to set both the mode and regno
of a REG.  It makes sure that all REG PUT_MODE and SET_REGNO changes
go through this function.  There's a new PUT_MODE_RAW (analogous
to SET_REGNO_RAW) for the cases where the caller doesn't want that.

There's a small consistency fix: gen_rtx_REG was declared with
unsigned and defined with unsigned int.  The latter is usual
GCC style.


gcc/
* rtl.h (PUT_MODE_RAW): New macro.
(PUT_REG_NOTE_KIND): Use it.
(set_mode_and_regno): Declare.
(gen_raw_REG): Change regno to unsigned int.
(gen_rtx_REG): Change unsigned to unsigned int.
(PUT_MODE): Forward to PUT_MODE_RAW for generators, otherwise
use set_mode_and_regno to change the mode of registers.
* gengenrtl.c (gendef): Use PUT_MODE_RAW.
* emit-rtl.c (set_mode_and_regno): New function.
(gen_raw_REG): Change regno to unsigned int.  Use set_mode_and_regno.
* caller-save.c (reg_save_code): Use set_mode_and_regno.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload.c (reload_cse_simplify_operands): Likewise.

Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-17 21:26:44.375213705 +0100
+++ gcc/rtl.h   2015-05-17 21:29:52.068986873 +0100
@@ -668,8 +668,8 @@ #define RTX_PREV(X) ((INSN_P (X)
 #define GET_CODE(RTX)  ((enum rtx_code) (RTX)-code)
 #define PUT_CODE(RTX, CODE) ((RTX)-code = (CODE))
 
-#define GET_MODE(RTX)  ((machine_mode) (RTX)-mode)
-#define PUT_MODE(RTX, MODE) ((RTX)-mode = (MODE))
+#define GET_MODE(RTX)  ((machine_mode) (RTX)-mode)
+#define PUT_MODE_RAW(RTX, MODE)((RTX)-mode = (MODE))
 
 /* RTL vector.  These appear inside RTX's when there is a need
for a variable number of things.  The principle use is inside
@@ -1509,7 +1509,7 @@ #define DEF_REG_NOTE(NAME) NAME,
 /* Define macros to extract and insert the reg-note kind in an EXPR_LIST.  */
 #define REG_NOTE_KIND(LINK) ((enum reg_note) GET_MODE (LINK))
 #define PUT_REG_NOTE_KIND(LINK, KIND) \
-  PUT_MODE (LINK, (machine_mode) (KIND))
+  PUT_MODE_RAW (LINK, (machine_mode) (KIND))
 
 /* Names for REG_NOTE's in EXPR_LIST insn's.  */
 
@@ -3216,13 +3216,27 @@ gen_rtx_INSN (machine_mode mode, rtx_ins
  rtx reg_notes);
 extern rtx gen_rtx_CONST_INT (machine_mode, HOST_WIDE_INT);
 extern rtx gen_rtx_CONST_VECTOR (machine_mode, rtvec);
-extern rtx gen_raw_REG (machine_mode, int);
-extern rtx gen_rtx_REG (machine_mode, unsigned);
+extern void set_mode_and_regno (rtx, machine_mode, unsigned int);
+extern rtx gen_raw_REG (machine_mode, unsigned int);
+extern rtx gen_rtx_REG (machine_mode, unsigned int);
 extern rtx gen_rtx_SUBREG (machine_mode, rtx, int);
 extern rtx gen_rtx_MEM (machine_mode, rtx);
 extern rtx gen_rtx_VAR_LOCATION (machine_mode, tree, rtx,
 enum var_init_status);
 
+#ifdef GENERATOR_FILE
+#define PUT_MODE(RTX, MODE) PUT_MODE_RAW (RTX, MODE)
+#else
+static inline void
+PUT_MODE (rtx x, machine_mode mode)
+{
+  if (REG_P (x))
+set_mode_and_regno (x, mode, REGNO (x));
+  else
+PUT_MODE_RAW (x, mode);
+}
+#endif
+
 #define GEN_INT(N)  gen_rtx_CONST_INT (VOIDmode, (N))
 
 /* Virtual registers are used during RTL generation to refer to locations into
Index: gcc/gengenrtl.c
===
--- gcc/gengenrtl.c 2015-05-15 21:12:56.172234508 +0100
+++ gcc/gengenrtl.c 2015-05-17 21:31:16.143988041 +0100
@@ -252,7 +252,7 @@ gendef (const char *format)
   puts (  rtx rt;);
   puts (  rt = rtx_alloc_stat (code PASS_MEM_STAT);\n);
 
-  puts (  PUT_MODE (rt, mode););
+  puts (  PUT_MODE_RAW (rt, mode););
 
   for (p = format, i = j = 0; *p ; ++p, ++i)
 if (*p != '0')
Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2015-05-17 21:26:44.375213705 +0100
+++ gcc/emit-rtl.c  2015-05-17 21:30:18.484672892 +0100
@@ -430,16 +430,24 @@ gen_blockage (void)
 #endif
 
 
+/* Set the mode and register number of X to MODE and REGNO.  */
+
+void
+set_mode_and_regno (rtx x, machine_mode mode, unsigned int regno)
+{
+  PUT_MODE_RAW (x, mode);
+  SET_REGNO_RAW (x, regno);
+}
+
 /* Generate a new REG rtx.  Make sure ORIGINAL_REGNO is set properly, and
don't attempt to share with the various global pieces of rtl (such as
frame_pointer_rtx).  */
 
 rtx
-gen_raw_REG (machine_mode mode, int regno)
+gen_raw_REG (machine_mode mode, unsigned int regno)
 {
   rtx x = rtx_alloc_stat (REG PASS_MEM_STAT);
-  PUT_MODE (x, mode);
-  SET_REGNO_RAW (x, regno);
+  set_mode_and_regno (x, mode, regno);
   REG_ATTRS (x) = NULL;
   ORIGINAL_REGNO (x) = regno;
   return x;
Index: gcc/caller-save.c
===
--- gcc/caller-save.c   2015-05-17 21:26:44.375213705 +0100
+++ gcc/caller-save.c   

Re: [PATCH][Testsuite] Disable tests with dg-require-fork for simulated targets

2015-05-18 Thread Mike Stump
On May 18, 2015, at 8:01 AM, Alan Lawrence alan.lawre...@arm.com wrote:
 Simulators such as qemu report the presence of fork (it's in glibc) but 
 generally do not support synchronization primitives between threads, so any 
 tests using fork are unreliable.

Hum, I have a simulator (binutils/sim) that has fork.  All those tests pass for 
me. They seem to be reliable for me.  I didn’t do anything special as I recall. 
 ?

I did add enough libc (aka newlib) to bootstrap gcc, which maybe is slightly 
more than some do, but, existence of additional libraries shouldn’t change it 
much.  To the extent it does, it should be easy to notice any extra required 
libraries directly. 

If a qmu bug or design deficiency, do you have a pointer to the reported bug or 
the design where they talk about tit.

Remember, the point of the test suite is to find bugs to be fixed.  Papering 
over bugs by turning it off, is fine, but, we should have named bug reports 
that when fixed, cause us to go back and turn back on those that were turned 
off.

 This patch disables the subset of such tests that identify themselves using 
 dg-require-fork.
 
 At present, such tests are limited to (a) gcc.dg/torture/ftrapv-1.c. and (b) 
 some tests in the 27_io section of the libstdc++ testsuite, listed below. 
 Further patches can add dg-require-fork to the many other tests that call 
 fork().
 
 Cross-tested on aarch64-none-linux-gnu using qemu, with these tests becoming 
 UNSUPPORTED:
 
 (gcc)
 gcc.dg/torture/ftrapv-1.c

So, I reviewed this test case.  What about it doesn’t work?  Kinda simple and 
small, easy to understand.

 Is this patch OK for trunk?

No.  Let’s talk about it before turning off a to of test cases.

[8/9] Remove END_HARD_REGNO

2015-05-18 Thread Richard Sandiford
This patch replaces calls to END_HARD_REGNO with calls to END_REGNO and
moves END_REGNO to rtl.h (since it's now just a REG access).


gcc/
* regs.h (END_HARD_REGNO): Delete.
(END_REGNO): Move to...
* rtl.h: ...here.
* bt-load.c (note_btr_set): Use END_REGNO instead of END_HARD_REGNO.
* caller-save.c (mark_set_regs): Likewise.
* combine.c (move_deaths, distribute_notes): Likewise.
* cse.c (invalidate, invalidate_for_call): Likewise.
* df-scan.c (df_ref_record): Likewise.
* postreload-gcse.c (reg_changed_after_insn_p): Likewise.
(record_last_reg_set_info): Likewise.
* reg-stack.c (convert_regs_exit): Likewise.
* reload.c (reg_overlap_mentioned_for_reload_p): Likewise.
* resource.c (update_live_status): Likewise.
* rtlanal.c (find_reg_fusage, find_regno_fusage): Likewise.

Index: gcc/regs.h
===
--- gcc/regs.h  2015-05-18 08:36:41.951019042 +0100
+++ gcc/regs.h  2015-05-18 08:36:41.947019063 +0100
@@ -286,14 +286,6 @@ end_hard_regno (machine_mode mode, unsig
   return regno + hard_regno_nregs[regno][(int) mode];
 }
 
-/* Likewise for hard register X.  */
-
-#define END_HARD_REGNO(X) END_REGNO (X)
-
-/* Likewise for hard or pseudo register X.  */
-
-#define END_REGNO(X) (REGNO (X) + REG_NREGS (X))
-
 /* Add to REGS all the registers required to store a value of mode MODE
in register REGNO.  */
 
Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-18 08:36:41.951019042 +0100
+++ gcc/rtl.h   2015-05-18 08:37:14.494621384 +0100
@@ -1733,6 +1733,13 @@ rhs_regno (const_rtx x)
   return REG_CHECK (x)-regno;
 }
 
+/* Return the final register in REG X plus one.  */
+static inline unsigned int
+END_REGNO (const_rtx x)
+{
+  return REGNO (x) + REG_NREGS (x);
+}
+
 /* Change the REGNO and REG_NREGS of REG X to the specified values,
bypassing the df machinery.  */
 static inline void
Index: gcc/bt-load.c
===
--- gcc/bt-load.c   2015-05-18 08:36:41.951019042 +0100
+++ gcc/bt-load.c   2015-05-18 08:36:41.943019110 +0100
@@ -443,7 +443,7 @@ note_btr_set (rtx dest, const_rtx set AT
   if (!REG_P (dest))
 return;
   regno = REGNO (dest);
-  end_regno = END_HARD_REGNO (dest);
+  end_regno = END_REGNO (dest);
   for (; regno  end_regno; regno++)
 if (TEST_HARD_REG_BIT (all_btrs, regno))
   {
Index: gcc/caller-save.c
===
--- gcc/caller-save.c   2015-05-18 08:36:41.951019042 +0100
+++ gcc/caller-save.c   2015-05-18 08:36:41.947019063 +0100
@@ -992,7 +992,7 @@ mark_set_regs (rtx reg, const_rtx setter
REGNO (reg)  FIRST_PSEUDO_REGISTER)
 {
   regno = REGNO (reg);
-  endregno = END_HARD_REGNO (reg);
+  endregno = END_REGNO (reg);
 }
   else
 return;
Index: gcc/combine.c
===
--- gcc/combine.c   2015-05-18 08:36:41.951019042 +0100
+++ gcc/combine.c   2015-05-18 08:36:41.947019063 +0100
@@ -13316,8 +13316,8 @@ move_deaths (rtx x, rtx maybe_kill_insn,
   GET_MODE_SIZE (GET_MODE (x
{
  unsigned int deadregno = REGNO (XEXP (note, 0));
- unsigned int deadend = END_HARD_REGNO (XEXP (note, 0));
- unsigned int ourend = END_HARD_REGNO (x);
+ unsigned int deadend = END_REGNO (XEXP (note, 0));
+ unsigned int ourend = END_REGNO (x);
  unsigned int i;
 
  for (i = deadregno; i  deadend; i++)
@@ -13337,7 +13337,7 @@ move_deaths (rtx x, rtx maybe_kill_insn,
regno  FIRST_PSEUDO_REGISTER
REG_NREGS (x)  1)
{
- unsigned int ourend = END_HARD_REGNO (x);
+ unsigned int ourend = END_REGNO (x);
  unsigned int i, offset;
  rtx oldnotes = 0;
 
@@ -13932,7 +13932,7 @@ distribute_notes (rtx notes, rtx_insn *f
 
  if (place  REG_NREGS (XEXP (note, 0))  1)
{
- unsigned int endregno = END_HARD_REGNO (XEXP (note, 0));
+ unsigned int endregno = END_REGNO (XEXP (note, 0));
  bool all_used = true;
  unsigned int i;
 
Index: gcc/cse.c
===
--- gcc/cse.c   2015-05-18 08:36:41.951019042 +0100
+++ gcc/cse.c   2015-05-18 08:36:41.947019063 +0100
@@ -1894,7 +1894,7 @@ invalidate (rtx x, machine_mode full_mod
  {
HOST_WIDE_INT in_table
  = TEST_HARD_REG_BIT (hard_regs_in_table, regno);
-   unsigned int endregno = END_HARD_REGNO (x);
+   unsigned int endregno = END_REGNO (x);
unsigned int tregno, tendregno, rn;
struct table_elt *p, *next;
 
@@ -1920,7 

[2/9] Add a REG_NREGS macro

2015-05-18 Thread Richard Sandiford
This patch adds a REG_NREGS macro that by the end of the series
will simply be an rtx field access.  The definition in this patch
is just a placeholder (and so I haven't tried to avoid the double
evaluation of the parameter).

The diff has extra context because in most cases it makes it obvious
that we really are looking at hard_regno_nregs[REGNO (x)][GET_MODE (x)]
for some rtx x.  There are a couple of sites where it's more indirect
though.


gcc/
* rtl.h (REG_NREGS): New macro
* alias.c (record_set): Use it.
* cfgcleanup.c (mark_effect): Likewise.
* combine.c (likely_spilled_retval_1): Likewise.
(likely_spilled_retval_p, can_change_dest_mode): Likewise.
(move_deaths, distribute_notes): Likewise.
* cselib.c (cselib_record_set): Likewise.
* df-problems.c (df_simulate_one_insn_forwards): Likewise.
* df-scan.c (df_mark_reg): Likewise.
* dse.c (look_for_hardregs): Likewise.
* dwarf2out.c (reg_loc_descriptor): Likewise.
(multiple_reg_loc_descriptor): Likewise.
* expr.c (write_complex_part, read_complex_part): Likewise.
(emit_move_complex): Likewise.
* haifa-sched.c (setup_ref_regs): Likewise.
* ira-lives.c (mark_hard_reg_live): Likewise.
* lra.c (lra_set_insn_recog_data): Likewise.
* mode-switching.c (create_pre_exit): Likewise.
* postreload.c (reload_combine_recognize_const_pattern): Likewise.
(reload_combine_recognize_pattern): Likewise.
(reload_combine_note_use, move2add_record_mode): Likewise.
(reload_cse_move2add): Likewise.
* reg-stack.c (subst_stack_regs_pat): Likewise.
* regcprop.c (kill_value, copy_value): Likewise.
(copyprop_hardreg_forward_1): Likewise.
* regrename.c (verify_reg_in_set, scan_rtx_reg): Likewise.
(build_def_use): Likewise.
* sched-deps.c (mark_insn_reg_birth, mark_reg_death): Likewise.
(deps_analyze_insn): Likewise.
* sched-rgn.c (check_live_1, update_live_1): Likewise.
* sel-sched.c (count_occurrences_equiv): Likewise.
* valtrack.c (dead_debug_insert_temp): Likewise.

Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-18 08:19:43.475431381 +0100
+++ gcc/rtl.h   2015-05-18 08:19:43.459431568 +0100
@@ -1692,16 +1692,24 @@ #define LABEL_REF_LABEL(LABREF) XCEXP (L
 
 /* For a REG rtx, REGNO extracts the register number.  REGNO can only
be used on RHS.  Use SET_REGNO to change the value.  */
 #define REGNO(RTX) (rhs_regno(RTX))
 #define SET_REGNO(RTX,N) \
   (df_ref_change_reg_with_loc (REGNO (RTX), N, RTX), XCUINT (RTX, 0, REG) = N)
 #define SET_REGNO_RAW(RTX,N) (XCUINT (RTX, 0, REG) = N)
 
+/* Return the number of consecutive registers in a REG.  This is always
+   1 for pseudo registers and is determined by HARD_REGNO_NREGS for
+   hard registers.  */
+#define REG_NREGS(RTX) \
+  (REGNO (RTX)  FIRST_PSEUDO_REGISTER \
+   ? (unsigned int) hard_regno_nregs[REGNO (RTX)][GET_MODE (RTX)] \
+   : 1)
+
 /* ORIGINAL_REGNO holds the number the register originally had; for a
pseudo register turned into a hard reg this will hold the old pseudo
register number.  */
 #define ORIGINAL_REGNO(RTX) \
   (RTL_FLAG_CHECK1 (ORIGINAL_REGNO, (RTX), REG)-u2.original_regno)
 
 /* Force the REGNO macro to only be used on the lhs.  */
 static inline unsigned int
Index: gcc/alias.c
===
--- gcc/alias.c 2015-05-18 08:19:43.475431381 +0100
+++ gcc/alias.c 2015-05-18 08:19:43.455431618 +0100
@@ -1295,22 +1295,17 @@ record_set (rtx dest, const_rtx set, voi
 
   if (!REG_P (dest))
 return;
 
   regno = REGNO (dest);
 
   gcc_checking_assert (regno  reg_base_value-length ());
 
-  /* If this spans multiple hard registers, then we must indicate that every
- register has an unusable value.  */
-  if (regno  FIRST_PSEUDO_REGISTER)
-n = hard_regno_nregs[regno][GET_MODE (dest)];
-  else
-n = 1;
+  n = REG_NREGS (dest);
   if (n != 1)
 {
   while (--n = 0)
{
  bitmap_set_bit (reg_seen, regno + n);
  new_reg_base_value[regno + n] = 0;
}
   return;
Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c2015-05-18 08:19:43.475431381 +0100
+++ gcc/cfgcleanup.c2015-05-18 08:19:43.455431618 +0100
@@ -229,35 +229,33 @@ mark_effect (rtx exp, regset nonequal)
   /* In case we do clobber the register, mark it as equal, as we know the
 value is dead so it don't have to match.  */
 case CLOBBER:
   if (REG_P (XEXP (exp, 0)))
{
  dest = XEXP (exp, 0);
  regno = REGNO (dest);
  if (HARD_REGISTER_NUM_P (regno))
-   bitmap_clear_range (nonequal, regno,
-   hard_regno_nregs[regno][GET_MODE (dest)]);
+   bitmap_clear_range (nonequal, regno, 

[3/9] Clean up df_ref_change_reg_with_loc

2015-05-18 Thread Richard Sandiford
This patch cleans up the interface to df_ref_change_reg_with_loc.
The function is used only by SET_REGNO_RAW, so the old regno is
always REGNO (x) and the caller always goes on to set REGNO.
(And the fuction doesn't make much sense otherwise.)

The patch therefore gets df_ref_change_reg_with_loc to work out the old
regno itself and to install the new register number once it's done.

The check for the old register being -1 was redundant.  Only expr.c
and postreload.c create -1 registers (fixed in a later patch).
Both sites are just creating temporary registers in order to query
backend hooks and neither site needs the check.  expr.c does use
SET_REGNO and so does go through this function, but only at a time
when there's no df information.  (And it wouldn't work if the df
machinery were set up, since any change after the first would
look like a normal change.)  postreload.c uses SET_REGNO_RAW
and so bypasses the code altogether.

expr.c bypasses the code too by the end of the series.


gcc/
* df.h (df_ref_change_reg_with_loc): Remove old_regno parameter.
Change type of new_regno to unsigned int.
* df-scan.c (df_ref_change_reg_with_loc_1): Change type of
new_regno to unsigned int.
(df_ref_change_reg_with_loc): Remove old_regno parameter.
Change type of new_regno to unsigned int.  Use SET_REGNO_RAW.
* rtl.h (SET_REGNO): Update call to df_ref_change_reg_with_loc.
(SET_REGNO_RAW): Add space after ,.

Index: gcc/df.h
===
--- gcc/df.h2015-05-18 07:53:09.890871253 +0100
+++ gcc/df.h2015-05-18 07:53:09.890871253 +0100
@@ -1049,7 +1049,7 @@ extern void df_recompute_luids (basic_bl
 extern void df_insn_change_bb (rtx_insn *, basic_block);
 extern void df_maybe_reorganize_use_refs (enum df_ref_order);
 extern void df_maybe_reorganize_def_refs (enum df_ref_order);
-extern void df_ref_change_reg_with_loc (int, int, rtx);
+extern void df_ref_change_reg_with_loc (rtx, unsigned int);
 extern void df_notes_rescan (rtx_insn *);
 extern void df_hard_reg_init (void);
 extern void df_update_entry_block_defs (void);
Index: gcc/df-scan.c
===
--- gcc/df-scan.c   2015-05-18 07:53:09.890871253 +0100
+++ gcc/df-scan.c   2015-05-18 07:53:09.890871253 +0100
@@ -1819,7 +1819,7 @@ df_insn_change_bb (rtx_insn *insn, basic
 static void
 df_ref_change_reg_with_loc_1 (struct df_reg_info *old_df,
  struct df_reg_info *new_df,
- int new_regno, rtx loc)
+ unsigned int new_regno, rtx loc)
 {
   df_ref the_ref = old_df-reg_chain;
 
@@ -1904,25 +1904,33 @@ df_ref_change_reg_with_loc_1 (struct df_
 }
 
 
-/* Change the regno of all refs that contained LOC from OLD_REGNO to
-   NEW_REGNO.  Refs that do not match LOC are not changed which means
-   that artificial refs are not changed since they have no loc.  This
-   call is to support the SET_REGNO macro. */
+/* Change the regno of register LOC to NEW_REGNO and update the df
+   information accordingly.  Refs that do not match LOC are not changed
+   which means that artificial refs are not changed since they have no loc.
+   This call is to support the SET_REGNO macro. */
 
 void
-df_ref_change_reg_with_loc (int old_regno, int new_regno, rtx loc)
+df_ref_change_reg_with_loc (rtx loc, unsigned int new_regno)
 {
-  if ((!df) || (old_regno == -1) || (old_regno == new_regno))
+  unsigned int old_regno = REGNO (loc);
+  if (old_regno == new_regno)
 return;
 
-  df_grow_reg_info ();
+  if (df)
+{
+  df_grow_reg_info ();
 
-  df_ref_change_reg_with_loc_1 (DF_REG_DEF_GET (old_regno),
-   DF_REG_DEF_GET (new_regno), new_regno, loc);
-  df_ref_change_reg_with_loc_1 (DF_REG_USE_GET (old_regno),
-   DF_REG_USE_GET (new_regno), new_regno, loc);
-  df_ref_change_reg_with_loc_1 (DF_REG_EQ_USE_GET (old_regno),
-   DF_REG_EQ_USE_GET (new_regno), new_regno, loc);
+  df_ref_change_reg_with_loc_1 (DF_REG_DEF_GET (old_regno),
+   DF_REG_DEF_GET (new_regno),
+   new_regno, loc);
+  df_ref_change_reg_with_loc_1 (DF_REG_USE_GET (old_regno),
+   DF_REG_USE_GET (new_regno),
+   new_regno, loc);
+  df_ref_change_reg_with_loc_1 (DF_REG_EQ_USE_GET (old_regno),
+   DF_REG_EQ_USE_GET (new_regno),
+   new_regno, loc);
+}
+  SET_REGNO_RAW (loc, new_regno);
 }
 
 
Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-18 07:53:09.890871253 +0100
+++ gcc/rtl.h   2015-05-18 07:53:09.886871176 +0100
@@ -1693,9 +1693,8 @@ #define LABEL_REF_LABEL(LABREF) XCEXP (L
 /* For a REG rtx, REGNO extracts 

[4/9] Add a dedicated rtx union member for REGs

2015-05-18 Thread Richard Sandiford
This patch replaces the current REG i0 format with a dedicated structure,
so that we can make use of the extra 32 bits in the i field.

Of the places that iterate on formats and do something for 'i's,
most already handled REGs specially before the format walk and
so don't need to check for 'r'.  Otherwise it's mostly just a case
of adding dummy 'r' cases in order avoid the default gcc_unreachable ()
(in cases where we do the same for 'i').  The main exceptions are the
cselib.c and lra-constraints.c changes.

final.c:leaf_renumber_regs_insn handled REGs specially but then
went on to do a no-op walk of the format.  I just added an early
exit instead of an empty 'r' case.


gcc/
* rtl.def (REG): Change format to r.
* rtl.h (rtunion): Remove rt_reg.
(reg_info): New structure.
(rtx_def): Add reg field to main union.
(X0REGATTR): Delete.
(REG_CHECK): New macro.
(SET_REGNO_RAW, rhs_regno, REG_ATTRS): Use it.
* rtl.c (rtx_format): Document r.
(rtx_code_size): Handle REG specially.
* gengenrtl.c (special_format): Return true for formats
that include 'r'.
* gengtype.c (adjust_field_rtx_def): Handle 'r' fields.
Deal with REG_ATTRS after the field loop.
* emit-rtl.c (gen_raw_REG): Call rtx_alloc_stat directly.
* expmed.c (init_expmed): Call gen_raw_REG instead of
gen_rtx_raw_REG.
* expr.c (init_expr_target): Likewise.
* regcprop.c (maybe_mode_change): Likewise.
* varasm.c (make_decl_rtl): Likewise.
* final.c (leaf_renumber_regs_insn): Return early after
handling REGs.
* genemit.c (gen_exp): Handle 'r' fields.
* genpeep.c (match_rtx): Likewise.
* gensupport.c (subst_pattern_match): Likewise.
(get_alternatives_number, collect_insn_data, alter_predicate_for_insn)
(alter_constraints, subst_dup): Likewise.
* read-rtl.c (read_rtx_code): Likewise.
* print-rtl.c (print_rtx): Likewise.
* genrecog.c (find_operand, find_matching_operand): Likewise.
(validate_pattern, match_pattern_2): Likewise.
(parameter::UINT, rtx_test::REGNO_FIELD): New enum values.
(rtx_test::regno_field): New function.
(operator ==, safe_to_hoist_p, transition_parameter_type)
(parameter_type_string, print_parameter_value)
(print_nonbool_test, print_test): Handle new enum values.
* cselib.c (rtx_equal_for_cselib_1): Handle REG specially.
* lra-constraints.c (operands_match_p): Likewise.

Index: gcc/rtl.def
===
--- gcc/rtl.def 2015-05-18 07:53:15.014808321 +0100
+++ gcc/rtl.def 2015-05-18 07:53:15.010808427 +0100
@@ -381,7 +381,7 @@ DEF_RTL_EXPR(PC, pc, , RTX_OBJ)
points to a reg_attrs structure.
This rtx needs to have as many (or more) fields as a MEM, since we
can change REG rtx's into MEMs during reload.  */
-DEF_RTL_EXPR(REG, reg, i0, RTX_OBJ)
+DEF_RTL_EXPR(REG, reg, r, RTX_OBJ)
 
 /* A scratch register.  This represents a register used only within a
single insn.  It will be replaced by a REG during register allocation
Index: gcc/rtl.h
===
--- gcc/rtl.h   2015-05-18 07:53:15.014808321 +0100
+++ gcc/rtl.h   2015-05-18 07:53:15.014808321 +0100
@@ -201,11 +201,21 @@ struct GTY((for_user)) reg_attrs {
   tree rt_tree;
   basic_block rt_bb;
   mem_attrs *rt_mem;
-  reg_attrs *rt_reg;
   struct constant_descriptor_rtx *rt_constant;
   struct dw_cfi_node *rt_cfi;
 };
 
+/* Describes the properties of a REG.  */
+struct GTY(()) reg_info {
+  /* The value of REGNO.  */
+  unsigned int regno;
+
+  unsigned int unused : 32;
+
+  /* The value of REG_ATTRS.  */
+  reg_attrs *attrs;
+};
+
 /* This structure remembers the position of a SYMBOL_REF within an
object_block structure.  A SYMBOL_REF only provides this information
if SYMBOL_REF_HAS_BLOCK_INFO_P is true.  */
@@ -395,6 +405,7 @@ struct GTY((desc(0), tag(0),
   union u {
 rtunion fld[1];
 HOST_WIDE_INT hwint[1];
+struct reg_info reg;
 struct block_symbol block_sym;
 struct real_value rv;
 struct fixed_value fv;
@@ -1070,6 +1081,13 @@ #define XCNMPFV(RTX, C, M) __extension__
 __LINE__, __FUNCTION__);   \
_rtx-u.fv; })
 
+#define REG_CHECK(RTX) __extension__   \
+({ __typeof (RTX) const _rtx = (RTX);  \
+   if (GET_CODE (_rtx) != REG) \
+ rtl_check_failed_code1 (_rtx, REG,  __FILE__, __LINE__,   \
+__FUNCTION__); \
+   _rtx-u.reg; })
+
 #define BLOCK_SYMBOL_CHECK(RTX) __extension__  \
 ({ __typeof (RTX) const _symbol = (RTX);   \
const unsigned int flags = SYMBOL_REF_FLAGS 

[5/9] Create sensible dummy registers

2015-05-18 Thread Richard Sandiford
Some pieces of code create a temporary REG or MEM and only fill it
in later when they're testing the cost of a particular rtx.  This patch
makes sure that even the dummy REG or MEM is valid, rather than force
the gen_* code to handle garbage values.


gcc/
* caller-save.c (init_caller_save): Use word_mode and
FIRST_PSEUDO_REGISTER when creating temporary rtxes.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload.c (reload_cse_regs_1): Likewise.

Index: gcc/caller-save.c
===
--- gcc/caller-save.c   2015-05-16 15:50:58.063567104 +0100
+++ gcc/caller-save.c   2015-05-16 15:50:58.055567198 +0100
@@ -287,8 +287,8 @@ init_caller_save (void)
  To avoid lots of unnecessary RTL allocation, we construct all the RTL
  once, then modify the memory and register operands in-place.  */
 
-  test_reg = gen_rtx_REG (VOIDmode, 0);
-  test_mem = gen_rtx_MEM (VOIDmode, address);
+  test_reg = gen_rtx_REG (word_mode, FIRST_PSEUDO_REGISTER);
+  test_mem = gen_rtx_MEM (word_mode, address);
   savepat = gen_rtx_SET (test_mem, test_reg);
   restpat = gen_rtx_SET (test_reg, test_mem);
 
Index: gcc/expr.c
===
--- gcc/expr.c  2015-05-16 15:50:58.063567104 +0100
+++ gcc/expr.c  2015-05-16 15:50:58.059567152 +0100
@@ -202,12 +202,12 @@ init_expr_target (void)
   /* Try indexing by frame ptr and try by stack ptr.
  It is known that on the Convex the stack ptr isn't a valid index.
  With luck, one or the other is valid on any machine.  */
-  mem = gen_rtx_MEM (VOIDmode, stack_pointer_rtx);
-  mem1 = gen_rtx_MEM (VOIDmode, frame_pointer_rtx);
+  mem = gen_rtx_MEM (word_mode, stack_pointer_rtx);
+  mem1 = gen_rtx_MEM (word_mode, frame_pointer_rtx);
 
   /* A scratch register we can modify in-place below to avoid
  useless RTL allocations.  */
-  reg = gen_rtx_REG (VOIDmode, -1);
+  reg = gen_rtx_REG (word_mode, FIRST_PSEUDO_REGISTER);
 
   insn = rtx_alloc (INSN);
   pat = gen_rtx_SET (NULL_RTX, NULL_RTX);
Index: gcc/ira.c
===
--- gcc/ira.c   2015-05-16 15:50:58.063567104 +0100
+++ gcc/ira.c   2015-05-16 15:50:58.055567198 +0100
@@ -1767,8 +1767,8 @@ setup_prohibited_mode_move_regs (void)
   if (ira_prohibited_mode_move_regs_initialized_p)
 return;
   ira_prohibited_mode_move_regs_initialized_p = true;
-  test_reg1 = gen_rtx_REG (VOIDmode, 0);
-  test_reg2 = gen_rtx_REG (VOIDmode, 0);
+  test_reg1 = gen_rtx_REG (word_mode, FIRST_PSEUDO_REGISTER);
+  test_reg2 = gen_rtx_REG (word_mode, FIRST_PSEUDO_REGISTER);
   move_pat = gen_rtx_SET (test_reg1, test_reg2);
   move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, move_pat, 0, -1, 0);
   for (i = 0; i  NUM_MACHINE_MODES; i++)
Index: gcc/postreload.c
===
--- gcc/postreload.c2015-05-16 15:50:58.063567104 +0100
+++ gcc/postreload.c2015-05-16 15:50:58.055567198 +0100
@@ -234,7 +234,7 @@ reload_cse_regs_1 (void)
   bool cfg_changed = false;
   basic_block bb;
   rtx_insn *insn;
-  rtx testreg = gen_rtx_REG (VOIDmode, -1);
+  rtx testreg = gen_rtx_REG (word_mode, FIRST_PSEUDO_REGISTER);
 
   cselib_init (CSELIB_RECORD_MEMORY);
   init_alias_analysis ();



[9/9] Simplify register bitmap operations

2015-05-18 Thread Richard Sandiford
After the previous changes, several callers are left with code like:

  if (HARD_REGISTER_P (x))
bitmap_foo_range (x, REGNO (x), REG_NREGS (x));
  else
bitmap_foo (x, REGNO (x));

These might as well now be:

  if (REG_NREGS (x)  1)
bitmap_foo_range (x, REGNO (x), REG_NREGS (x));
  else
bitmap_foo (x, REGNO (x));

since REG_NREGS is as cheap to test, and since single-register hard REGs
are the common case.  But if separating out the cases is a win -- and it
seems to be, very slightly -- then it would be better to add the shortcut
to the range functions themselves.


gcc/
* bitmap.c (bitmap_set_range): Handle count==1 specially.
(bitmap_clear_range): Likewise.
* cfgcleanup.c (mark_effect): Use bitmap_clear_range and
bitmap_set_range unconditionally.
* df-problems.c (df_simulate_one_insn_forwards): Likewise.
* df-scan.c (df_mark_reg): Likewise.
* haifa-sched.c (setup_ref_regs): Likewise.
* sched-rgn.c (update_live_1): Likewise.

Index: gcc/bitmap.c
===
--- gcc/bitmap.c2015-05-18 08:38:25.845752816 +0100
+++ gcc/bitmap.c2015-05-18 08:38:25.841752865 +0100
@@ -1212,6 +1212,12 @@ bitmap_set_range (bitmap head, unsigned
   if (!count)
 return;
 
+  if (count == 1)
+{
+  bitmap_set_bit (head, start);
+  return;
+}
+
   first_index = start / BITMAP_ELEMENT_ALL_BITS;
   end_bit_plus1 = start + count;
   last_index = (end_bit_plus1 - 1) / BITMAP_ELEMENT_ALL_BITS;
@@ -1311,6 +1317,12 @@ bitmap_clear_range (bitmap head, unsigne
   if (!count)
 return;
 
+  if (count == 1)
+{
+  bitmap_clear_bit (head, start);
+  return;
+}
+
   first_index = start / BITMAP_ELEMENT_ALL_BITS;
   end_bit_plus1 = start + count;
   last_index = (end_bit_plus1 - 1) / BITMAP_ELEMENT_ALL_BITS;
Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c2015-05-18 08:38:25.845752816 +0100
+++ gcc/cfgcleanup.c2015-05-18 08:38:25.841752865 +0100
@@ -222,22 +222,15 @@ try_simplify_condjump (basic_block cbran
 static bool
 mark_effect (rtx exp, regset nonequal)
 {
-  int regno;
   rtx dest;
   switch (GET_CODE (exp))
 {
   /* In case we do clobber the register, mark it as equal, as we know the
 value is dead so it don't have to match.  */
 case CLOBBER:
-  if (REG_P (XEXP (exp, 0)))
-   {
- dest = XEXP (exp, 0);
- regno = REGNO (dest);
- if (HARD_REGISTER_NUM_P (regno))
-   bitmap_clear_range (nonequal, regno, REG_NREGS (dest));
- else
-   bitmap_clear_bit (nonequal, regno);
-   }
+  dest = XEXP (exp, 0);
+  if (REG_P (dest))
+   bitmap_clear_range (nonequal, REGNO (dest), REG_NREGS (dest));
   return false;
 
 case SET:
@@ -248,11 +241,7 @@ mark_effect (rtx exp, regset nonequal)
return false;
   if (!REG_P (dest))
return true;
-  regno = REGNO (dest);
-  if (HARD_REGISTER_NUM_P (regno))
-   bitmap_set_range (nonequal, regno, REG_NREGS (dest));
-  else
-   bitmap_set_bit (nonequal, regno);
+  bitmap_set_range (nonequal, REGNO (dest), REG_NREGS (dest));
   return false;
 
 default:
Index: gcc/df-problems.c
===
--- gcc/df-problems.c   2015-05-18 08:38:25.845752816 +0100
+++ gcc/df-problems.c   2015-05-18 08:38:25.845752816 +0100
@@ -3574,11 +3574,7 @@ df_simulate_one_insn_forwards (basic_blo
case REG_UNUSED:
  {
rtx reg = XEXP (link, 0);
-   int regno = REGNO (reg);
-   if (HARD_REGISTER_NUM_P (regno))
- bitmap_clear_range (live, regno, REG_NREGS (reg));
-   else
- bitmap_clear_bit (live, regno);
+   bitmap_clear_range (live, REGNO (reg), REG_NREGS (reg));
  }
  break;
default:
Index: gcc/df-scan.c
===
--- gcc/df-scan.c   2015-05-18 08:38:25.845752816 +0100
+++ gcc/df-scan.c   2015-05-18 08:38:25.841752865 +0100
@@ -3518,15 +3518,7 @@ df_get_eh_block_artificial_uses (bitmap
 static void
 df_mark_reg (rtx reg, void *vset)
 {
-  bitmap set = (bitmap) vset;
-  int regno = REGNO (reg);
-
-  gcc_assert (GET_MODE (reg) != BLKmode);
-
-  if (regno  FIRST_PSEUDO_REGISTER)
-bitmap_set_range (set, regno, REG_NREGS (reg));
-  else
-bitmap_set_bit (set, regno);
+  bitmap_set_range ((bitmap) vset, REGNO (reg), REG_NREGS (reg));
 }
 
 
Index: gcc/haifa-sched.c
===
--- gcc/haifa-sched.c   2015-05-18 08:38:25.845752816 +0100
+++ gcc/haifa-sched.c   2015-05-18 08:38:25.845752816 +0100
@@ -1032,17 +1032,13 @@ initiate_reg_pressure_info (bitmap live)
 static void
 setup_ref_regs (rtx x)
 {
-  int i, j, regno;
+  int i, j;
   const RTX_CODE code = 

Re: [PATCH i386] Allow sibcalls in no-PLT PIC

2015-05-18 Thread Jan Hubicka
 
 After this message the discussion diverged in the direction of H.J.Lu's
 proposed relaxation scheme involving new type of relocations.
 
 I'm not clear if my patch is actually approved.  I'd like to point out that it
 doesn't clash with H.J.Lu's work.  It improves codegen by allowing sibcalls in
 more circumstances.

Yes, the original patch is OK.

Honza
 
 Alexander


Re: [PATCH][ARM] PR/65711: Don't pass '-dynamic-linker' when '-shared' is used

2015-05-18 Thread Ludovic Courtès
Ramana Radhakrishnan ramana@googlemail.com skribis:

 On Thu, Apr 23, 2015 at 9:29 AM, Ludovic Courtès l...@gnu.org wrote:
 As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65711.

 Patch is for both 4.8 and 4.9 (possibly 5.1 too, I haven’t checked.)


 OK for trunk. This is also ok for all release branches if no
 objections in 24 hours.

OK, thank you.

I haven’t applied for write-after-approval so perhaps you should commit
it yourself?

Ludo’.


Re: [1/9] Make more use of END_REGNO

2015-05-18 Thread Jeff Law

On 05/18/2015 12:11 PM, Richard Sandiford wrote:

This patch just handles some obvious cases where END_REGNO could be used.


gcc/
* cfgcleanup.c (mentions_nonequal_regs): Use END_REGNO.
* dse.c (note_add_store): Likewise.
* ira-lives.c (mark_hard_reg_dead): Likewise.
* loop-invariant.c (mark_reg_store): Likewise.
(mark_reg_death): Likewise.
* postreload.c (reload_combine): Likewise.
(reload_combine_note_store): Likewise.
(reload_combine_note_use): Likewise.
* recog.c (peep2_reg_dead_p): Likewise.

OK.
jeff



Fix r223258 fallout in gfortran.dg

2015-05-18 Thread Steve Kargl
I've committed the attached patch to fix testsuite
regressions casued by r223258.  The interface 
statements needed to be within the porgram unit.
 
-- 
Steve
Index: ChangeLog
===
--- ChangeLog	(revision 223308)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2015-05-18  Steven G. Kargl  ka...@gcc.gnu.org
+
+	* gfortran.dg/lto/pr41521_0.f90: Move INTERFACE statement in program
+	unit.
+	* gfortran.dg/lto/pr41576_1.f90: Ditto.
+
 2015-05-12  Andreas Tobler  andre...@gcc.gnu.org
 
 	* lib/target-supports.exp (check_effective_target_pie): Add *-*-freebsd*
Index: gfortran.dg/lto/pr41521_0.f90
===
--- gfortran.dg/lto/pr41521_0.f90	(revision 223308)
+++ gfortran.dg/lto/pr41521_0.f90	(working copy)
@@ -3,9 +3,6 @@
 program species
 integer spk(2)
 real eval(2)
-spk = 2
-call atom(1.1,spk,eval)
-end program
 interface
   subroutine atom(sol,k,eval)
 real, intent(in) :: sol
@@ -13,4 +10,7 @@ interface
 real, intent(out) :: eval(2)
   end subroutine
 end interface
+spk = 2
+call atom(1.1,spk,eval)
+end program
 
Index: gfortran.dg/lto/pr41576_1.f90
===
--- gfortran.dg/lto/pr41576_1.f90	(revision 223308)
+++ gfortran.dg/lto/pr41576_1.f90	(working copy)
@@ -1,12 +1,11 @@
 program test
   common /bar/ c, d
   integer(4) :: c, d
-  call foo
-  if (c/=1 .or. d/=2) call abort
-end program test
-
 interface
   subroutine foo()
   end subroutine
 end interface
+  call foo
+  if (c/=1 .or. d/=2) call abort
+end program test
 


[PATCH] Fix match.pd narrowing opt (PR tree-optimization/66187)

2015-05-18 Thread Jakub Jelinek
Hi!

As the testcases show, for signed types we really should use SIGNED rather
than UNSIGNED as tree_int_cst_min_precision argument, that function doesn't
really do the desired thing with UNSIGNED for negative values and with
-fwrapv we just want the narrowing cast to not lose anything from the
number.  Additionally, as for TYPE_OVERFLOW_WRAPS case we perform the +/-/ in
unsigned type,  we have to avoid all negative masks, because those surely
have the higher bits set.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For plus and unsigned type, we could improve it even further, as in that
case we could just check if the bit immediately above precision is clear
in the mask, higher bits than that are uninteresting, because the addition
will have always zeros there.  Perhaps as a follow-up?
I mean say uchar foo (uchar a, uchar b) { return (x + y)  0x72fe; }
can be done as uchar addition  0xfe, but not e.g.  0x71fe.

2015-05-18  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/66187
* match.pd ((bit_and (plus/minus (convert @0) (convert @1)) mask)):
Pass TYPE_SIGN to tree_int_cst_min_precision.  If
!TYPE_OVERFLOW_WRAPS, ensure @4 is non-negative.

* gcc.c-torture/execute/pr66187.c: New test.
* gcc.dg/pr66187-1.c: New test.
* gcc.dg/pr66187-2.c: New test.

--- gcc/match.pd.jj 2015-05-18 09:46:39.0 +0200
+++ gcc/match.pd2015-05-18 10:20:24.944475737 +0200
@@ -1115,8 +1115,10 @@ (define_operator_list CBRT BUILT_IN_CBRT
 /* The inner conversion must be a widening conversion.  */
  TYPE_PRECISION (TREE_TYPE (@2))  TYPE_PRECISION (TREE_TYPE (@0))
  types_match (@0, @1)
- (tree_int_cst_min_precision (@4, UNSIGNED)
+ (tree_int_cst_min_precision (@4, TYPE_SIGN (TREE_TYPE (@0)))
 = TYPE_PRECISION (TREE_TYPE (@0)))
+ (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+|| tree_int_cst_sgn (@4) = 0)
  single_use (@5))
   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
(with { tree ntype = TREE_TYPE (@0); }
--- gcc/testsuite/gcc.c-torture/execute/pr66187.c.jj2015-05-18 
10:53:43.953654893 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr66187.c   2015-05-18 
10:53:28.0 +0200
@@ -0,0 +1,16 @@
+/* PR tree-optimization/66187 */
+
+int a = 1, e = -1;
+short b, f;
+
+int
+main ()
+{
+  f = e;
+  int g = b  0 ? 0 : f + b;
+  if ((g  -4)  0)
+a = 0;
+  if (a)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr66187-1.c.jj 2015-05-18 10:54:30.677906029 +0200
+++ gcc/testsuite/gcc.dg/pr66187-1.c2015-05-18 12:30:17.0 +0200
@@ -0,0 +1,97 @@
+/* PR tree-optimization/66187 */
+/* { dg-do run } */
+/* { dg-options -O2 -fno-wrapv } */
+
+__attribute__((noinline, noclone)) int
+f0 (unsigned char x, unsigned char y)
+{
+  return (x + y)  0x2ff;
+}
+
+__attribute__((noinline, noclone)) int
+f1 (unsigned char x, unsigned char y)
+{
+  return (x - y)  0x2ff;
+}
+
+__attribute__((noinline, noclone)) int
+f2 (signed char x, signed char y)
+{
+  return (x + y)  -4;
+}
+
+__attribute__((noinline, noclone)) int
+f3 (signed char x, signed char y)
+{
+  return (x + y)  0xf8;
+}
+
+__attribute__((noinline, noclone)) int
+f4 (signed char x, signed char y)
+{
+  return (x + y)  0x78;
+}
+
+__attribute__((noinline, noclone)) int
+f5 (unsigned char x, unsigned char y)
+{
+  int a = x;
+  int b = y;
+  int c = a + b;
+  return c  0x2ff;
+}
+
+__attribute__((noinline, noclone)) int
+f6 (unsigned char x, unsigned char y)
+{
+  int a = x;
+  int b = y;
+  int c = a - b;
+  return c  0x2ff;
+}
+
+__attribute__((noinline, noclone)) int
+f7 (signed char x, signed char y)
+{
+  int a = x;
+  int b = y;
+  int c = a + b;
+  return c  -4;
+}
+
+__attribute__((noinline, noclone)) int
+f8 (signed char x, signed char y)
+{
+  int a = x;
+  int b = y;
+  int c = a + b;
+  return c  0xf8;
+}
+
+__attribute__((noinline, noclone)) int
+f9 (signed char x, signed char y)
+{
+  int a = x;
+  int b = y;
+  int c = a + b;
+  return c  0x78;
+}
+
+int
+main ()
+{
+  if (__SCHAR_MAX__ != 127 || sizeof (int) != 4)
+return 0;
+  if (f0 (0xff, 0xff) != 0xfe
+  || f1 (0, 1) != 0x2ff
+  || f2 (-2, 1) != -4
+  || f3 (-2, 1) != 0xf8
+  || f4 (-2, 1) != 0x78
+  || f5 (0xff, 0xff) != 0xfe
+  || f6 (0, 1) != 0x2ff
+  || f7 (-2, 1) != -4
+  || f8 (-2, 1) != 0xf8
+  || f9 (-2, 1) != 0x78)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr66187-2.c.jj 2015-05-18 12:31:39.398176973 +0200
+++ gcc/testsuite/gcc.dg/pr66187-2.c2015-05-18 12:31:46.319067151 +0200
@@ -0,0 +1,97 @@
+/* PR tree-optimization/66187 */
+/* { dg-do run } */
+/* { dg-options -O2 -fwrapv } */
+
+__attribute__((noinline, noclone)) int
+f0 (unsigned char x, unsigned char y)
+{
+  return (x + y)  0x2ff;
+}
+
+__attribute__((noinline, noclone)) int
+f1 (unsigned char x, unsigned char y)
+{
+  return (x - y)  0x2ff;
+}
+

[PATCH] plugin event for C/C++ function definitions

2015-05-18 Thread andres . tiraboschi
Hi, this patch adds two new plugin events PLUGIN_START_PARSE_FUNCTION and 
PLUGIN_FINISH_PARSE_FUNCTION. These events are invoked at start_function and 
finish_function in gcc/c/c-decl.c and gcc/cp/decl.c respectively in the C and 
C++ frontends. 
PLUGIN_START_PARSE_FUNCTION is called before parsing a function body.
PLUGIN_FINISH_PARSE_FUNCTION is called after parsing a function definition.
This patch has been implemented in gcc 5.1.0.

changelog:

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index e28a294..fcc849d 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -8235,6 +8235,7 @@ start_function (struct c_declspecs *declspecs, struct 
c_declarator *declarator,
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, true, NULL,
  attributes, NULL, NULL, DEPRECATED_NORMAL);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
 
   /* If the declarator is not suitable for a function definition,
  cause a syntax error.  */
@@ -9050,6 +9051,7 @@ finish_function (void)
  It's still in DECL_STRUCT_FUNCTION, and we'll restore it in
  tree_rest_of_compilation.  */
   set_cfun (NULL);
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, 
current_function_decl);
   current_function_decl = NULL;
 }
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c4731ae..bde92cc 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13727,6 +13727,7 @@ start_function (cp_decl_specifier_seq *declspecs,
   tree decl1;
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, 1, attrs);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
   if (decl1 == error_mark_node)
 return false;
   /* If the declarator is not suitable for a function definition,
@@ -14365,6 +14366,7 @@ finish_function (int flags)
   vec_free (deferred_mark_used_calls);
 }
 
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, fndecl);
   return fndecl;
 }
 
diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi
index c6caa19..d50f25c 100644
--- a/gcc/doc/plugins.texi
+++ b/gcc/doc/plugins.texi
@@ -174,6 +174,8 @@ Callbacks can be invoked at the following pre-determined 
events:
 @smallexample
 enum plugin_event
 @{
+  PLUGIN_START_PARSE_FUNCTION,  /* Called before parsing the body of a 
function. */
+  PLUGIN_FINISH_PARSE_FUNCTION, /* After finishing parsing a function. */
   PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager.  */
   PLUGIN_FINISH_TYPE,   /* After finishing parsing a type.  */
   PLUGIN_FINISH_DECL,   /* After finishing parsing a declaration. */
diff --git a/gcc/plugin.c b/gcc/plugin.c
index d924438..628833f 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -441,6 +441,8 @@ register_callback (const char *plugin_name,
return;
  }
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
@@ -519,6 +521,8 @@ invoke_plugin_callbacks_full (int event, void *gcc_data)
gcc_assert (event = PLUGIN_EVENT_FIRST_DYNAMIC);
gcc_assert (event  event_last);
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
diff --git a/gcc/plugin.def b/gcc/plugin.def
index 98c988a..2a7e4c2 100644
--- a/gcc/plugin.def
+++ b/gcc/plugin.def
@@ -17,6 +17,11 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
+/* Called before parsing the body of a function.  */
+DEFEVENT (PLUGIN_START_PARSE_FUNCTION)
+
+/* After finishing parsing a function. */
+DEFEVENT (PLUGIN_FINISH_PARSE_FUNCTION)
 
 /* To hook into pass manager.  */
 DEFEVENT (PLUGIN_PASS_MANAGER_SETUP)
diff --git a/gcc/testsuite/g++.dg/plugin/def-plugin-test.C 
b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
new file mode 100644
index 000..b7f2d3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
@@ -0,0 +1,13 @@
+int global = 12;
+
+int function1(void);
+
+int function2(int a) // { dg-warning Start fndef function2 }
+{
+  return function1() + a;
+} //  { dg-warning Finish fndef function2 }
+
+int function1(void) // { dg-warning Start fndef function1 }
+{
+  return global + 1;
+} //  { dg-warning Finish fndef function1 }
diff --git a/gcc/testsuite/g++.dg/plugin/def_plugin.c 
b/gcc/testsuite/g++.dg/plugin/def_plugin.c
new file mode 100644
index 000..63983c5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/plugin/def_plugin.c
@@ -0,0 +1,45 @@
+/* A plugin example that shows which function definitions are caught by 
PLUGIN_START_FUNCTION and PLUGIN_FINISH_FUNCTION */
+
+#include gcc-plugin.h
+#include stdlib.h
+#include config.h
+#include system.h
+#include coretypes.h
+#include tree.h
+#include tree-pass.h
+#include intl.h
+#include diagnostic.h
+
+int 

Re: [PATCH] Refactoring: use std::swap instead of manual swaps

2015-05-18 Thread Trevor Saunders
On Mon, May 18, 2015 at 09:07:43PM +0300, Mikhail Maltsev wrote:
 Hi all.
 The attached patch replaces the following typical sequence of
 assignments, matching the pattern:
 
 [type] temp_var = value1;
 value1 = value2;
 value2 = temp_var;
 
 with
 
 std::swap (value1, value2);

I believe that transformation is considered obvious, so you can just go
ahead and commit all the parts that are just that.  That should make
reviewers lives much simpler ;-)

 The goal is to improve code readability (at least a little).
 
 Pattern was searched in GCC tree using a script, but all modifications
 have been done manually to make sure that removing assignment to
 temp_var does not modify the behavior of the code. Where possible,
 temporary variables were removed or moved to blocks with narrower scope.
 I also removed a couple of pre-C++ implementations of swap function and
 one implementation of std::reverse.

I haven't looked at the cases where you removed functions, but I suspect
you can call those obvious too, and save yourself any rebasing pain.

thanks!

Trev



Re: [2/9] Add a REG_NREGS macro

2015-05-18 Thread Jeff Law

On 05/18/2015 12:13 PM, Richard Sandiford wrote:

This patch adds a REG_NREGS macro that by the end of the series
will simply be an rtx field access.  The definition in this patch
is just a placeholder (and so I haven't tried to avoid the double
evaluation of the parameter).

The diff has extra context because in most cases it makes it obvious
that we really are looking at hard_regno_nregs[REGNO (x)][GET_MODE (x)]
for some rtx x.  There are a couple of sites where it's more indirect
though.


gcc/
* rtl.h (REG_NREGS): New macro
* alias.c (record_set): Use it.
* cfgcleanup.c (mark_effect): Likewise.
* combine.c (likely_spilled_retval_1): Likewise.
(likely_spilled_retval_p, can_change_dest_mode): Likewise.
(move_deaths, distribute_notes): Likewise.
* cselib.c (cselib_record_set): Likewise.
* df-problems.c (df_simulate_one_insn_forwards): Likewise.
* df-scan.c (df_mark_reg): Likewise.
* dse.c (look_for_hardregs): Likewise.
* dwarf2out.c (reg_loc_descriptor): Likewise.
(multiple_reg_loc_descriptor): Likewise.
* expr.c (write_complex_part, read_complex_part): Likewise.
(emit_move_complex): Likewise.
* haifa-sched.c (setup_ref_regs): Likewise.
* ira-lives.c (mark_hard_reg_live): Likewise.
* lra.c (lra_set_insn_recog_data): Likewise.
* mode-switching.c (create_pre_exit): Likewise.
* postreload.c (reload_combine_recognize_const_pattern): Likewise.
(reload_combine_recognize_pattern): Likewise.
(reload_combine_note_use, move2add_record_mode): Likewise.
(reload_cse_move2add): Likewise.
* reg-stack.c (subst_stack_regs_pat): Likewise.
* regcprop.c (kill_value, copy_value): Likewise.
(copyprop_hardreg_forward_1): Likewise.
* regrename.c (verify_reg_in_set, scan_rtx_reg): Likewise.
(build_def_use): Likewise.
* sched-deps.c (mark_insn_reg_birth, mark_reg_death): Likewise.
(deps_analyze_insn): Likewise.
* sched-rgn.c (check_live_1, update_live_1): Likewise.
* sel-sched.c (count_occurrences_equiv): Likewise.
* valtrack.c (dead_debug_insert_temp): Likewise.


OK.

Makes me wonder if we want to poison hard_regno_nregs in some way to 
avoid folks re-introducing this stuff in the future...  Your call if you 
want to tackle that as a follow-up.


jeff




Re: [3/9] Clean up df_ref_change_reg_with_loc

2015-05-18 Thread Jeff Law

On 05/18/2015 12:15 PM, Richard Sandiford wrote:

This patch cleans up the interface to df_ref_change_reg_with_loc.
The function is used only by SET_REGNO_RAW, so the old regno is
always REGNO (x) and the caller always goes on to set REGNO.
(And the fuction doesn't make much sense otherwise.)

The patch therefore gets df_ref_change_reg_with_loc to work out the old
regno itself and to install the new register number once it's done.

The check for the old register being -1 was redundant.  Only expr.c
and postreload.c create -1 registers (fixed in a later patch).
Both sites are just creating temporary registers in order to query
backend hooks and neither site needs the check.  expr.c does use
SET_REGNO and so does go through this function, but only at a time
when there's no df information.  (And it wouldn't work if the df
machinery were set up, since any change after the first would
look like a normal change.)  postreload.c uses SET_REGNO_RAW
and so bypasses the code altogether.

expr.c bypasses the code too by the end of the series.


gcc/
* df.h (df_ref_change_reg_with_loc): Remove old_regno parameter.
Change type of new_regno to unsigned int.
* df-scan.c (df_ref_change_reg_with_loc_1): Change type of
new_regno to unsigned int.
(df_ref_change_reg_with_loc): Remove old_regno parameter.
Change type of new_regno to unsigned int.  Use SET_REGNO_RAW.
* rtl.h (SET_REGNO): Update call to df_ref_change_reg_with_loc.
(SET_REGNO_RAW): Add space after ,.

OK.
jeff



Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Torvald Riegel
On Wed, 2015-04-29 at 23:23 -0400, Nuno Diegues wrote: 
 Hello,
 
 I have taken the chance to improve the patch by addressing the
 comments above in this thread.
 Namely:
  - to use a simple random generator managed inside the library only
  - removed floating point usage and replaced by fixed arithmetic
  - added some comments where relevant

First of all, sorry for taking so long to review this.  Thank you for
the contribution.

There are several high-level changes that I'd like to see (or be
convinced of that those aren't necessary).  I'll discuss these first.
I'll have more detailed comments inlined in the patch.


My major concern is about rdtsc being used.  The relation to frequency
adaption that Andi mentioned is one issue, but the bigger issue to me is
that the runtime of transactions might vary a lot, and that the relation
of txnal vs. nontxnal code executed in a period might vary a lot too.

Are there better options for the utility function, or can we tune it to
be less affected by varying txn length and likelihood of txnal vs.
nontxnal code?  What are the things we want to avoid with the tuning?  I
can think of:
* Not needing to wait for serial txns, or being aborted by a serial txn.
* Not retrying using HTM too much so that the retry time is larger than
the scalability we actually gain by having other txns commit
concurrently.

Anything else?  Did you consider other utility functions during your
research?  Also, note that the mitigation strategy for rdtsc
short-comings that you mention in the paper is not applicable in
general, specifically not in libitm.


Another issue is that we need to implement the tuning in a portable way.
You currently make it depend on whether the assembler knows about RTM,
whereas the rest of the code makes this more portable and defers to
arch-specific files.  I'd prefer if we could use the tuning on other
archs too.  But for that, we need to cleanly separate generic from
arch-specific parts.  That affects magic numbers as well as things like
rdtsc().


Generally, the patch needs to include more comments.  Most importantly,
we need to document why we implemented things the way we did, or do
tuning the way we do.  In cases where the intent isn't trivial to see
nor the code is trivial, explain the intent or reference the place where
you document the big picture.
A good rule of thumb is adding comments whenever it's not simple to see
why we use one of several possible implementation alternatives.

The magic numbers being used are a good example.  Make them constants
and don't just put them directly in the code, and then add documentation
why you chose this number and why it's okay to make that choice.  If it
isn't necessarily okay (eg, other archs, future systems), but you don't
have a better solution right now, add something like ??? Not ideal
because of XYZ..  If there's a source for some of the magic numbers,
cite it (e.g., [28] in your paper might be one for the tuning
thresholds, I guess).


Reoptimizing only in a specific, fixed thread is insufficient in the
general case.  There may be a thread that only runs an initial txn and
then never another one; if this happens to be the last thread
registered, you'll never get any tuning.  If we want the tuning to be
effective, one of the threads *actively* running txns needs to tune
eventually, always.

Depending on that, you'll probably have to change the sync between the
tuning thread and others.  Serial mode may be a simple way to do that.
It may be possible to only check for tuning being necessary when in
serial mode.  It could be possible that we end up trying HTM too often
yet never go to serial mode; however, that seems unlikely to me (but I
haven't thought thoroughly about it).

Also, please remember that only data-race-free code is strictly correct
C++ code (the only exception being the validated-loads special case in
the STM implementations, for which C++ doesn't have support yet (but for
which proposals exist)).  Thus, use atomic operations accordingly.  That
might create another issue though in that you can't assume 64b atomic
ops to be supported on all archs.  Maybe using serial mode for tuning
doesn't trigger that issue though.


I'm wondering about whether it really makes sense to treat XABORT like
conflicts and other abort reasons, instead of like capacity aborts.
Perhaps we need to differentiate between the explicit abort codes glibc
uses, so that we can distinguish between cases where an abort is
supposed to signal incompatibility with txnal execution and cases where
it's just used for lock elision (see sysdeps/unix/sysv/linux/x86/hle.h
in current glibc):
#define _ABORT_LOCK_BUSY0xff
#define _ABORT_LOCK_IS_LOCKED   0xfe
#define _ABORT_NESTED_TRYLOCK   0xfd


Andi said that it would be nice to have an env var turning tuning
on/off.  I agree in principle, yet would prefer to have the tuning be
the default.  What about adding an htm_notune option in
parse_default_method?  It would be even nicer to 

RE: Refactor gimple_expr_type

2015-05-18 Thread Aditya K



 Date: Mon, 18 May 2015 12:08:58 +0200
 Subject: Re: Refactor gimple_expr_type
 From: richard.guent...@gmail.com
 To: hiradi...@msn.com
 CC: tbsau...@tbsaunde.org; gcc-patches@gcc.gnu.org

 On Sun, May 17, 2015 at 5:31 PM, Aditya K hiradi...@msn.com wrote:


 
 Date: Sat, 16 May 2015 11:53:57 -0400
 From: tbsau...@tbsaunde.org
 To: hiradi...@msn.com
 CC: gcc-patches@gcc.gnu.org
 Subject: Re: Refactor gimple_expr_type

 On Fri, May 15, 2015 at 07:13:35AM +, Aditya K wrote:
 Hi,
 I have tried to refactor gimple_expr_type to make it more readable. 
 Removed the switch block and redundant if.

 Please review this patch.

 for some reason your mail client seems to be inserting non breaking
 spaces all over the place. Please either configure it to not do that,
 or use git send-email for patches.

 Please see the updated patch.

 Ok if this passed bootstrap and regtest. (I wish if gimple_expr_type
 didn't exist btw...)

Thanks for the review. Do you have any suggestions on how to remove 
gimple_expr_type. Are there any alternatives to it?
I can look into refactoring more (if it is not too complicated) since I'm 
already doing this.

-Aditya


 Thanks,
 Richard.

 gcc/ChangeLog:

 2015-05-15 hiraditya hiradi...@msn.com

 * gimple.h (gimple_expr_type): Refactor to make it concise. Remove redundant 
 if.

 diff --git a/gcc/gimple.h b/gcc/gimple.h
 index 95e4fc8..3a83e8f 100644
 --- a/gcc/gimple.h
 +++ b/gcc/gimple.h
 @@ -5717,36 +5717,26 @@ static inline tree
 gimple_expr_type (const_gimple stmt)
 {
 enum gimple_code code = gimple_code (stmt);
 -
 - if (code == GIMPLE_ASSIGN || code == GIMPLE_CALL)
 + /* In general we want to pass out a type that can be substituted
 + for both the RHS and the LHS types if there is a possibly
 + useless conversion involved. That means returning the
 + original RHS type as far as we can reconstruct it. */
 + if (code == GIMPLE_CALL)
 {
 - tree type;
 - /* In general we want to pass out a type that can be substituted
 - for both the RHS and the LHS types if there is a possibly
 - useless conversion involved. That means returning the
 - original RHS type as far as we can reconstruct it. */
 - if (code == GIMPLE_CALL)
 - {
 - const gcall *call_stmt = as_a const gcall * (stmt);
 - if (gimple_call_internal_p (call_stmt)
 -  gimple_call_internal_fn (call_stmt) == IFN_MASK_STORE)
 - type = TREE_TYPE (gimple_call_arg (call_stmt, 3));
 - else
 - type = gimple_call_return_type (call_stmt);
 - }
 + const gcall *call_stmt = as_a const gcall * (stmt);
 + if (gimple_call_internal_p (call_stmt)
 +  gimple_call_internal_fn (call_stmt) == IFN_MASK_STORE)
 + return TREE_TYPE (gimple_call_arg (call_stmt, 3));
 + else
 + return gimple_call_return_type (call_stmt);
 + }
 + else if (code == GIMPLE_ASSIGN)
 + {
 + if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
 + return TREE_TYPE (gimple_assign_rhs1 (stmt));
 else
 - switch (gimple_assign_rhs_code (stmt))
 - {
 - case POINTER_PLUS_EXPR:
 - type = TREE_TYPE (gimple_assign_rhs1 (stmt));
 - break;
 -
 - default:
 - /* As fallback use the type of the LHS. */
 - type = TREE_TYPE (gimple_get_lhs (stmt));
 - break;
 - }
 - return type;
 + /* As fallback use the type of the LHS. */
 + return TREE_TYPE (gimple_get_lhs (stmt));
 }
 else if (code == GIMPLE_COND)
 return boolean_type_node;


 Thanks,
 -Aditya







 Thanks,
 -Aditya


 gcc/ChangeLog:

 2015-05-15 hiraditya hiradi...@msn.com

 * gimple.h (gimple_expr_type): Refactor to make it concise. Remove 
 redundant if.

 diff --git a/gcc/gimple.h b/gcc/gimple.h
 index 95e4fc8..168d3ba 100644
 --- a/gcc/gimple.h
 +++ b/gcc/gimple.h
 @@ -5717,35 +5717,28 @@ static inline tree
 gimple_expr_type (const_gimple stmt)
 {
 enum gimple_code code = gimple_code (stmt);
 -
 - if (code == GIMPLE_ASSIGN || code == GIMPLE_CALL)
 + tree type;
 + /* In general we want to pass out a type that can be substituted
 + for both the RHS and the LHS types if there is a possibly
 + useless conversion involved. That means returning the
 + original RHS type as far as we can reconstruct it. */
 + if (code == GIMPLE_CALL)
 {
 - tree type;
 - /* In general we want to pass out a type that can be substituted
 - for both the RHS and the LHS types if there is a possibly
 - useless conversion involved. That means returning the
 - original RHS type as far as we can reconstruct it. */
 - if (code == GIMPLE_CALL)
 - {
 - const gcall *call_stmt = as_a const gcall * (stmt);
 - if (gimple_call_internal_p (call_stmt)
 -  gimple_call_internal_fn (call_stmt) == IFN_MASK_STORE)
 - type = TREE_TYPE (gimple_call_arg (call_stmt, 3));
 - else
 - type = gimple_call_return_type (call_stmt);
 - }
 + const gcall *call_stmt = as_a const gcall * (stmt);
 + if (gimple_call_internal_p (call_stmt)
 +  gimple_call_internal_fn (call_stmt) == IFN_MASK_STORE)
 + type = TREE_TYPE (gimple_call_arg (call_stmt, 3));
 + else
 + type = gimple_call_return_type (call_stmt);
 + return 

Re: [8/9] Remove END_HARD_REGNO

2015-05-18 Thread Jeff Law

On 05/18/2015 12:25 PM, Richard Sandiford wrote:

This patch replaces calls to END_HARD_REGNO with calls to END_REGNO and
moves END_REGNO to rtl.h (since it's now just a REG access).


gcc/
* regs.h (END_HARD_REGNO): Delete.
(END_REGNO): Move to...
* rtl.h: ...here.
* bt-load.c (note_btr_set): Use END_REGNO instead of END_HARD_REGNO.
* caller-save.c (mark_set_regs): Likewise.
* combine.c (move_deaths, distribute_notes): Likewise.
* cse.c (invalidate, invalidate_for_call): Likewise.
* df-scan.c (df_ref_record): Likewise.
* postreload-gcse.c (reg_changed_after_insn_p): Likewise.
(record_last_reg_set_info): Likewise.
* reg-stack.c (convert_regs_exit): Likewise.
* reload.c (reg_overlap_mentioned_for_reload_p): Likewise.
* resource.c (update_live_status): Likewise.
* rtlanal.c (find_reg_fusage, find_regno_fusage): Likewise.

OK.
jeff



Re: [4/9] Add a dedicated rtx union member for REGs

2015-05-18 Thread Jeff Law

On 05/18/2015 12:19 PM, Richard Sandiford wrote:

This patch replaces the current REG i0 format with a dedicated structure,
so that we can make use of the extra 32 bits in the i field.

Of the places that iterate on formats and do something for 'i's,
most already handled REGs specially before the format walk and
so don't need to check for 'r'.  Otherwise it's mostly just a case
of adding dummy 'r' cases in order avoid the default gcc_unreachable ()
(in cases where we do the same for 'i').  The main exceptions are the
cselib.c and lra-constraints.c changes.

final.c:leaf_renumber_regs_insn handled REGs specially but then
went on to do a no-op walk of the format.  I just added an early
exit instead of an empty 'r' case.


gcc/
* rtl.def (REG): Change format to r.
* rtl.h (rtunion): Remove rt_reg.
(reg_info): New structure.
(rtx_def): Add reg field to main union.
(X0REGATTR): Delete.
(REG_CHECK): New macro.
(SET_REGNO_RAW, rhs_regno, REG_ATTRS): Use it.
* rtl.c (rtx_format): Document r.
(rtx_code_size): Handle REG specially.
* gengenrtl.c (special_format): Return true for formats
that include 'r'.
* gengtype.c (adjust_field_rtx_def): Handle 'r' fields.
Deal with REG_ATTRS after the field loop.
* emit-rtl.c (gen_raw_REG): Call rtx_alloc_stat directly.
* expmed.c (init_expmed): Call gen_raw_REG instead of
gen_rtx_raw_REG.
* expr.c (init_expr_target): Likewise.
* regcprop.c (maybe_mode_change): Likewise.
* varasm.c (make_decl_rtl): Likewise.
* final.c (leaf_renumber_regs_insn): Return early after
handling REGs.
* genemit.c (gen_exp): Handle 'r' fields.
* genpeep.c (match_rtx): Likewise.
* gensupport.c (subst_pattern_match): Likewise.
(get_alternatives_number, collect_insn_data, alter_predicate_for_insn)
(alter_constraints, subst_dup): Likewise.
* read-rtl.c (read_rtx_code): Likewise.
* print-rtl.c (print_rtx): Likewise.
* genrecog.c (find_operand, find_matching_operand): Likewise.
(validate_pattern, match_pattern_2): Likewise.
(parameter::UINT, rtx_test::REGNO_FIELD): New enum values.
(rtx_test::regno_field): New function.
(operator ==, safe_to_hoist_p, transition_parameter_type)
(parameter_type_string, print_parameter_value)
(print_nonbool_test, print_test): Handle new enum values.
* cselib.c (rtx_equal_for_cselib_1): Handle REG specially.
* lra-constraints.c (operands_match_p): Likewise.
Just to confirm, this doesn't change the size of a REG object, right? 
If it doesn't change the size, then it's OK.


If we make REG objects larger, we potentially increase the amount of 
memory used by GCC.


If we make REG objects smaller, then we run the risk of memory 
overwrites when reload changes REG objects into MEM objects.


The former is bad, the latter is a disaster waiting to happen.

Jeff






Re: [5/9] Create sensible dummy registers

2015-05-18 Thread Jeff Law

On 05/18/2015 12:21 PM, Richard Sandiford wrote:

Some pieces of code create a temporary REG or MEM and only fill it
in later when they're testing the cost of a particular rtx.  This patch
makes sure that even the dummy REG or MEM is valid, rather than force
the gen_* code to handle garbage values.


gcc/
* caller-save.c (init_caller_save): Use word_mode and
FIRST_PSEUDO_REGISTER when creating temporary rtxes.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload.c (reload_cse_regs_1): Likewise.

OK.
jeff



Re: [7/9] Record the number of registers in a REG

2015-05-18 Thread Jeff Law

On 05/18/2015 12:24 PM, Richard Sandiford wrote:

This is the main patch, to record REG_NREGS in the REG itself.
The END_REGNO/END_HARD_REGNO distinction goes away in the next patch.


gcc/
* rtl.h (reg_info): Add an nregs field.
(REG_NREGS): Use it.
(SET_REGNO_RAW): Delete.
(set_regno_raw): New function.
* regs.h (END_HARD_REGNO): Make equivalent to END_REGNO.
(END_REGNO): Redefine in terms of REG_NREGS.
* read-rtl.c (read_rtx_code): Call set_regno_raw instead of
SET_REGNO_RAW.
* emit-rtl.c (set_mode_and_regno): Likewise.
* df-scan.c (df_ref_change_reg_with_loc): Use set_mode_and_regno
instead of SET_REGNO_RAW.

OK.
jeff



Re: [6/9] Pass REG changes through a new function

2015-05-18 Thread Jeff Law

On 05/18/2015 12:23 PM, Richard Sandiford wrote:

This patch adds a new function to set both the mode and regno
of a REG.  It makes sure that all REG PUT_MODE and SET_REGNO changes
go through this function.  There's a new PUT_MODE_RAW (analogous
to SET_REGNO_RAW) for the cases where the caller doesn't want that.

There's a small consistency fix: gen_rtx_REG was declared with
unsigned and defined with unsigned int.  The latter is usual
GCC style.


gcc/
* rtl.h (PUT_MODE_RAW): New macro.
(PUT_REG_NOTE_KIND): Use it.
(set_mode_and_regno): Declare.
(gen_raw_REG): Change regno to unsigned int.
(gen_rtx_REG): Change unsigned to unsigned int.
(PUT_MODE): Forward to PUT_MODE_RAW for generators, otherwise
use set_mode_and_regno to change the mode of registers.
* gengenrtl.c (gendef): Use PUT_MODE_RAW.
* emit-rtl.c (set_mode_and_regno): New function.
(gen_raw_REG): Change regno to unsigned int.  Use set_mode_and_regno.
* caller-save.c (reg_save_code): Use set_mode_and_regno.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload.c (reload_cse_simplify_operands): Likewise.

OK.
jeff



Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Andi Kleen
 Are there better options for the utility function, or can we tune it to

There is nothing better that isn't a lot slower.

-Andi


Re: [PATCH] Refactoring: use std::swap instead of manual swaps

2015-05-18 Thread Jeff Law

On 05/18/2015 12:07 PM, Mikhail Maltsev wrote:

Hi all.
The attached patch replaces the following typical sequence of
assignments, matching the pattern:

[type] temp_var = value1;
value1 = value2;
value2 = temp_var;

with

std::swap (value1, value2);

The goal is to improve code readability (at least a little).

Pattern was searched in GCC tree using a script, but all modifications
have been done manually to make sure that removing assignment to
temp_var does not modify the behavior of the code. Where possible,
temporary variables were removed or moved to blocks with narrower scope.
I also removed a couple of pre-C++ implementations of swap function and
one implementation of std::reverse.

Bootstrapped/regtested on x86_64 linux (multilib). Test for broken
builds currently running. I will also run test on a couple of
simulators (though currently there is a problem with build:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66181).

OK for trunk after complete test?


Did you mean to lose the assignment to heapa-m_nodes in fibonacci_heap.h?

I just spot-checked and it looks fine to me, except for perhaps losing 
the assignment noted above.


jeff


Re: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-05-18 Thread Magnus Granberg
fredag 08 maj 2015 10.35.44 skrev  H.J. Lu:
 On Thu, May 7, 2015 at 2:17 PM, Joseph Myers jos...@codesourcery.com 
wrote:
  On Fri, 6 Mar 2015, H.J. Lu wrote:
  +# We don't want to compile the compiler with -fPIE, it make PCH fail.
  +COMPILER += @NO_PIE_CFLAGS@
  +
  +# Link with -no-pie since we compile the compiler with -fno-PIE.
  +LINKER += @NO_PIE_FLAG@
  
  As I understand it, what we don't want is the compiler to be a PIE.  That
  is, it must be linked -no-pie (and given that the compiler is not a PIE,
  compiling -fPIE would be pointless, although it wouldn't actually break
  things to have PIE objects in the compiler as long as it's linked for a
  fixed address).
  
  +#if defined ENABLE_DEFAULT_PIE
  +#define GNU_USER_TARGET_STARTFILE_SPEC \
  +  %{!shared: %{pg|p|profile:gcrt1.o%s;: \
  +%{ PIE_SPEC :Scrt1.o%s} %{ NO_PIE_SPEC :crt1.o%s}}} \
  +   crti.o%s %{static:crtbeginT.o%s;: %{shared:crtbeginS.o%s} \
  +   %{ PIE_SPEC :crtbeginS.o%s} \
  +   %{ NO_PIE_SPEC :crtbegin.o%s}} \
  +   FVTABLE_VERIFY_SPEC
  +#else
  +#define GNU_USER_TARGET_STARTFILE_SPEC \
  +  %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \
  +   crti.o%s
  %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s} \ +  
  FVTABLE_VERIFY_SPEC
  +#endif
  
  With appropriate definitions of PIE_SPEC and NO_PIE_SPEC, shouldn't a
  single definition of GNU_USER_TARGET_STARTFILE_SPEC be able to work for
  both ENABLE_DEFAULT_PIE and !ENABLE_DEFAULT_PIE?
 
 Yes.
 
  https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00393.html noted a
  possible issue with MIPS.  Actually, rather more config/*.h and
  config/*/*.h headers contain specs testing for (-fpie, -fPIE, -fno-pie,
  -fno-PIE, -pie) options, which would be affected by these changes.  I'd
  say this patch should include an initial attempt at adjusting those config
  headers, which should be an essentially mechanical change not requiring
  understanding anything target-specific.  For link-time specs, that may
  mean using PIE_SPEC and NO_PIE_SPEC.  For compile-time specs, similar new
  macros would be added.  Given such adjustments included in the patch and
  the relevant target maintainers CC:ed, I might then be inclined to approve
  the patch on the basis of allowing a week for target maintainers to test
  the changes for their targets before commit, as I don't see any major
  problems with it beyond the need to update the target-specific specs.
 
 Here is the updated patch.  I will post patches for cris, mips, powerpc
 and sparc separately.  The target maintainers should be able to adjust
 backend ASM_SPEC with FPIE_OR_FPIC_SPEC and
 NO_FPIE_AND_FPIC_SPEC.
 
 OK for trunk?
 
 Thanks.
PIng

Any progress on this?

/Magnus G.



Re: [PATCH/libiberty] fix build of gdb/binutils with clang.

2015-05-18 Thread Ian Lance Taylor
On Mon, May 18, 2015 at 4:26 PM, Yunlian Jiang yunl...@google.com wrote:

 Yes, the problem is  libiberty is compiling some files
 with _GNU_SOURCE defined and some not. So the configure
 file does not include #define _GNU_SOURCE.

As far as I can see it should be fine to define _GNU_SOURCE when
compiling all libiberty files.

Ian


Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Nuno Diegues
On Mon, May 18, 2015 at 5:29 PM, Torvald Riegel trie...@redhat.com wrote:

 First of all, sorry for taking so long to review this.  Thank you for
 the contribution.


Hello Torvald,

thanks for taking the time to look into this!


 My major concern is about rdtsc being used.  The relation to frequency
 adaption that Andi mentioned is one issue, but the bigger issue to me is
 that the runtime of transactions might vary a lot, and that the relation
 of txnal vs. nontxnal code executed in a period might vary a lot too.


Once again, I believe that frequency scaling should not be a concern: recent
CPUs use a constant rate TSC that matches that of the maximum frequency
of the CPU.



 Are there better options for the utility function, or can we tune it to
 be less affected by varying txn length and likelihood of txnal vs.
 nontxnal code?  What are the things we want to avoid with the tuning?  I
 can think of:
 * Not needing to wait for serial txns, or being aborted by a serial txn.
 * Not retrying using HTM too much so that the retry time is larger than
 the scalability we actually gain by having other txns commit
 concurrently.


Yes, those are the key points we want to make sure that do not happen.



 Anything else?  Did you consider other utility functions during your
 research?


The txnal vs nontxnal is indeed a completely different story. To account for
this we would need extra book-keeping to count only cycles spent inside
txnal code. So this would require every thread (or a sample of threads) to
perform a rdtsc (or equivalent) on every begin/end call rather than the
current approach of a single rdtsc per optimization round.

With this type of online optimization we found that the algorithm had to be
very simple and cheap to execute. RDTSC was a good finding to fit this, and
it enabled us to obtain gains. Other time sources failed to do so.

I do not have, out of the box, a good alternative to offer. I suppose it would
take some iterations of thinking/experimenting with, just like with any research
problem :)


 Also, note that the mitigation strategy for rdtsc
 short-comings that you mention in the paper is not applicable in
 general, specifically not in libitm.


I suppose you mean the preemption of threads inflating the cycles measured?
This would be similarly a problem to any time source that tries to measure the
amount of work performed; not sure how we can avoid it in general. Any thoughts?


 Another issue is that we need to implement the tuning in a portable way.
 You currently make it depend on whether the assembler knows about RTM,
 whereas the rest of the code makes this more portable and defers to
 arch-specific files.  I'd prefer if we could use the tuning on other
 archs too.  But for that, we need to cleanly separate generic from
 arch-specific parts.  That affects magic numbers as well as things like
 rdtsc().


Yes, I refrained from adding new calls to the arch-specific files, to
contain the
changes mainly. But that is possible and that's part of the feedback I
was hoping
to get.



 Generally, the patch needs to include more comments.  Most importantly,
 we need to document why we implemented things the way we did, or do
 tuning the way we do.  In cases where the intent isn't trivial to see
 nor the code is trivial, explain the intent or reference the place where
 you document the big picture.
 A good rule of thumb is adding comments whenever it's not simple to see
 why we use one of several possible implementation alternatives.

 The magic numbers being used are a good example.  Make them constants
 and don't just put them directly in the code, and then add documentation
 why you chose this number and why it's okay to make that choice.  If it
 isn't necessarily okay (eg, other archs, future systems), but you don't
 have a better solution right now, add something like ??? Not ideal
 because of XYZ..  If there's a source for some of the magic numbers,
 cite it (e.g., [28] in your paper might be one for the tuning
 thresholds, I guess).


Ack, makes perfect sense.



 Reoptimizing only in a specific, fixed thread is insufficient in the
 general case.  There may be a thread that only runs an initial txn and
 then never another one; if this happens to be the last thread
 registered, you'll never get any tuning.  If we want the tuning to be
 effective, one of the threads *actively* running txns needs to tune
 eventually, always.

 Depending on that, you'll probably have to change the sync between the
 tuning thread and others.  Serial mode may be a simple way to do that.
 It may be possible to only check for tuning being necessary when in
 serial mode.  It could be possible that we end up trying HTM too often
 yet never go to serial mode; however, that seems unlikely to me (but I
 haven't thought thoroughly about it).

 Also, please remember that only data-race-free code is strictly correct
 C++ code (the only exception being the validated-loads special case in
 the STM implementations, 

Re: [PATCH] Refactoring: use std::swap instead of manual swaps

2015-05-18 Thread Mikhail Maltsev
On 19.05.2015 0:39, Jeff Law wrote:
 Did you mean to lose the assignment to heapa-m_nodes in fibonacci_heap.h?
 
 I just spot-checked and it looks fine to me, except for perhaps losing
 the assignment noted above.
 
 jeff
Oops. Fixed that and committed to trunk (just in case, I re-ran tests on
x86_64 linux + checked that build does not break for various targets.
Simulators that I tried to use are still broken, unfortunately).

-- 
Regards,
Mikhail Maltsev
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c00a878..44017c1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,52 @@
+2015-05-19  Mikhail Maltsev  malts...@gmail.com
+
+   * bb-reorder.c (fix_up_fall_thru_edges): Use std::swap instead of
+   explicit swaps.
+   * dojump.c (do_compare_rtx_and_jump): Likewise.
+   * expmed.c (emit_store_flag_1): Likewise.
+   * fibonacci_heap.h (fibonacci_heap::union_with): Likewise.
+   * final.c (sprint_ul): Use std::reverse for reversing a string.
+   * fold-const.c (extract_muldiv_1): Use std::swap.
+   * genmodes.c (emit_mode_int_n): Likewise.
+   * ifcvt.c (dead_or_predicable): Likewise.
+   * ira-build.c (ira_merge_live_ranges): Likewise.
+   (swap_allocno_copy_ends_if_necessary): Likewise.
+   * ira.c (ira_setup_alts): Likewise.
+   * loop-iv.c (iv_analyze_expr): Likewise.
+   (implies_p): Likewise.
+   (canon_condition): Likewise.
+   * lra-constraints.c (swap_operands): Likewise.
+   * lra-lives.c (lra_merge_live_ranges): Likewise.
+   * omega.c (swap): Remove.
+   (bswap): Remove.
+   (omega_unprotect_1): Use std::swap.
+   (omega_solve_geq): Likewise.
+   * optabs.c (expand_binop_directly): Likewise.
+   (expand_binop): Likewise.
+   (emit_conditional_move): Likewise.
+   (emit_conditional_add): Likewise.
+   * postreload.c (reload_cse_simplify_operands): Likewise.
+   * reg-stack.c (emit_swap_insn): Likewise.
+   (swap_to_top): Likewise.
+   (compare_for_stack_reg): Likewise.
+   (subst_asm_stack_regs): Likewise.
+   * reload.c (find_reloads): Likewise.
+   * reload1.c (gen_reload_chain_without_interm_reg_p): Likewise.
+   * sel-sched.c (invoke_reorder_hooks): Likewise.
+   (create_block_for_bookkeeping): Likewise.
+   * tree-data-ref.c (lambda_matrix_row_exchange): Remove.
+   (lambda_matrix_right_hermite): Use std::swap.
+   * tree-ssa-coalesce.c (sort_coalesce_list): Likewise.
+   * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
+   * tree-ssa-loop-ivopts.c (iv_ca_delta_reverse): Likewise.
+   * tree-ssa-math-opts.c (is_widening_mult_p): Likewise.
+   * tree-ssa-phiopt.c (hoist_adjacent_loads): Likewise.
+   * tree-ssa-reassoc.c (linearize_expr_tree): Likewise.
+   * tree-ssa-threadedge.c (simplify_control_stmt_condition): Likewise.
+   * tree-vrp.c (compare_ranges): Likewise.
+   * var-tracking.c (add_with_sets): Likewise.
+   (vt_find_locations): Likewise.
+
 2015-05-18  Andreas Tobler  andre...@gcc.gnu.org
 
* config/freebsd-spec.h (FBSD_STARTFILE_SPEC): Add the bits to build
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index c134712..a97c2cd 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -1812,7 +1812,6 @@ fix_up_fall_thru_edges (void)
   edge succ2;
   edge fall_thru;
   edge cond_jump = NULL;
-  edge e;
   bool cond_jump_crosses;
   int invert_worked;
   rtx_insn *old_jump;
@@ -1901,9 +1900,7 @@ fix_up_fall_thru_edges (void)
  fall_thru-flags = ~EDGE_FALLTHRU;
  cond_jump-flags |= EDGE_FALLTHRU;
  update_br_prob_note (cur_bb);
- e = fall_thru;
- fall_thru = cond_jump;
- cond_jump = e;
+ std::swap (fall_thru, cond_jump);
  cond_jump-flags |= EDGE_CROSSING;
  fall_thru-flags = ~EDGE_CROSSING;
}
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index a6abd60..bf403ce 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,7 @@
+2015-05-19  Mikhail Maltsev  malts...@gmail.com
+
+   * c-common.c (shorten_compare): Use std::swap instead of explicit swaps.
+
 2015-05-18  Tom de Vries  t...@codesourcery.com
 
* c-common.c (build_va_arg_1): New function.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 6d85049..3998b23 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4372,20 +4372,12 @@ shorten_compare (location_t loc, tree *op0_ptr, tree 
*op1_ptr,
!integer_zerop (primop1)  !real_zerop (primop1)
!fixed_zerop (primop1))
 {
-  tree tem = primop0;
-  int temi = unsignedp0;
-  primop0 = primop1;
-  primop1 = tem;
-  tem = op0;
-  op0 = op1;
-  op1 = tem;
+  std::swap (primop0, primop1);
+  std::swap (op0, op1);
  

Fwd: Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-05-18 Thread Mikhail Maltsev
It seems rather weird to me: I have definitely sent this message, but
now I realized that it did not get to the mailing list archive (and it
was not bounced either). Thus, resending.

On 12.05.2015 23:10, Jeff Law wrote:
 There's a makefile fragment in contrib which will build a large number
 of targets that you might find helpful.  Of course without some baseline
 to compare against, it's of less value.

Thanks for a hint. I tested the build using this set of targets.
Currently there are some targets that fail to build from trunk (due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65835 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55035). Other targets are
OK, and the patch does not introduce any failures.

The attached archive contains the latest version of the patch (no
significant differences since previous one; I just combined everything
in one file and rebased) and updated changelog. Bootstrapped + regtested
on x86_64-unknown-linux-gnu{,-m32} and built config-list.mk. OK for trunk?

-- 
Regards,
Mikhail Maltsev





as_insn6.tar.gz
Description: GNU Zip compressed data


[match-and-simplify] report error for invalid operator-lists

2015-05-18 Thread Prathamesh Kulkarni
Hi,
genmatch segfaults on:
(define_operator_list op (plus))

The above syntax is invalid, and it segfaults on:
fatal_error (token, operator list is empty);
because token is NULL.

The patch puts a check for CPP_CLOSE_PAREN after parsing id-list.
OK for trunk after bootstrap+testing completes ?

Thanks,
Prathamesh
Index: genmatch.c
===
--- genmatch.c  (revision 223294)
+++ genmatch.c  (working copy)
@@ -3427,6 +3427,11 @@
op-substitutes.safe_push (idb);
 }
 
+  // Check that there is no junk after id-list
+  token = peek();
+  if (token-type != CPP_CLOSE_PAREN)
+fatal_at (token, expected identifier got %s, cpp_type2name (token-type, 
0));
+
   if (op-substitutes.length () == 0)
 fatal_at (token, operator-list cannot be empty);
 
2015-05-19  Prathamesh Kulkarni  prathamesh.kulka...@linaro.org

* genmatch.c (parser::parse_operator_list): Check for CPP_CLOSE_PAREN 
after end of id-list.


Re: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-05-18 Thread H.J. Lu
On Mon, May 18, 2015 at 6:11 PM, Joseph Myers jos...@codesourcery.com wrote:
 On Tue, 19 May 2015, Magnus Granberg wrote:

   https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00393.html noted a
   possible issue with MIPS.  Actually, rather more config/*.h and
   config/*/*.h headers contain specs testing for (-fpie, -fPIE, -fno-pie,
   -fno-PIE, -pie) options, which would be affected by these changes.  I'd
   say this patch should include an initial attempt at adjusting those 
   config
   headers, which should be an essentially mechanical change not requiring
   understanding anything target-specific.  For link-time specs, that may
   mean using PIE_SPEC and NO_PIE_SPEC.  For compile-time specs, similar new
   macros would be added.  Given such adjustments included in the patch and
   the relevant target maintainers CC:ed, I might then be inclined to 
   approve
   the patch on the basis of allowing a week for target maintainers to test
   the changes for their targets before commit, as I don't see any major
   problems with it beyond the need to update the target-specific specs.
 
  Here is the updated patch.  I will post patches for cris, mips, powerpc
  and sparc separately.  The target maintainers should be able to adjust
  backend ASM_SPEC with FPIE_OR_FPIC_SPEC and
  NO_FPIE_AND_FPIC_SPEC.
 
  OK for trunk?
 
  Thanks.
 PIng

 Any progress on this?

 Have updates for all affected specs for all targets been posted?  I just
 saw a small and apparently arbitrary subset of targets with patches, and
 no explanation of how those targets were identified or why the other
 targets with specs mentioning the options in question did not need
 updates.


I only posted patches for an  arbitrary subset of targets because

1. Not everyone is interested in --enable-default-pie.
2. I can't tests all targets myself.

If patches for all targets is the only blocker before the patch
will be approved or target maintainers will help me test the patch,
I will post patches for each target affected.


-- 
H.J.


Re: [PATCH/libiberty] fix build of gdb/binutils with clang.

2015-05-18 Thread Yunlian Jiang
Yes, the problem is  libiberty is compiling some files
with _GNU_SOURCE defined and some not. So the configure
file does not include #define _GNU_SOURCE.

I have reduced the failing compiling as below.

#define _GNU_SOURCE  (1)
#define ATTRIBUTE_NONNULL(m) __attribute__ ((__nonnull__ (m)))
#define ATTRIBUTE_PRINTF(m, n) __attribute__ ((__format__ (__printf__,
m, n))) ATTRIBUTE_NONNULL(m)
#define ATTRIBUTE_PRINTF_2 ATTRIBUTE_PRINTF(2, 3)
#include stdio.h
//#define _GNU_SOURCE //   (2)
extern int asprintf (char **, const char *, ...) ATTRIBUTE_PRINTF_2;

it fails with command line
/usr/bin/clang -fPIE -D_FORTIFY_SOURCE=2 -O2 -c f.c -o f.o

If the _GNU_SOURCE is placed at (1), then the asprintf in the test
case is a macro and it is expanded to

extern int __asprintf_chk (char **, 2 - 1, const char *, ...)
__attribute__ ((__format__ (__printf__, 2, 3))) __attribute__
((__nonnull__ (2)));

in the pre-processed file and causes compilation error.

If the  _GNU_SOURCE is placed at (2) or after stdio.h, then asprintf
in the test case
is still a function in the pre-processed file and it compiles.
extern int asprintf (char **, const char *, ...) __attribute__
((__format__ (__printf__, 2, 3))) __attribute__ ((__nonnull__ (2)));


I then looked at the /usr/include/bits/stdio2.h,  it has
# ifdef __USE_GNU
...
#  ifdef __va_arg_pack
__fortify_function int
__NTH (asprintf (char **__restrict __ptr, const char *__restrict __fmt, ...))
{
  return __asprintf_chk (__ptr, __USE_FORTIFY_LEVEL - 1, __fmt,
 __va_arg_pack ());
}

__fortify_function int
__NTH (__asprintf (char **__restrict __ptr, const char *__restrict __fmt,
   ...))
{
  return __asprintf_chk (__ptr, __USE_FORTIFY_LEVEL - 1, __fmt,
 __va_arg_pack ());
}

__fortify_function int
__NTH (obstack_printf (struct obstack *__restrict __obstack,
   const char *__restrict __fmt, ...))
{
  return __obstack_printf_chk (__obstack, __USE_FORTIFY_LEVEL - 1, __fmt,
   __va_arg_pack ());
}
#  elif !defined __cplusplus
#   define asprintf(ptr, ...) \
  __asprintf_chk (ptr, __USE_FORTIFY_LEVEL - 1, __VA_ARGS__)
#   define __asprintf(ptr, ...) \
  __asprintf_chk (ptr, __USE_FORTIFY_LEVEL - 1, __VA_ARGS__)
#   define obstack_printf(obstack, ...) \
  __obstack_printf_chk (obstack, __USE_FORTIFY_LEVEL - 1, __VA_ARGS__)
#  endif
# endif

It requires gcc to be at least 4.3 to have __va_arg_pack defined.
Clang only has gcc 4.2. So clang chooses
the asprintf macro and causes the problem.

On Mon, May 4, 2015 at 5:30 PM, Ian Lance Taylor i...@google.com wrote:
 On Mon, May 4, 2015 at 3:49 PM, Yunlian Jiang yunl...@google.com wrote:
 There was a similar disscussion here
 https://gcc.gnu.org/ml/gcc/2005-11/msg01190.html

 That was a discussion about libiberty.  Your subject says you have
 trouble building gdb.

 Can you describe the exact problem that you are having?  What
 precisely are you doing?  What precisely happens?


 The problem is in the configure stage, the __GNU_SOURCE is not
 defined, and it could not find
 the declaration of asprintf. so it make a declaration of asprintf in
 libiberty.h. And  for the file floatformat.c,
 the  __GNU_SOURCE is defined, so it could find another asprintf in
 /usr/include/bits/stdio2.h, it also includes
 libiberty.h. So these two asprintf conflicts when __USE_FORTIFY_LEVEL is set.

 I think the basic guideline should be that HAVE_DECL_ASPRINTF should
 be correct.  If libiberty compiled with _GNU_SOURCE defined, then it
 should test HAVE_DECL_ASPRINTF with _GNU_SOURCE defined.  If not, then
 not.  So perhaps the problem is that libiberty is compiling some files
 with _GNU_SOURCE defined and some not.

 Ian


Re: [match-and-simplify] fix incorrect code-gen in 'for' pattern

2015-05-18 Thread Prathamesh Kulkarni
On 18 May 2015 at 20:17, Prathamesh Kulkarni
prathamesh.kulka...@linaro.org wrote:
 On 18 May 2015 at 14:12, Richard Biener rguent...@suse.de wrote:
 On Sat, 16 May 2015, Prathamesh Kulkarni wrote:

 Hi,
 genmatch generates incorrect code for following (artificial) pattern:

 (for op (plus)
   op2 (op)
   (simplify
 (op @x @y)
 (op2 @x @y)

 generated gimple code: http://pastebin.com/h1uau9qB
 'op' is not replaced in the generated code on line 33:
 *res_code = op;

 I think it would be a better idea to make op2 iterate over same set
 of operators (op2-substitutes = op-substitutes).
 I have attached patch for the same.
 Bootstrap + testing in progress on x86_64-unknown-linux-gnu.
 OK for trunk after bootstrap+testing completes ?

 Hmm, but then the example could as well just use 'op'.  I think we
 should instead reject this.

 Consider

   (for op (plus minus)
 (for op2 (op)
   (simplify ...

 where it is not clear what would be desired.  Simple replacement
 of 'op's value would again just mean you could have used 'op' in
 the first place.  Doing what you propose would get you

   (for op (plus minus)
 (for op2 (plus minus)
   (simplify ...

 thus a different iteration.

 I wonder if we really need is_oper_list flag in user_id ?
 We can determine if user_id is an operator list
 if user_id::substitutes is not empty ?

 After your change yes.

 That will lose the ability to distinguish between user-defined operator
 list and list-iterator in for like op/op2, but I suppose we (so far) don't
 need to distinguish between them ?

 Well, your change would simply make each list-iterator a (temporary)
 user-defined operator list as well as the current iterator element
 (dependent on context - see the nested for example).  I think that
 adds to confusion.
AFAIU, the way it's implemented in lower_for, the iterator is handled
the same as a user-defined operator
list. I was wondering if we should get rid of 'for' altogether and
have it replaced
by operator-list ?

IMHO having two different things - iterator and operator-list is
unnecessary and we could
brand iterator as a local operator-list. We could extend syntax of 'simplify'
to accommodate local operator-lists.

So we can say, using an operator-list within 'match' replaces it by
corresponding operators in that list.
Operator-lists can be global (visible to all patterns), or local to
a particular pattern.

eg:
a) single for
(for op (...)
  (simplify
(op ...)))

can be written as:
(simplify
  op (...)  // define local operator-list op.
  (op ...)) // proceed here the same way as for lowering global operator list.

b) multiple iterators:
(for op1 (...)
  op2 (...)
  (simplify
(op1 (op2 ...

can be written as:
(simplify
  op1 (...)
  op2 (...)
  (op1 (op2 ...)))

c) nested for
(for op1 (...)
(for op2 (...)
  (simplify
(op1 (op2 ...

can be written as:

(simplify
  op1 (...)
  (simplify
op2 (...)
(op1 (op2 ...

My rationale behind removing 'for' is we don't need to distinguish
between an operator-list and iterator,
and only have an operator-list -;)
Also we can reuse parser::parse_operator_list (in parser::parse_for
parsing oper-list is duplicated)
and get rid of 'parser::parse_for'.
We don't need to change lowering, since operator-lists are handled
the same way as 'for' (we can keep lowering of simplify::for_vec as it is).

Does it sound reasonable ?

Thanks,
Prathamesh

 So - can you instead reject this use?
 Well my intention was to have support for walking operator list in reverse.
 For eg:
 (for bitop (bit_and bit_ior)
   rbitop (bit_ior bit_and)
...)
 Could be replaced by sth like:
 (for bitop (bit_and bit_ior)
   rbitop (~bitop))
...)

 where ~bitop would indicate walking (bit_and bit_ior) in reverse.
 Would that be a good idea ? For symmetry, I thought
 (for op (list)
   op2 (op))
 should be supported too.


 Thanks,
 Prathamesh



 Thanks,
 Richard.


Re: [PATCH, RS6000] gpc_reg_operand

2015-05-18 Thread David Edelsohn
On Mon, May 18, 2015 at 9:09 PM, Alan Modra amo...@gmail.com wrote:
 gpc_reg_operand contains a test that dates back to the early 1990s,
 when FIRST_PSEUDO_REGISTER was 76.  At that point, testing
 REGNO (op) = ARG_POINTER_REGNUM allowed ap, xer/ca, cr0..cr7 and
 pseudos.  A later patch corrected this to remove xer.  Nowadays we
 have a lot more hard registers, altivec, vsx, htm, spe hi.  Some of
 these are definitely special registers so not appropriate for
 gpc_reg_operand to match.  This patch corrects the situation, to
 just allow pseudos, int, fp, altivec and vsx.  ap and sfp are allowed
 via INT_REGNO_P.  Note that I've removed cr0..cr7.  I believe this is
 OK along with the extzvdi_internal2 change.  I checked all uses of
 gpc_reg_operand and predicates derived from gpc_reg_operand.

 The patch also removes some operand predicates I discovered were
 unused, and tightens the vr save/restore patterns I added some time
 ago.  Bootstrapped and regression tested powerpc64le-linux and
 powerpc64-linux, the latter with -mcpu=power7.  OK for mainline?

 * config/rs6000/predicates.md (gpc_reg_operand): Don't allow all
 hard registers numbered greater or equal to ARG_POINTER_REGNUM.
 (reg_or_neg_short_operand, fix_trunc_dest_operand): Delete
 unused predicates.
 * config/rs6000/altivec.md (save_vregs_*, restore_vregs_*):
 Use altivec_register_operand.  Make insn predicate TARGET_ALTIVEC.
 * config/rs6000/rs6000.md (extzvdi_internal2): Use cc_reg_operand.
 * config/rs6000/vsx.md (vsx_floatVSimode2): Expand comment.

Okay.

Thanks, David


Re: [PATCH] FreeBSD add functionality to build PIE executables.

2015-05-18 Thread Andreas Tobler

On 18.05.15 19:36, Jeff Law wrote:

On 05/17/2015 02:42 PM, Andreas Tobler wrote:

Ping?!

Thanks,
Andreas

On 11.05.15 22:34, Andreas Tobler wrote:

All,

this patch adds the ability to build PIE executables for FreeBSD. The
core is since a longer time in the code base of FreeBSD itself and is
working fine.

This patch makes it available for all FreeBSD targets.
Tested on x86_64-*-freebsd11.0 and armv6/hf-*-freebsd11.0,
i386-*-freebsd11.0 is progress.

In the same turn I removed the STARTFILE/ENDFILE_SPEC from
config/i386/freebsd.h and use the ones from config/freebsd-spec.h.

Here the results before the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01267.html

and with the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01324.html

Is this ok for trunk and for 5.1X?

Thanks,
Andreas

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

 * config/freebsd-spec.h (FBSD_STARTFILE_SPEC): Add the bits to build
 pie executables.
 (FBSD_ENDFILE_SPEC): Likewise.
 * config/i386/freebsd.h (STARTFILE_SPEC): Remove and use the one from
 config/freebsd-spec.h.
 (ENDFILE_SPEC): Likewise.

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

 * lib/target-supports.exp (check_effective_target_pie): Add
*-*-freebsd*
 to the familiy of pie capable targets.

OK for the trunk, please install.  Release managers have final say for
the release branches.



Done.

Thanks for the review.
Andreas



Re: [9/9] Simplify register bitmap operations

2015-05-18 Thread Jeff Law

On 05/18/2015 12:27 PM, Richard Sandiford wrote:

After the previous changes, several callers are left with code like:

   if (HARD_REGISTER_P (x))
 bitmap_foo_range (x, REGNO (x), REG_NREGS (x));
   else
 bitmap_foo (x, REGNO (x));

These might as well now be:

   if (REG_NREGS (x)  1)
 bitmap_foo_range (x, REGNO (x), REG_NREGS (x));
   else
 bitmap_foo (x, REGNO (x));

since REG_NREGS is as cheap to test, and since single-register hard REGs
are the common case.  But if separating out the cases is a win -- and it
seems to be, very slightly -- then it would be better to add the shortcut
to the range functions themselves.


gcc/
* bitmap.c (bitmap_set_range): Handle count==1 specially.
(bitmap_clear_range): Likewise.
* cfgcleanup.c (mark_effect): Use bitmap_clear_range and
bitmap_set_range unconditionally.
* df-problems.c (df_simulate_one_insn_forwards): Likewise.
* df-scan.c (df_mark_reg): Likewise.
* haifa-sched.c (setup_ref_regs): Likewise.
* sched-rgn.c (update_live_1): Likewise.

OK.
jeff



Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Torvald Riegel
On Mon, 2015-05-18 at 23:39 +0200, Andi Kleen wrote:
  Are there better options for the utility function, or can we tune it to
 
 There is nothing better that isn't a lot slower.

Do you care to elaborate why?  As-is, I find this statement to not be
convincing; at the very least we need to document why we think that
something time-based is the best option.

Other tuning attempts have looked at rates of aborted, attempted, and
committed txns, for example.  Why do we measure nb. of transactions in a
whole period, and can't get the same info through measuring smaller but
more specific time intervals (e.g., how long we wait for a serial txn)?




Re: [PATCH] fortran/66052 -- Prevent dereference of NULL pointer

2015-05-18 Thread Steve Kargl
On Sat, May 16, 2015 at 01:58:30PM +0200, Mikael Morin wrote:
  Index: gcc/fortran/decl.c
  ===
  --- gcc/fortran/decl.c  (revision 223094)
  +++ gcc/fortran/decl.c  (working copy)
  @@ -6968,7 +6968,8 @@ gfc_match_protected (void)
 gfc_symbol *sym;
 match m;
   
  -  if (gfc_current_ns-proc_name-attr.flavor != FL_MODULE)
  +  if (gfc_current_ns-proc_name
  +   gfc_current_ns-proc_name-attr.flavor != FL_MODULE)
   {
  gfc_error (PROTECTED at %C only allowed in specification 
part of a module);
 
 Wouldn't one get a slightly better error message if using
   !gfc_current_ns-proc_name
   || gfc_current_ns-proc_name-attr.flavor != FL_MODULE
 as condition ?
 
 OK with that change.  Thanks.
 

Committed ar r223324 with your suggested change.  Indeed, 
a better error messaage emitted.

-- 
Steve


Re: [PATCH] fortran/66057 -- detect malformed GENERIC statement

2015-05-18 Thread Steve Kargl
On Fri, May 15, 2015 at 10:37:24PM +0200, FX wrote:
 Does it also catch the other cases shown in the PR? Things like:
 
   generic :: 2
   generic :: 2 =
   generic :: 1 = x
   generic :: ?
 

Yes, the patch appears to catch those as well.

-- 
Steve


[Committed] check_GNU_style.sh: Improve readability function calls

2015-05-18 Thread Tom de Vries

Hi,

committed as obvious.

Thanks,
- Tom
check_GNU_style.sh: Improve readability function calls

2015-05-18  Tom de Vries  t...@codesourcery.com

	* check_GNU_style.sh: Improve readability function calls.
---
 contrib/check_GNU_style.sh | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index 90c612f..b86c474 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -170,11 +170,12 @@ g 'Sentences should end with a dot.  Dot, space, space, end of the comment.' \
 '[[:alnum:]][[:blank:]]*\*/'
 
 vg 'There should be exactly one space between function name and parentheses.' \
-'\#define' '[[:alnum:]]([[:blank:]]{2,})?\('
+'\#define' \
+'[[:alnum:]]([[:blank:]]{2,})?\('
 
 g 'There should be no space before closing parentheses.' \
 '[[:graph:]][[:blank:]]+\)'
 
 ag 'Braces should be on a separate line.' \
-'\{' 'if[[:blank:]]\(|while[[:blank:]]\(|switch[[:blank:]]\('
-
+'\{' \
+'if[[:blank:]]\(|while[[:blank:]]\(|switch[[:blank:]]\('
-- 
1.9.1



check_GNU_style.sh: Don't cat empty file

2015-05-18 Thread Tom de Vries

Hi,

committed as obvious.

Thanks,
- Tom
check_GNU_style.sh: Don't cat empty file

2015-05-18  Tom de Vries  t...@codesourcery.com

	* check_GNU_style.sh (g, ag, vg): Don't cat empty file.
---
 contrib/check_GNU_style.sh | 30 --
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index 728c11a..ab59b1e 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -84,10 +84,16 @@ grep $format '^+' $files \
 g (){
 local msg=$1
 local arg=$2
+
+local found=false
 cat $inp \
 	| egrep --color=always -- $arg \
-	 $tmp  printf \n$msg\n
-cat $tmp
+	 $tmp  found=true
+
+if $found; then
+	printf \n$msg\n
+	cat $tmp
+fi
 }
 
 # And Grep
@@ -95,11 +101,17 @@ ag (){
 local msg=$1
 local arg1=$2
 local arg2=$3
+
+local found=false
 cat $inp \
 	| egrep --color=always -- $arg1 \
 	| egrep --color=always -- $arg2 \
-	 $tmp  printf \n$msg\n
-cat $tmp
+	 $tmp  found=true
+
+if $found; then
+	printf \n$msg\n
+	cat $tmp
+fi
 }
 
 # reVerse Grep
@@ -107,11 +119,17 @@ vg (){
 local msg=$1
 local varg=$2
 local arg=$3
+
+local found=false
 cat $inp \
 	| egrep -v -- $varg \
 	| egrep --color=always -- $arg \
-	 $tmp  printf \n$msg\n
-cat $tmp
+	 $tmp  found=true
+
+if $found; then
+	printf \n$msg\n
+	cat $tmp
+fi
 }
 
 col (){
-- 
1.9.1



[Committed] check_GNU_style.sh: Declare local vars with local

2015-05-18 Thread Tom de Vries

Hi,

committed as obvious.

Thanks,
- Tom
check_GNU_style.sh: Declare local vars with local

2015-05-18  Tom de Vries  t...@codesourcery.com

	* check_GNU_style.sh (g, ag, vg, col): Declare local vars with local.
---
 contrib/check_GNU_style.sh | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index b86c474..728c11a 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -82,8 +82,8 @@ grep $format '^+' $files \
 
 # Grep
 g (){
-msg=$1
-arg=$2
+local msg=$1
+local arg=$2
 cat $inp \
 	| egrep --color=always -- $arg \
 	 $tmp  printf \n$msg\n
@@ -92,9 +92,9 @@ g (){
 
 # And Grep
 ag (){
-msg=$1
-arg1=$2
-arg2=$3
+local msg=$1
+local arg1=$2
+local arg2=$3
 cat $inp \
 	| egrep --color=always -- $arg1 \
 	| egrep --color=always -- $arg2 \
@@ -104,9 +104,9 @@ ag (){
 
 # reVerse Grep
 vg (){
-msg=$1
-varg=$2
-arg=$3
+local msg=$1
+local varg=$2
+local arg=$3
 cat $inp \
 	| egrep -v -- $varg \
 	| egrep --color=always -- $arg \
@@ -115,7 +115,7 @@ vg (){
 }
 
 col (){
-msg=$1
+local msg=$1
 local first=true
 local f
 for f in $files; do
-- 
1.9.1



Re: [PATCH 3/4] split-stack for powerpc64

2015-05-18 Thread Alan Modra
On Mon, May 18, 2015 at 12:24:51PM +0930, Alan Modra wrote:
 +error (%-fsplit-stack% currently only supported on PowerPC64 
 GNU/Linux with glibc-2.18 or later);

I forgot to comment on this.  2.19 is actually when __private_ss
appeared in the ppc tcbhead_t, but I misread the commit date and
thought it was 2.18.  I was going to correct that, but then wondered
if glibc allocates any spare bytes, and it looks like it does.  We
have a struct pthread before tcbhead_t, and struct pthread is aligned
according to TCB_ALIGNMENT which is 16 for powerpc.  So 2.18's 56
byte tcbhead_t, which is laid out so that the end coincides with
tp-0x7000, must be preceded by 8 bytes of padding.  Enough for
__private_ss, and we don't care if it isn't initially zero.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH][PR66010] Don't take address of ap unless necessary

2015-05-18 Thread Richard Biener
On Thu, 14 May 2015, Tom de Vries wrote:

 On 12-05-15 12:04, Richard Biener wrote:
   The fact that we have to handle this specially in both build_va_arg and
   gimplify_va_arg makes me wonder whether we should be dealing with all
   this in
   build_va_arg already.
   
   That is, determine whether we take the address, and add the address
   operator
   if so in build_va_arg already. Likewise, determine do_deref and pass that
   as
   extra operand.
  That would certainly be a nice cleanup.
 
 I've implemented this cleanup.
 
 Furthermore, I realized that in expand_ifn_va_arg_1, 'do_deref == 0' disables
 code:
 ...
   if (do_deref == integer_one_node)
 ap = build_fold_indirect_ref (ap);
 ...
 
 that has a copy in the target hook:
 ...
gpr = build3 (COMPONENT_REF, TREE_TYPE (f_gpr),
   build_va_arg_indirect_ref (valist), f_gpr, NULL_TREE);
   valist = build_va_arg_indirect_ref (valist);
 ...
 
 It's easier to remove that code from the few target hooks that do that,
 execute the 'do_deref == 1' code unconditionally in expand_ifn_va_arg_1 and
 simply remove all do_deref propagation (that also means we don't have to add
 an operand to VA_ARG_EXPR).
 
 
 Bootstrapped and reg-tested on x86_64, with and without -m32.
 
 Did partial build and compiled a few va_arg test-cases on the other targets.
 
 OK for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 - Tom
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)


[PATCH][calls.c] Remove #ifdef checks on STACK_GROWS_DOWNWARD

2015-05-18 Thread Kyrill Tkachov

Hi all,

As per https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01166.html
I saw that STACK_GROWS_DOWNWARD is being checked for ifdef in a number of places
unrelated to the patch in the PR, so decided to remove the ifdef checks for 
STACK_GROWS_DOWNWARD
in a separate patch. It's fairly mechanical.

Bootstrapped on x86_64 and aarch64 and tested on arm.
I believe these patches are considered obvious, so I'll commit this in a day, 
unless someone objects,
before proceeding with the changes in 
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01166.html

2015-05-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* calls.c: Always define STACK_GROWS_DOWNWARD as 0 or 1.
(mem_overlaps_already_clobbered_arg_p): Rewrite ifdef
STACK_GROWS_DOWNWARD as normal if.
(expand_call): Likewise.
commit 13a36809b55631dbc2c7e7dfb564857f6fb74339
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed May 13 10:20:59 2015 +0100

[calls.c] Remove #ifdef checks on STACK_GROWS_DOWNWARD

diff --git a/gcc/calls.c b/gcc/calls.c
index caa7d60..e11b1b9 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -81,6 +81,15 @@ along with GCC; see the file COPYING3.  If not see
 #include tree-chkp.h
 #include rtl-chkp.h
 
+
+/* Redefine STACK_GROWS_DOWNWARD in terms of 0 or 1.  */
+#ifdef STACK_GROWS_DOWNWARD
+# undef STACK_GROWS_DOWNWARD
+# define STACK_GROWS_DOWNWARD 1
+#else
+# define STACK_GROWS_DOWNWARD 0
+#endif
+
 /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
 #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
 
@@ -1974,11 +1983,12 @@ mem_overlaps_already_clobbered_arg_p (rtx addr, unsigned HOST_WIDE_INT size)
 return true;
   else
 i = INTVAL (val);
-#ifdef STACK_GROWS_DOWNWARD
-  i -= crtl-args.pretend_args_size;
-#else
-  i += crtl-args.pretend_args_size;
-#endif
+
+  if (STACK_GROWS_DOWNWARD)
+i -= crtl-args.pretend_args_size;
+  else
+i += crtl-args.pretend_args_size;
+
 
   if (ARGS_GROW_DOWNWARD)
 i = -i - size;
@@ -2908,12 +2918,13 @@ expand_call (tree exp, rtx target, int ignore)
   if (pass == 0)
 	{
 	  argblock = crtl-args.internal_arg_pointer;
-	  argblock
-#ifdef STACK_GROWS_DOWNWARD
-	= plus_constant (Pmode, argblock, crtl-args.pretend_args_size);
-#else
-	= plus_constant (Pmode, argblock, -crtl-args.pretend_args_size);
-#endif
+	  if (STACK_GROWS_DOWNWARD)
+	argblock
+	  = plus_constant (Pmode, argblock, crtl-args.pretend_args_size);
+	  else
+	argblock
+	  = plus_constant (Pmode, argblock, -crtl-args.pretend_args_size);
+
 	  stored_args_map = sbitmap_alloc (args_size.constant);
 	  bitmap_clear (stored_args_map);
 	}


Re: [RFA] More type narrowing in match.pd V2

2015-05-18 Thread Richard Biener
On Mon, May 18, 2015 at 11:17 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Thu, May 14, 2015 at 4:28 PM, Jeff Law l...@redhat.com wrote:
 On 05/14/2015 08:04 AM, Marc Glisse wrote:

 On Fri, 1 May 2015, Jeff Law wrote:

 Slight refactoring of the condition by using types_match as suggested
 by Richi.  I also applied the new types_match to 2 other patterns in
 match.pd where it seemed clearly appropriate.


 I would like to propose this small tweak (regtested ok). If we had a
 different type for trees and types, this would be overloading the
 function. We already do this in a few places, and I find the resulting
 shorter code more readable.

 2015-05-14  Marc Glisse  marc.gli...@inria.fr

  * generic-match-head.c (types_match): Handle non-types.
  * gimple-match-head.c (types_match): Likewise.
  * match.pd: Remove unnecessary TREE_TYPE for types_match.

 Update the comment for types_match and this is fine.

 No - it breaks genmatch autodetection of what trees escape.  It special-cases
 TREE_TYPE ().  See capture_info::walk_c_expr:

   /* Give up for C exprs mentioning captures not inside TREE_TYPE ().  */
   unsigned p_depth = 0;

 so the patch will pessimize GENERIC folding.

 Please do not apply.

Hmm, ok - if the functions are only ever used in if()s then it's ok.  You just
have to remember to never use them in (with ...) expressions or transform
results.  Like for example

 (with {
bool overflow_p;
wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), overflow_p);
   }

is bad because genmatch thinks that @1 and @2 might be used multiple
times in the simplification because they escape.  It doesn't have
sophisticated
enough analysis to see that mul is only used in a following 'if'.

I suppose we might want to emit warnings from genmatch.c whenever it has
to give up for multi-use (thus need to emit save_expr ()) or non-use
(need to use omit_one_operand if TREE_SIDE_EFFECTS, but as it doesn't
know whether it is used in the result we guard the whole pattern with
!TREE_SIDE_EFFECTS).

Richard.

 Richard.


 jeff



Re: [PATCH, PR middle-end/66134] Fix coalescing for bounds used in abnormal phi

2015-05-18 Thread Richard Biener
On Fri, May 15, 2015 at 11:43 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 This patch fixes misuse of abnormal bounds copy to avoid coalescing issue.   
 Bootstrapped and regtested for x86_64-unknown-linux-gnu.  Applied to trunk.  
 Is it OK for gcc-5?

Hum.  I'm not sure how you can assert the value is defined by a
gimple-assign.  (what about PHIs?)  also see below

 Thanks,
 Ilya
 --
 gcc/

 2015-05-15  Ilya Enkovich  enkovich@gmail.com

 PR middle-end/66134
 * tree-chkp.c (chkp_get_orginal_bounds_for_abnormal_copy): New.
 (chkp_maybe_copy_and_register_bounds): Don't copy abnormal copy.

 gcc/testsuite/

 2015-05-15  Ilya Enkovich  enkovich@gmail.com

 PR middle-end/66134
 * gcc.target/i386/mpx/pr66134.c: New test.


 diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr66134.c 
 b/gcc/testsuite/gcc.target/i386/mpx/pr66134.c
 new file mode 100755
 index 000..3889674
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/mpx/pr66134.c
 @@ -0,0 +1,17 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fcheck-pointer-bounds -mmpx -fno-tree-ccp } */
 +
 +extern int vfork (void) __attribute__ ((__nothrow__ , __leaf__));
 +void test1 (void);
 +void test2 (void);
 +void test3 (int *);
 +
 +void test (int *p)
 +{
 + test1 ();
 + p++;
 + test2 ();
 + p++;
 + vfork ();
 + test3 (p);
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index 288470b..17a52bc 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -1097,7 +1097,20 @@ chkp_get_bounds_var (tree ptr_var)
return bnd_var;
  }

 +/* If BND is an abnormal bounds copy, return a copied value.
 +   Otherwise return BND.  */
 +static tree
 +chkp_get_orginal_bounds_for_abnormal_copy (tree bnd)
 +{
 +  if (bitmap_bit_p (chkp_abnormal_copies, SSA_NAME_VERSION (bnd)))
 +{
 +  gimple bnd_def = SSA_NAME_DEF_STMT (bnd);
 +  gcc_checking_assert (gimple_code (bnd_def) == GIMPLE_ASSIGN);
 +  bnd = gimple_assign_rhs1 (bnd_def);
 +}

 +  return bnd;
 +}

  /* Register bounds BND for object PTR in global bounds table.
 A copy of bounds may be created for abnormal ssa names.
 @@ -1141,11 +1154,7 @@ chkp_maybe_copy_and_register_bounds (tree ptr, tree 
 bnd)
/* For abnormal copies we may just find original
  bounds and use them.  */
if (!abnormal_ptr  !SSA_NAME_IS_DEFAULT_DEF (bnd))
 -   {
 - gimple bnd_def = SSA_NAME_DEF_STMT (bnd);
 - gcc_checking_assert (gimple_code (bnd_def) == GIMPLE_ASSIGN);
 - bnd = gimple_assign_rhs1 (bnd_def);
 -   }
 +   bnd = chkp_get_orginal_bounds_for_abnormal_copy (bnd);
/* For undefined values we usually use none bounds
  value but in case of abnormal edge it may cause
  coalescing failures.  Use default definition of
 @@ -1177,6 +1186,7 @@ chkp_maybe_copy_and_register_bounds (tree ptr, tree bnd)
 copy = make_temp_ssa_name (pointer_bounds_type_node,
gimple_build_nop (),

Just pass NULL here to avoid creating a garbage NOP stmt.

CHKP_BOUND_TMP_NAME);
 + bnd = chkp_get_orginal_bounds_for_abnormal_copy (bnd);
   assign = gimple_build_assign (copy, bnd);

   if (dump_file  (dump_flags  TDF_DETAILS))


Re: Small fold-const.c / match.pd tweaks

2015-05-18 Thread Richard Biener
On Fri, May 15, 2015 at 12:52 PM, Marc Glisse marc.gli...@inria.fr wrote:
 Hello,

 these are just a few tweaks / clean-ups. The 'ord' pattern is the dual of
 the 'unord' one I added last month. The tree_unary_nonnegative_warnv_p
 change should have no effect (it is only used from VRP which does not handle
 vectors (yet?)) but it seems right to handle vectors like scalars here.

 Regtested on ppc64le-redhat-linux.

 2015-05-15  Marc Glisse  marc.gli...@inria.fr

 PR tree-optimization/63387
 gcc/
 * match.pd ((X /[ex] A) * A - X): Remove unnecessary condition.
 ((x ord x)  (y ord y) - (x ord y),
 (x ord x)  (x ord y) - (x ord y)): New simplifications.
 * fold-const.c (tree_unary_nonnegative_warnv_p) ABS_EXPR: Handle
 vectors like scalars.
 gcc/testsuite/
 * gcc.dg/pr63387-2.c: New testcase.

 --
 Marc Glisse
 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c(revision 223214)
 +++ gcc/fold-const.c(working copy)
 @@ -14685,21 +14685,21 @@ tree_unary_nonnegative_warnv_p (enum tre
 bool *strict_overflow_p)
  {
if (TYPE_UNSIGNED (type))
  return true;

switch (code)
  {
  case ABS_EXPR:
/* We can't return 1 if flag_wrapv is set because
  ABS_EXPRINT_MIN = INT_MIN.  */
 -  if (!INTEGRAL_TYPE_P (type))
 +  if (!ANY_INTEGRAL_TYPE_P (type))
 return true;
if (TYPE_OVERFLOW_UNDEFINED (type))
 {
   *strict_overflow_p = true;
   return true;
 }
break;

  case NON_LVALUE_EXPR:
  case FLOAT_EXPR:
 Index: gcc/match.pd
 ===
 --- gcc/match.pd(revision 223214)
 +++ gcc/match.pd(working copy)
 @@ -791,22 +791,21 @@ along with GCC; see the file COPYING3.
  TYPE_PRECISION (type) = TYPE_PRECISION (TREE_TYPE (@0))
  operand_equal_p (@1, build_low_bits_mask (TREE_TYPE (@1),
 TYPE_PRECISION (type)),
 0))
 (convert @0)))


  /* (X /[ex] A) * A - X.  */
  (simplify
(mult (convert? (exact_div @0 @1)) @1)
/* Look through a sign-changing conversion.  */
 -  (if (TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
 -   (convert @0)))
 +  (convert @0))

Hmm, indeed.  Looks like we were overly cautionous (in forwprop where I moved
this from)

  /* Canonicalization of binary operations.  */

  /* Convert X + -C into X - C.  */
  (simplify
   (plus @0 REAL_CST@1)
   (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
(with { tree tem = fold_unary (NEGATE_EXPR, type, @1); }
 (if (!TREE_OVERFLOW (tem) || !flag_trapping_math)
  (minus @0 { tem; })
 @@ -944,22 +943,29 @@ along with GCC; see the file COPYING3.
  (icmp @0 @1))
 (if (ic == ncmp)
  (ncmp @0 @1)

  /* Unordered tests if either argument is a NaN.  */
  (simplify
   (bit_ior (unordered @0 @0) (unordered @1 @1))
   (if (types_match (@0, @1))
(unordered @0 @1)))
  (simplify
 + (bit_and (ordered @0 @0) (ordered @1 @1))
 + (if (types_match (@0, @1))
 +  (ordered @0 @1)))

 (for op (unordered ordered)
  op2 (bit_ior bit_and)
  (simplify
(op2 (op @0 @0) (op @1 @1))
(if ...
  (op @0 @1)))

would be a merged variant (nor sure if better - if the list increases
maybe).  I suppose
your version is easier to read and not very much bigger.

Patch is ok as-is.

Thanks,
Richard.

 +(simplify
   (bit_ior:c (unordered @0 @0) (unordered:c@2 @0 @1))
   @2)
 +(simplify
 + (bit_and:c (ordered @0 @0) (ordered:c@2 @0 @1))
 + @2)

  /* Simplification of math builtins.  */

  (define_operator_list LOG BUILT_IN_LOGF BUILT_IN_LOG BUILT_IN_LOGL)
  (define_operator_list EXP BUILT_IN_EXPF BUILT_IN_EXP BUILT_IN_EXPL)
  (define_operator_list LOG2 BUILT_IN_LOG2F BUILT_IN_LOG2 BUILT_IN_LOG2L)
  (define_operator_list EXP2 BUILT_IN_EXP2F BUILT_IN_EXP2 BUILT_IN_EXP2L)
  (define_operator_list LOG10 BUILT_IN_LOG10F BUILT_IN_LOG10 BUILT_IN_LOG10L)
  (define_operator_list EXP10 BUILT_IN_EXP10F BUILT_IN_EXP10 BUILT_IN_EXP10L)
  (define_operator_list POW BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
 Index: gcc/testsuite/gcc.dg/pr63387-2.c
 ===
 --- gcc/testsuite/gcc.dg/pr63387-2.c(revision 0)
 +++ gcc/testsuite/gcc.dg/pr63387-2.c(working copy)
 @@ -0,0 +1,26 @@
 +/* { dg-do compile } */
 +/* { dg-options -O -fdump-tree-optimized } */
 +
 +int f(double aaa, double bbb){
 +  int xa = !__builtin_isunordered(aaa, aaa);
 +  int xb = !__builtin_isunordered(bbb, bbb);
 +  return xa  xb;
 +}
 +
 +int g(double aaa, double bbb){
 +  int xa = !__builtin_isunordered(aaa, bbb);
 +  int xb = !__builtin_isunordered(bbb, bbb);
 +  return xa  xb;
 +}
 +
 +int h(double ccc, float ddd){
 +  int xc = !__builtin_isunordered(ccc, ccc);
 +  int xd = !__builtin_isunordered(ddd, ddd);
 +  return xc  xd;
 +}
 +
 

Re: match.pd: (x | y) ~x - y ~x

2015-05-18 Thread Richard Biener
On Fri, May 15, 2015 at 7:22 PM, Marc Glisse marc.gli...@inria.fr wrote:
 Hello,

 we already have the more complicated: x  ~(x  y) - x  ~y (which I am
 reindenting by the way) and the simpler: (~x | y)  x - x  y, so I am
 proposing this one for completeness. Regtested on ppc64le-redhat-linux.

Ok (doesn't seem to be in fold-const.c).

Btw, there are quite some (simple) ones only in fold-const.c which are eligible
for moving to match.pd (thus remove them from fold-const.c and implement in
match.pd).  Mostly canonicalization ones though.

Richard.

 2015-05-15  Marc Glisse  marc.gli...@inria.fr

 gcc/
 * match.pd ((x | y)  ~x - y  ~x, (x  y) | ~x - y | ~x): New
 simplifications.
 gcc/testsuite/
 * gcc.dg/nand.c: New testcase.

 --
 Marc Glisse
 Index: match.pd
 ===
 --- match.pd(revision 223217)
 +++ match.pd(working copy)
 @@ -257,24 +257,32 @@ along with GCC; see the file COPYING3.

  /* x + (x  1) - (x + 1)  ~1 */
  (simplify
   (plus:c @0 (bit_and@2 @0 integer_onep@1))
   (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
(bit_and (plus @0 @1) (bit_not @1

  /* x  ~(x  y) - x  ~y */
  /* x | ~(x | y) - x | ~y  */
  (for bitop (bit_and bit_ior)
 -  (simplify
 -(bitop:c @0 (bit_not (bitop:c@2 @0 @1)))
 -  (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
 -   (bitop @0 (bit_not @1)
 + (simplify
 +  (bitop:c @0 (bit_not (bitop:c@2 @0 @1)))
 +   (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
 +(bitop @0 (bit_not @1)
 +
 +/* (x | y)  ~x - y  ~x */
 +/* (x  y) | ~x - y | ~x */
 +(for bitop (bit_and bit_ior)
 + rbitop (bit_ior bit_and)
 + (simplify
 +  (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
 +  (bitop @1 @2)))

  (simplify
   (abs (negate @0))
   (abs @0))
  (simplify
   (abs tree_expr_nonnegative_p@0)
   @0)


  /* Try to fold (type) X op CST - (type) (X op ((type-x) CST))
 Index: testsuite/gcc.dg/nand.c
 ===
 --- testsuite/gcc.dg/nand.c (revision 0)
 +++ testsuite/gcc.dg/nand.c (working copy)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O -fdump-tree-original } */
 +
 +unsigned f(unsigned x, unsigned y){
 +  return (x | y)  ~x;
 +}
 +unsigned g(unsigned x, unsigned y){
 +  return ~x  (y | x);
 +}
 +
 +/* { dg-final { scan-tree-dump-times return ~x  y; 2 original } } */
 +/* { dg-final { cleanup-tree-dump original } } */



  1   2   >