Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-20 Thread Jason Merrill

On 11/19/2015 03:46 PM, Jason Merrill wrote:

On 11/15/2015 12:01 AM, David Malcolm wrote:

As with the C frontend, there's an issue with tree nodes that
don't have locations: VAR_DECL, INTEGER_CST, etc:

   int test (int foo)
   {
 return foo * 100;
^^^   ^^^
   }

where we'd like to access the source spelling ranges of the expressions
during parsing, so that we can use them when reporting parser errors.


Hmm, I had been thinking to address this in the C++ front end by
wrapping uses in another tree: NOP_EXPR for rvalues, VIEW_CONVERT_EXPR
for lvalues.


On the other hand, my direction seems likely to cause more issues, 
especially with code that doesn't yet know how to handle 
VIEW_CONVERT_EXPR, and could create ambiguity with explicit conversions. 
 So I guess your approach seems reasonable.


What is the memory consumption impact of this change?


Also, in cp_parser_new_expression I attempted to generate meaningful
ranges e.g.:

  int *foo = new int[100];
 ^~~~

but it seems to be hard to do this, due to the trailing optional
components; I found myself wanting to ask the lexer for the last
token it consumed (in particular, I'm interested in the
location_t of that token).  Is this a reasonable thing to add to the
lexer?


cp_lexer_previous_token seems like what you want.


-  return cp_build_unary_op (code, arg1, candidates != 0, complain);
+  {
+   tree result = cp_build_unary_op (code, arg1, candidates != 0, complain);
+   protected_set_expr_location (result, loc);


I'd much rather pass loc into cp_build_unary_op.  At least add a FIXME 
to that effect.  Likewise in the other places you call *build* and then 
set the loc.



+#if 0
+/* FIXME: various assertions can be put in here when debugging,
+   for tracking down where location information gets thrown
+   away (during a trip through a purely "tree" value).  */
+if (m_value && m_value != error_mark_node)
+  {
+   if (TREE_CODE (m_value) == FUNCTION_DECL)
+ return; // for now
+   gcc_assert (CAN_HAVE_LOCATION_P (m_value));
+   //gcc_assert (m_loc != UNKNOWN_LOCATION);


Yeah, I don't think you want to assert on UNKNOWN_LOCATION; some code 
does not and should not have an associated location.  In particular, 
cleanups.



 Build the assignment expression.  Its default
-location is the location of the '=' token.  */
+location:
+  LHS = RHS
+  ^
+is the location of the '=' token as the
+caret, ranging from the start of the lhs to the
+end of the rhs.  */
  saved_input_location = input_location;
+ loc = make_location (loc,
+  expr.get_start (),
+  rhs.get_finish ());
  input_location = loc;
  expr = build_x_modify_expr (loc, expr,
  assignment_operator,
  rhs,
  complain_flags (decltype_p));
+ protected_set_expr_location (expr, loc);
  input_location = saved_input_location;


Do we still need to mess with input_location here?  If so, please add a 
FIXME explaining what still needs to be fixed.


Jason



Re: [PATCH] Fix debug info related ICE when inlining back fnsplit function (PR debug/66432)

2015-11-20 Thread Richard Biener
On November 20, 2015 9:11:01 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>This patch fixes ICE where a parameter mentioned in a debug source bind
>has its VLA type mistakenly remapped and that leads to inconsistent
>type
>between the PARM_DECL and SSA_NAMEs derived from it.
>
>The patch Tom posted for this can't work, because we assume that the
>s=> value as well as debug_decl_args decl in that case is DECL_ORIGIN
>of the PARM_DECL.  But, in this case we are replacing the DECL_ORIGIN
>PARM_DECL immediately with a DEBUG_EXPR_DECL anyway, so there is no
>point remapping the var we don't own (it could be in a different
>function
>etc.).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk/5.3?

OK.
Thanks,
Richard.

>2015-11-20  Jakub Jelinek  
>
>   PR debug/66432
>   * tree-inline.c (copy_debug_stmt): If
>   gimple_debug_source_bind_get_value is DECL_ORIGIN of a PARM_DECL
>   in decl_debug_args, don't call remap_gimple_op_r on it.
>
>   * gcc.dg/debug/pr66432.c: New test.
>
>--- gcc/tree-inline.c.jj   2015-11-14 19:36:03.0 +0100
>+++ gcc/tree-inline.c  2015-11-20 17:17:00.632082622 +0100
>@@ -2864,8 +2864,6 @@ copy_debug_stmt (gdebug *stmt, copy_body
>   else if (gimple_debug_source_bind_p (stmt))
> {
>   gimple_debug_source_bind_set_var (stmt, t);
>-  walk_tree (gimple_debug_source_bind_get_value_ptr (stmt),
>-   remap_gimple_op_r, &wi, NULL);
>   /* When inlining and source bind refers to one of the optimized
>away parameters, change the source bind into normal debug bind
>referring to the corresponding DEBUG_EXPR_DECL that should have
>@@ -2889,7 +2887,10 @@ copy_debug_stmt (gdebug *stmt, copy_body
>   break;
> }
>   }
>-  }  
>+  }
>+  if (gimple_debug_source_bind_p (stmt))
>+  walk_tree (gimple_debug_source_bind_get_value_ptr (stmt),
>+ remap_gimple_op_r, &wi, NULL);
> }
> 
>   processing_debug_stmt = 0;
>--- gcc/testsuite/gcc.dg/debug/pr66432.c.jj2015-11-20
>17:40:44.589171083 +0100
>+++ gcc/testsuite/gcc.dg/debug/pr66432.c   2015-11-20 17:38:48.0
>+0100
>@@ -0,0 +1,19 @@
>+/* PR debug/66432 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -g" } */
>+
>+extern void baz (const char *, const char *) __attribute__
>((__noreturn__));
>+
>+void
>+foo (int x, int y[x][x])
>+{
>+  if (x < 2)
>+baz ("", "");
>+}
>+
>+void
>+bar (void)
>+{
>+  int z[2][2] = { { 1, 2 }, { 3, 4 } };
>+  foo (2, z);
>+}
>
>   Jakub




Re: RFA: PATCH to match.pd for c++/68385

2015-11-20 Thread Richard Biener
On November 20, 2015 8:58:15 PM GMT+01:00, Jason Merrill  
wrote:
>In this bug, we hit the (A & sign-bit) != 0 -> A < 0 transformation. 
>Because of delayed folding, the operands aren't fully folded yet, so we
>
>have NOP_EXPRs around INTEGER_CSTs, and so calling wi::only_sign_bit_p 
>ICEs.  We've been seeing several similar bugs, where code calls 
>integer_zerop and therefore assumes that they have an INTEGER_CST, but 
>in fact integer_zerop does STRIP_NOPS.
>
>This patch changes the pattern to only match if the operand is actually
>
>an INTEGER_CST.  Alternatively we could call tree_strip_nop_conversions
>
>on the operand, but I would expect that to have issues when the 
>conversion changes the signedness of the type.
>
>OK if testing passes?

What happens if we remove the nops stripping from integer_zerop?  Do other 
integer predicates strip nops?

Richard.




[PATCH] 23_containers/vector/profile/vector.cc

2015-11-20 Thread David Edelsohn
The testcase allocates so much memory that it requires an additional
option to enabled higher memory limit on AIX.

Bootstrapped on powerpc-ibm-aix7.1.0.0

Thanks, David

* testsuite/23_containers/vector/profile/vector.cc: Add maxdata option on AIX.

Index: 23_containers/vector/profile/vector.cc
===
--- 23_containers/vector/profile/vector.cc  (revision 230674)
+++ 23_containers/vector/profile/vector.cc  (working copy)
@@ -2,6 +2,8 @@
 // Advice: set tmp as 1

 // { dg-options "-DITERATIONS=20" { target simulator } }
+// AIX requires higher memory limit
+// { dg-additional-options "-Wl,-bmaxdata:0x2000" { target {
powerpc-ibm-aix* } } }

 #ifndef ITERATIONS
 #define ITERATIONS 2000


libgo patch committed: Fix offset handling in syscall.Sendfile

2015-11-20 Thread Ian Lance Taylor
PR 66378 points out that syscall.Sendfile does not correct handle an
offset argument: it ignored the initial value.  This patch fixes the
problem.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline and GCC 5 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230695)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-81dcb1ba4de82a6c9325cb322d5a832a6b1f168d
+97ec885c715b3922b0866c081554899b8d50933a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/syscall/libcall_bsd.go
===
--- libgo/go/syscall/libcall_bsd.go (revision 230463)
+++ libgo/go/syscall/libcall_bsd.go (working copy)
@@ -17,6 +17,7 @@ func Sendfile(outfd int, infd int, offse
var soff Offset_t
var psoff *Offset_t
if offset != nil {
+   soff = Offset_t(*offset)
psoff = &soff
}
written, err = sendfile(outfd, infd, psoff, count)
Index: libgo/go/syscall/libcall_linux.go
===
--- libgo/go/syscall/libcall_linux.go   (revision 230463)
+++ libgo/go/syscall/libcall_linux.go   (working copy)
@@ -327,6 +327,7 @@ func Sendfile(outfd int, infd int, offse
var soff Offset_t
var psoff *Offset_t
if offset != nil {
+   soff = Offset_t(*offset)
psoff = &soff
}
written, err = sendfile(outfd, infd, psoff, count)


Re: [PATCH,RFC] Introduce RUN_UNDER_VALGRIND in test-suite

2015-11-20 Thread Hans-Peter Nilsson
On Thu, 19 Nov 2015, Martin Li?ka wrote:
> Hello.
>
> In last two weeks I've removed couple of memory leaks, mainly tight to 
> middle-end.
> Currently, a user of the GCC compiler can pass '--enable-checking=valgrind' 
> configure option
> that will run all commands within valgrind environment, but as the valgrind 
> runs just with '-q' option,
> the result is not very helpful.
>
> I would like to start with another approach, where we can run all tests in 
> test-suite
> within the valgrind sandbox and return an exit code if there's an error seen 
> by the tool.
> That unfortunately leads to many latent (maybe false positives, FE issues, 
> ...) that can
> be efficiently ignored by valgrind suppressions file (the file is part of 
> suggested patch).
>
> The first version of the valgrind.supp can survive running compilation of 
> tramp3d with -O2
> and majority of tests in test-suite can successfully finish. Most of memory 
> leaks
> mentioned in the file can be eventually fixed.

I didn't quite understand the need for the suppression files.
Is it like Markus said, only because valgrind annotations are
not on by default?  Then let's change it so that's the default
during DEV-PHASE = experimental (the development phase) or
prerelease.  I really thought that was the case by now.
(The suppression files are IMHO a useful addition to contrib/
either way.)

> As I noticed in results log files, most of remaining issues are connected to 
> gcc.c and
> lto-wrapper.c files. gcc.c heavily manipulates with strings and it would 
> probably require
> usage of a string pool, that can easily eventually removed (just in case of 
> --enable-valgrind-annotations).
> The second source file tends to produce memory leaks because of fork/exec 
> constructs. However both
> can be improved during next stage1.
>
> Apart from aforementioned issues, the compiler does not contain so many 
> issues and I think it's
> doable to prune them and rely on reported valgrind errors.
>
> Patch touches many .exp files, but basically does just couple of 
> modifications:
>
> 1) gcc-defs.exp introduces new global variable run_under_valgrind
> 2) new procedure dg-run-valgrind distinguishes between just passing options 
> to 'gd-test',
>or runs 'dg-test' with additional flags that enable valgrind (using 
> -wrapper)
> 3) new procedure dg-runtest-valgrind does the similar
> 4) many changes in corresponding *.exp files that utilize these procedures
>
> The patch should be definitely part of next stage1, but I would appreciate 
> any thoughts
> about the described approach?

IIRC you can replace the actual dg-runtest proc with your own
(implementing a wrapper).  Grep aroung, I think we do that
already.  That's certainly preferable instead of touching all
callers.

>
> Thank you,
> Martin

brgds, H-P


ICF fixes

2015-11-20 Thread Jan Hubicka
Hi,
this patchs fixes few issues in ipa-icf.  First I drop the use of TYPE_CANONICAL
because that is no longer part of type compatibility machinery.
Second I also hash TYPE_MODE for aggregates, becuase we now always require a 
match
and I check that we don't match types that are incomplete where this would give
false negatives.

Second change is in func_checker::compatible_types_p where I disabled comparing
of alias sets of types that do not have alias sets (and thus can never be read).
I hoped to switch ipa-icf-gimple.c to use of operand_equal_p and drop all such
wrong type mathcing, but I suppose it will need to wait for next stage1. 

Bootstrapped/regtested x86_64-linux, comitted.

Honza

* ipa-icf.c (sem_item::add_type): Do not look for TYPE_CANONICAL;
do not check AGGREGATE_TYPE_P when adding TYPE_MODE;
Check that all record types are complete.
* ipa-icf-gimple.c (func_checker::compatible_types_p): Do not
compare alias sets for types w/o alias sets.
Index: ipa-icf.c
===
--- ipa-icf.c   (revision 230683)
+++ ipa-icf.c   (working copy)
@@ -1543,11 +1543,8 @@ sem_item::add_type (const_tree type, inc
 }
 
   type = TYPE_MAIN_VARIANT (type);
-  if (TYPE_CANONICAL (type))
-type = TYPE_CANONICAL (type);
 
-  if (!AGGREGATE_TYPE_P (type))
-hstate.add_int (TYPE_MODE (type));
+  hstate.add_int (TYPE_MODE (type));
 
   if (TREE_CODE (type) == COMPLEX_TYPE)
 {
@@ -1574,6 +1571,7 @@ sem_item::add_type (const_tree type, inc
 }
   else if (RECORD_OR_UNION_TYPE_P (type))
 {
+  gcc_checking_assert (COMPLETE_TYPE_P (type));
   hashval_t *val = optimizer->m_type_hash_cache.get (type);
 
   if (!val)
Index: ipa-icf-gimple.c
===
--- ipa-icf-gimple.c(revision 230683)
+++ ipa-icf-gimple.c(working copy)
@@ -233,7 +233,15 @@ func_checker::compatible_types_p (tree t
   if (!types_compatible_p (t1, t2))
 return return_false_with_msg ("types are not compatible");
 
-  if (get_alias_set (t1) != get_alias_set (t2))
+  /* We do a lot of unnecesary matching of types that are not being
+ accessed and thus do not need to be compatible.  In longer term we should
+ remove these checks on all types which are not accessed as memory
+ locations.
+
+ For time being just avoid calling get_alias_set on types that are not
+ having alias sets defined at all.  */
+  if (type_with_alias_set_p (t1) && type_with_alias_set_p (t2)
+  && get_alias_set (t1) != get_alias_set (t2))
 return return_false_with_msg ("alias sets are different");
 
   return true;


Go testsuite patch committed: Skip nilptr.go if PIE

2015-11-20 Thread Ian Lance Taylor
This patch to the Go testsuite skips the nilptr.go test when the
default is to produce PIE.  This should fix GCC PR 66406.  Ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2015-11-20  Ian Lance Taylor  

PR go/66406
* go.test/go-test.exp (go-gc-tests): Skip nilptr.go if PIE.
Index: go.test/go-test.exp
===
--- go.test/go-test.exp (revision 229630)
+++ go.test/go-test.exp (working copy)
@@ -398,6 +398,11 @@ proc go-gc-tests { } {
}
}
 
+   if [check_effective_target_pie_enabled] {
+   untested $test
+   continue
+   }
+
if { [file tail $test] == "init1.go" } {
# This tests whether GC runs during init, which for gccgo
# it currently does not.


libgo patch committed: Don't run multicast tests on nil interface if -test.short

2015-11-20 Thread Ian Lance Taylor
This libgo patch is a backport of https://golang.org/cl/17154.  It
fixes the tests for the net package to not run the multicast tests on
a nil interface when using -test.short, which is the default when
running make check.  This should fix GCC PR 65785.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline and
GCC 5 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230694)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-b839c8c35af49bd6d86306ad34449654a4657cb1
+81dcb1ba4de82a6c9325cb322d5a832a6b1f168d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/net/listen_test.go
===
--- libgo/go/net/listen_test.go (revision 230463)
+++ libgo/go/net/listen_test.go (working copy)
@@ -542,7 +542,7 @@ func TestIPv4MulticastListener(t *testin
// routing stuff for finding out an appropriate
// nexthop containing both network and link layer
// adjacencies.
-   if ifi == nil && !*testExternal {
+   if ifi == nil && (testing.Short() || !*testExternal) {
continue
}
for _, tt := range ipv4MulticastListenerTests {
@@ -618,7 +618,7 @@ func TestIPv6MulticastListener(t *testin
// routing stuff for finding out an appropriate
// nexthop containing both network and link layer
// adjacencies.
-   if ifi == nil && (!*testExternal || !*testIPv6) {
+   if ifi == nil && (testing.Short() || !*testExternal || 
!*testIPv6) {
continue
}
for _, tt := range ipv6MulticastListenerTests {


libgo: use clock_gettime to get current time

2015-11-20 Thread Ian Lance Taylor
PR 66574 aka https://golang.org/issue/11222 points out that libgo is
only retrieving the current time in microseconds, when it should be
using nanoseconds.  This patch fixes the problem.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230689)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-f79db38cf3484b63f7807abef05eecb23e9d0806
+b839c8c35af49bd6d86306ad34449654a4657cb1
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/configure.ac
===
--- libgo/configure.ac  (revision 230463)
+++ libgo/configure.ac  (working copy)
@@ -501,9 +501,10 @@ PTHREAD_LIBS=
 AC_CHECK_LIB([pthread], [pthread_create], PTHREAD_LIBS=-lpthread)
 AC_SUBST(PTHREAD_LIBS)
 
-dnl Test if -lrt is required for sched_yield and/or nanosleep.
+dnl Test if -lrt is required for sched_yield or nanosleep or clock_gettime.
 AC_SEARCH_LIBS([sched_yield], [rt])
 AC_SEARCH_LIBS([nanosleep], [rt])
+AC_SEARCH_LIBS([clock_gettime], [rt])
 
 AC_C_BIGENDIAN
 
Index: libgo/runtime/go-now.c
===
--- libgo/runtime/go-now.c  (revision 230463)
+++ libgo/runtime/go-now.c  (working copy)
@@ -13,11 +13,11 @@
 struct time_now_ret
 now()
 {
-  struct timeval tv;
+  struct timespec ts;
   struct time_now_ret ret;
 
-  gettimeofday (&tv, NULL);
-  ret.sec = tv.tv_sec;
-  ret.nsec = tv.tv_usec * 1000;
+  clock_gettime (CLOCK_REALTIME, &ts);
+  ret.sec = ts.tv_sec;
+  ret.nsec = ts.tv_nsec;
   return ret;
 }


[PATCH/RFC v2] PR68212: Improve Accounting of Block Frequencies During Loop Unrolling

2015-11-20 Thread Kelvin Nilsen
The problem addressed by this patch is that the intermediate code 
produced during loop unrolling has incorrect block frequencies.  The 
erroneous block frequencies result because block frequencies are not 
adjusted to account for the execution contexts into which they are 
copied.  These incorrect frequencies disable and/or confuse subsequent 
compiler optimizations.  The general idea of how we address this problem 
is two fold:


  1. Before a loop body is unpeeled into a pre-header location, we 
temporarily adjust the loop body frequencies to represent the values 
appropriate for the context into which the loop body is to be copied.


  2. After unrolling the loop body (by replicating the loop body (N-1) 
times within the loop), we recompute all frequencies associated with 
blocks contained within the loop.


This revision of the patch distributed earlier (See 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00814.html) addresses 
various concerns raised as feedback in response to the original post.  
This new patch includes the following improvements:


1. Various formatting and style improvements:
a) Comments describing functions use all-caps naming of arguments, 
do not start continuation lines with an asterisk, and end with a comment 
termination sequence ("*/") on the same line as the comment text.
b) Compound statements (enclosed within braces) consisting of only 
one simple statement are now represented as single statements (without 
the braces).
c) Gratuitous changes in white space between old and new code 
revisions have been removed.

d) Several obvious and/or redundant comments have been removed.
e) Comments previously tagged to the end of a C statement have been 
moved to the line that precedes the statement.
f) The ChangeLog has been rewritten and reformatted according to 
guidelines provided in the "GNU Coding Standards"


 2. Certain functions have been rewritten and/or removed:
 a) in_loop_p() was written to use bb->loop_father and 
flow_loop_nested_p() instead of iteration over all blocks in the loop.
 b) same_edge_p() was removed.  All invocations of this function 
are replaced with pointer equality tests.

 c) in_loop_of_header_p() was removed as it was deemed unnecessary.
 d) in_block_set_p() was removed as it was deemed unnecessary.
 e) get_exit_edges_from_loop_blocks() was removed as it was deemed 
unnecessary.
 f) internal() was removed as it was deemed redundant with 
fatal_error().


 3. Improvements include:
 a) A set of 15 test programs has been added into the 
testsuite/gcc.dg/ subdirectory.  Some of these test programs still fail 
and work is ongoing to identify and address the reasons for the failures 
. In several cases, it is known that the failures result from block 
frequency inaccuracies that originate outside of the loop unrolling code.
 b) Where concerns were raised about the potential for unbounded 
recursion in some of the functions, comments now clarify the bound on 
the recursion depth.
 c) Code that had been conditionally compiled under the 
ENABLE_CHECKING attribute is now unconditionally compiled and executed 
only if the flag_checking variable is non-zero.


gcc/ChangeLog:

2015-11-20  Kelvin Nilsen  

* loop-unroll.c (unroll_loop_constant_iterations): After
replicating the loop body within a loop, recompute the
frequencies for all blocks contained within the loop.
(unroll_loop_runtime_iterations): Before copying loop
body to preheader location, temporarily adjust the loop body
frequencies to represent the context into which the loop body will
be copied.  After replicating the loop body within a loop,
recompute the frequencies for all blocks contained within the
loop.

* cfgloopmanip.c (in_loop_p): New helper function.
(zero_loop_frequencies): New helper function.
(in_edge_set_p): New helper function.
(in_call_chain_p): New helper function.
(recursively_zero_frequency): New helper function.
(recursion_detected_p): New helper function.
(zero_partial_loop_frequencies): New helper function.
(recursively_increment_frequency): New helper function.
(increment_loop_frequencies): New helper function.
(check_loop_frequency_integrity): New helper function.
(can_duplicate_loop_p): New helper function.
(duplicate_loop_to_header_edge): Recompute loop frequencies
after blocks are replicated (unrolled) into the loop body.

* cfgloopmanip.h: New extern declarations.


gcc/testsuite/ChangeLog:

2015-11-20  Kelvin Nilsen  

* gcc.dg/pr68212-1.c: New test.
* gcc.dg/pr68212-10.c: New test.
* gcc.dg/pr68212-11.c: New test.
* gcc.dg/pr68212-12.c: New test.
* gcc.dg/pr68212-13.c: New test.
* gcc.dg/pr68212-14.c: New test.
* gcc.dg/pr68212-15.c: New test.
* gcc.dg/pr68212-2.c: New test.
* gcc.dg/pr68212-3.c: New test.
* gcc.dg/pr68212-4.c: New test.
* gcc.d

Re: [PATCH, rs6000, gcc 5 backport] Fix PR target/67808, LRA ICE on double to long double conversion

2015-11-20 Thread Michael Meissner
On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote:
> PR67808 exposes a problem with the constraints in the *extenddftf2_internal
> pattern, in that it allows TFmode operands to occupy Altivec registers
> which they are not allowed to do.  Reload was able to work around the
> problem, but LRA is more pedantic and it caused it to go into an infinite
> spill loop until it ICEd.  The following patch from Mike changes the TFmode
> output operand to use the "d" constraint instead of "ws".  It also allows
> using the "ws" constraint for the two input operands, since that is allowed
> for DFmode operands.
> 
> This passed bootstraps (with reload on by default and lra on by default)
> and shows no testsuite regressions.  Is this ok for trunk?
> 
> The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for
> that too, assuming my bootstrap/regtesting there are clean?

The following patch backports the fix to GCC 5.x.  There were no regressions in
doing the bootstrap/make check on both a big endian power7 system and a little
endian power8 system.  Is it ok to apply the patch to the gcc-5 branch?

2015-10-20  Michael Meissner  

Back port from trunk:
2015-10-05  Michael Meissner  
Peter Bergner  

PR target/67808
* config/rs6000/rs6000.md (extenddftf2): In the expander, only
allow registers, but provide insns for the combiner to create for
loads from memory. Separate VSX code from non-VSX code. For
non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename
externaldftf2_internal to externaldftf2_fprs. Reorder constraints
so that registers come before memory operations. Drop support from
converting DFmode to TFmode, if the DFmode value is in a GPR
register.
(extenddftf2_fprs): Likewise.
(extenddftf2_internal): Likewise.
(extenddftf2_vsx): Likewise.
(extendsftf2): In the expander, only allow registers, but provide
insns for the combiner to create for stores and loads.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-20 Thread H.J. Lu
On Fri, Nov 20, 2015 at 2:17 PM, Jason Merrill  wrote:
> On 11/20/2015 01:52 PM, H.J. Lu wrote:
>>
>> On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
>>  wrote:
>>>
>>> On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:

 Empty record should be returned and passed the same way in C and C++.
 This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
 defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is defined
 to is_really_empty_class, which returns true for C++ empty classes.  For
 LTO, we stream out a bit to indicate if a record is empty and we store
 it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
 changed to set bitsize to 0 for empty records.  Middle-end and x86
 backend are updated to ignore empty records for parameter passing and
 function value return.  Other targets may need similar changes.
>>>
>>>
>>> Please avoid a new langhook for this and instead claim a bit in
>>> tree_type_common
>>> like for example restrict_flag (double-check it is unused for
>>> non-pointers).
>>
>>
>> There is no bit in tree_type_common I can overload.  restrict_flag is
>> checked for non-pointers to issue an error when it is used on
>> non-pointers:
>>
>>
>> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
>> error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
>> typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
>> qualifiers cannot" "" }
>
>
> The C++ front end only needs to check TYPE_RESTRICT for this purpose on
> front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals could
> handle that specifically if you change TYPE_RESTRICT to only apply to
> pointers.
>

restrict_flag is also checked in this case:

[hjl@gnu-6 gcc]$ cat x.i
struct dummy { };

struct dummy
foo (struct dummy __restrict__ i)
{
  return i;
}
[hjl@gnu-6 gcc]$ gcc -S x.i -Wall
x.i:4:13: error: invalid use of ‘restrict’
 foo (struct dummy __restrict__ i)
 ^
x.i:4:13: error: invalid use of ‘restrict’
[hjl@gnu-6 gcc]$

restrict_flag can't also be used to indicate `i' is an empty record.


H.J.


Go patch committed: Fix minor performance problem in import-archive

2015-11-20 Thread Ian Lance Taylor
GCC PR 68141 points out a minor performance problem in the code in
gcc/go/gofrontend/import-archive.cc: in the Archive_iterator
comparison methods, the type is not passed by reference.  This patch
fixes it.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230685)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-128d5b14b8ab967cb61c01a9b2c596bda7d04c63
+f79db38cf3484b63f7807abef05eecb23e9d0806
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/import-archive.cc
===
--- gcc/go/gofrontend/import-archive.cc (revision 230463)
+++ gcc/go/gofrontend/import-archive.cc (working copy)
@@ -468,11 +468,11 @@ class Archive_iterator
   }
 
   bool
-  operator==(const Archive_iterator p) const
+  operator==(const Archive_iterator& p) const
   { return this->off_ == p->off; }
 
   bool
-  operator!=(const Archive_iterator p) const
+  operator!=(const Archive_iterator& p) const
   { return this->off_ != p->off; }
 
  private:


libgo patch committed: Fix PR 67976 on GCC 5 branch

2015-11-20 Thread Ian Lance Taylor
This patch to libgo fixes PR 67976 on the GCC 5 branch.  The bug was
already fixed on mainline when libgo was upgraded to Go 1.5.

Ian
Index: libgo/go/cmd/cgo/out.go
===
--- libgo/go/cmd/cgo/out.go (revision 229639)
+++ libgo/go/cmd/cgo/out.go (working copy)
@@ -102,10 +102,9 @@ func (p *Package) writeDefs() {
}
 
if !cVars[n.C] {
-   fmt.Fprintf(fm, "extern char %s[];\n", n.C)
-   fmt.Fprintf(fm, "void *_cgohack_%s = %s;\n\n", n.C, n.C)
-
if !*gccgo {
+   fmt.Fprintf(fm, "extern char %s[];\n", n.C)
+   fmt.Fprintf(fm, "void *_cgohack_%s = %s;\n\n", 
n.C, n.C)
fmt.Fprintf(fc, "#pragma cgo_import_static 
%s\n", n.C)
}
 


Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-20 Thread Joseph Myers
On Fri, 20 Nov 2015, David Malcolm wrote:

> The source ranges are verified using the same unit-testing plugin used
> for C expressions.  This leads to a wart, which is that it means having
> a .m test file lurking below gcc.dg/plugin.  A workaround would be to
> create an objc.dg/plugin subdirectory, with a new plugin.exp for testing
> Objective C plugins, and having it reference the existing plugin below
> gcc.dg/plugin.  That seemed like excessive complication, so I went for
> the simpler approach of simply putting the .m file below gcc.dg/plugin.

Have you made sure that this test is quietly not run if the 
--enable-languages configuration does not build the ObjC compiler?

-- 
Joseph S. Myers
jos...@codesourcery.com


libgo patch committed: Handle DW_AT_specification in cgo

2015-11-20 Thread Ian Lance Taylor
The earlydebug work has caused https://gcc.gnu.org/PR68072 when using
the cgo tool.  The patch to fix this in the master sources is
https://golang.org/cl/17151 .  This patch fixes the problem in the
gccgo sources.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline and GCC 5 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230677)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-d52835c9376985f92f35c32af5f1808239981536
+128d5b14b8ab967cb61c01a9b2c596bda7d04c63
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/cgo/gcc.go
===
--- libgo/go/cmd/cgo/gcc.go (revision 230463)
+++ libgo/go/cmd/cgo/gcc.go (working copy)
@@ -490,6 +490,11 @@ func (p *Package) loadDWARF(f *File, nam
name, _ := e.Val(dwarf.AttrName).(string)
typOff, _ := e.Val(dwarf.AttrType).(dwarf.Offset)
if name == "" || typOff == 0 {
+   if e.Val(dwarf.AttrSpecification) != nil {
+   // Since we are reading all the DWARF,
+   // assume we will see the variable 
elsewhere.
+   break
+   }
fatalf("malformed DWARF TagVariable entry")
}
if !strings.HasPrefix(name, "__cgo__") {


Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Jeff Law

On 11/20/2015 11:04 AM, Manuel López-Ibáñez wrote:

On 20 November 2015 at 17:42, Jeff Law  wrote:

So we have to detangle the operand shortening from warning detection. Kai's
idea was to first make the shortening code "pure" in the sense that it would
have no side effects other than to generate the warnings.  Canonicalization
and other transformations would still occur internally, but not be reflected
in the IL.

That was the overall plan and he posted a patch for that.  But that patch
didn't do the due diligence to verify that once the shortening code was made
"pure" that we didn't regress on the quality of the code we generated.


I thought that the original plan was to make the warning code also use
match.pd. That is, that all folding, including FE folding, will be
match.pd-based. Has this changed?

I don't think that's changed.

Detangling the two is the first step. Once detangled we can then look to 
move the warning to a more suitable location -- right now it's in the 
C/C++ front-ends, firing way too early.  Instead it ought to be checked 
in gimple form, after match.pd canonicalization and simplifications.


Jeff


Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-20 Thread Jeff Law

On 11/18/2015 07:15 AM, Nick Clifton wrote:

Hi Guys,

   I recently discovered a bug in the current Redundant Extension
   Elimination optimization.  If the candidate extension instruction
   increases the number of hard registers used, the pass does not check
   to see if these extra registers are live between the definition and
   the extension.

   For example:

 (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
 (insn  45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
 [...]
 (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
 [...]
 (insn  88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))

   (This is on the RL78 target where HImode values occupy two hard
   registers and QImode values only one.  The bug however is generic, not
   RL78 specific).

   The REE pass transforms this into:

 (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
 (insn  45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18
 [...]
 (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
 [...]
 (insn  88 deleted)

   Note how the new set at insn 45 clobbers the value loaded by insn 44
   into r11.

   The patch below is my attempt to fix this.  It is not correct
   however.  I could not work out how to determine if a given hard
   register is live at any point between two insns.  So instead I used
   the liveness information associated with the basic block containing
   the definition.  If one of the extended registers is live or becomes
   live in this block, and this block is not the same block as the one
   containing the extension insn, then I stop the optimization.  This
   works for the RL78 for all of the cases that I could find.

   So ... comments please ?

   Will gcc ever generate a single basic block that contains both the
   definition and the extension instructions, but also where the extra
   hard registers are used for another purpose as well ?

   Tested with no regressions (or fixes) on an x86-pc-linux-gnu target.
   Also tested with no regression and 7 fixes on an rl78-elf target.

Cheers
   Nick

gcc/ChangeLog
2015-11-18  Nick Clifton  

* ree.c (regs_live_between): New function.
 (add_removable_extension): If the extension uses more hard
 registers than the definition then check that the extra hard
 registers are not live between the definition and the
 extension.

@@ -1080,6 +1123,30 @@
  }
  }

+  /* Fourth, if the extended version occupies more registers than
+the original version then make sure that these extra registers
+are not live between the definition and the extension.  */
+  if (HARD_REGNO_NREGS (REGNO (dest), mode)
+ > HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
So this seems like a reasonable place to catch this.  I certainly prefer 
to avoid work later by filtering earlier like you've done.


The problem is with your implementation of regs_live_between.  You're 
assuming we've got straight-line code between the original set and the 
extension.  That's not guaranteed.  Essentially we've got use-def chains 
via the DF framework.  So, in theory, they can be in different blocks 
and arbitrary flow control between them.  In fact, the extension could 
be before the original set in the insn chain (though obviously not for 
execution order).


I think it's reasonable to just punt when we see that the source/dest 
register of the extension are the same and the number of hard regs is 
different.  That's what I'm testing now.


jeff




Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-20 Thread Jason Merrill

On 11/20/2015 01:52 PM, H.J. Lu wrote:

On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
 wrote:

On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:

Empty record should be returned and passed the same way in C and C++.
This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is defined
to is_really_empty_class, which returns true for C++ empty classes.  For
LTO, we stream out a bit to indicate if a record is empty and we store
it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
changed to set bitsize to 0 for empty records.  Middle-end and x86
backend are updated to ignore empty records for parameter passing and
function value return.  Other targets may need similar changes.


Please avoid a new langhook for this and instead claim a bit in tree_type_common
like for example restrict_flag (double-check it is unused for non-pointers).


There is no bit in tree_type_common I can overload.  restrict_flag is
checked for non-pointers to issue an error when it is used on non-pointers:

/export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
qualifiers cannot" "" }


The C++ front end only needs to check TYPE_RESTRICT for this purpose on 
front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals 
could handle that specifically if you change TYPE_RESTRICT to only apply 
to pointers.


Jason




Re: [PATCH] g++.dg/init/vbase1.C and g++.dg/cpp/ucn-1.C

2015-11-20 Thread Mike Stump
On Nov 16, 2015, at 6:02 AM, Renlin Li  wrote:
> On 14/11/15 00:33, David Edelsohn wrote:
>> No RISC architecture can store directly to MEM, so the expected RTL in
>> g++.dg/init/vbase1.C is wrong.  I am adding XFAIL for PowerPC.  This
>> probably should be disabled for ARM and other RISC architectures.
> 
> I observed the same problem in arm.
> 
> This passes for aarch64 and mips as they have zero register to do that. 
> However, other RISC might not have that feature, for example arm and RS6000 
> in this  case.

I fixed this with the below patch.  Tested on x86_64 linux, x86_64 darwin and 
my port.  If you want to list aarch64/arn and mips, please do.


* g++.dg/init/vbase1.C: Only run on x86_64-*-* as this testcase
isn't portable.

Index: g++.dg/init/vbase1.C
===
--- g++.dg/init/vbase1.C(revision 230675)
+++ g++.dg/init/vbase1.C(working copy)
@@ -42,4 +42,4 @@ int main(int, char**)
 // Verify that the SubB() mem-initializer is storing 0 directly into
 // this->D.whatever rather than into a stack temp that is then copied into the
 // base field.
-// { dg-final { scan-rtl-dump "set 
\[^\n\]*\n\[^\n\]*this\[^\n\]*\n\[^\n\]*const_int 0" "expand" { xfail { 
powerpc*-*-* } } } }
+// { dg-final { scan-rtl-dump "set 
\[^\n\]*\n\[^\n\]*this\[^\n\]*\n\[^\n\]*const_int 0" "expand" { target { 
x86_64-*-* } } } }



Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-20 Thread Eric Botcazou
> I would hazard a guess that the authors simply didn't consider the
> multi-hard reg case.  Essentially if the original set reached an
> extension, then obviously the original set got there unharmed and the
> extended destination should reach as well -- except that doesn't apply
> to multi-word hard regs.

This pass started as a Zero-Extension Elimination pass for x86-64 and was only 
considering implicit SI->DI extensions initially.  The algorithm was tailored 
to this specific pattern and I agree that the pass should be reimplemented.

-- 
Eric Botcazou


Re: [PATCH 3b/4][AArch64] Add scheduling model for Exynos M1

2015-11-20 Thread Evandro Menezes

On 11/20/2015 11:17 AM, James Greenhalgh wrote:

On Tue, Nov 10, 2015 at 11:54:00AM -0600, Evandro Menezes wrote:

2015-11-10  Evandro Menezes 

gcc/

* config/aarch64/aarch64-cores.def: Use the Exynos M1 sched model.
* config/aarch64/aarch64.md: Include "exynos-m1.md".
* config/arm/arm-cores.def: Use the Exynos M1 sched model.
* config/arm/arm.md: Include "exynos-m1.md".
* config/arm/arm-tune.md: Regenerated.
* config/arm/exynos-m1.md: New file.

This patch adds the scheduling model for Exynos M1.  It depends on
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01257.html

Bootstrapped on arm-unknown-linux-gnueabihf, aarch64-unknown-linux-gnu.

Please, commit if it's alright.



 From 0b7b6d597e5877c78c4d88e0d4491858555a5364 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 9 Nov 2015 17:18:52 -0600
Subject: [PATCH 2/2] [AArch64] Add scheduling model for Exynos M1

gcc/
* config/aarch64/aarch64-cores.def: Use the Exynos M1 sched model.
* config/aarch64/aarch64.md: Include "exynos-m1.md".

These changes are fine.


* config/arm/arm-cores.def: Use the Exynos M1 sched model.
* config/arm/arm.md: Include "exynos-m1.md".
* config/arm/arm-tune.md: Regenerated.

These changes need an ack from an ARM reviewer.


* config/arm/exynos-m1.md: New file.

I have a few comments on this model.


+;; The Exynos M1 core is modeled as a triple issue pipeline that has
+;; the following functional units.
+
+(define_automaton "exynos_m1_gp")
+(define_automaton "exynos_m1_ls")
+(define_automaton "exynos_m1_fp")
+
+;; 1.  Two pipelines for simple integer operations: A, B
+;; 2.  One pipeline for simple or complex integer operations: C
+
+(define_cpu_unit "em1_xa, em1_xb, em1_xc" "exynos_m1_gp")
+
+(define_reservation "em1_alu" "(em1_xa | em1_xb | em1_xc)")
+(define_reservation "em1_c" "em1_xc")

Is this extra reservation useful, can we not just use em1_xc directly?


+;; 3.  Two asymmetric pipelines for Neon and FP operations: F0, F1
+
+(define_cpu_unit "em1_f0, em1_f1" "exynos_m1_fp")
+
+(define_reservation "em1_fmac" "em1_f0")
+(define_reservation "em1_fcvt" "em1_f0")
+(define_reservation "em1_nalu" "(em1_f0 | em1_f1)")
+(define_reservation "em1_nalu0" "em1_f0")
+(define_reservation "em1_nalu1" "em1_f1")
+(define_reservation "em1_nmisc" "em1_f0")
+(define_reservation "em1_ncrypt" "em1_f0")
+(define_reservation "em1_fadd" "em1_f1")
+(define_reservation "em1_fvar" "em1_f1")
+(define_reservation "em1_fst" "em1_f1")

Same comment here, does this not just obfuscate the interaction between
instruction classes in the description. I'm not against doing it this way
if you prefer, but it would seem to reduce readability to me. I think there
is also an argument that this increases readability, so it is your choice.


+
+;; 4.  One pipeline for branch operations: BX
+
+(define_cpu_unit "em1_bx" "exynos_m1_gp")
+
+(define_reservation "em1_br" "em1_bx")
+

And again?


+;; 5.  One AGU for loads: L
+;; One AGU for stores and one pipeline for stores: S, SD
+
+(define_cpu_unit "em1_lx" "exynos_m1_ls")
+(define_cpu_unit "em1_sx, em1_sd" "exynos_m1_ls")
+
+(define_reservation "em1_ld" "em1_lx")
+(define_reservation "em1_st" "(em1_sx + em1_sd)")
+
+;; Common occurrences
+(define_reservation "em1_sfst" "(em1_fst + em1_st)")
+(define_reservation "em1_lfst" "(em1_fst + em1_ld)")
+
+;; Branches
+;;
+;; No latency as there is no result
+;; TODO: Unconditional branches use no units;
+;; conditional branches add the BX unit;
+;; indirect branches add the C unit.
+(define_insn_reservation "exynos_m1_branch" 0
+  (and (eq_attr "tune" "exynosm1")
+   (eq_attr "type" "branch"))
+  "em1_br")
+
+(define_insn_reservation "exynos_m1_call" 1
+  (and (eq_attr "tune" "exynosm1")
+   (eq_attr "type" "call"))
+  "em1_alu")
+
+;; Basic ALU
+;;
+;; Simple ALU without shift, non-predicated
+(define_insn_reservation "exynos_m1_alu" 1
+  (and (eq_attr "tune" "exynosm1")
+   (and (not (eq_attr "predicated" "yes"))

(and (eq_attr "predicated" "no")) ?

Likewise throughout the file? Again this is your choice.

This is OK from the AArch64 side, let me know if you plan to change any
of the above, otherwise I'll commit it (or someone else can commit it)
after I see an OK from an ARM reviewer.


The naming choices were made more for uniformity's sake and to conform 
with the processor documentation.


The bit about "not... yes" <=> "no" is a goof up. :-s

I'm waiting for Kyrill's patch adding FCCMP to be approved to amend this 
patch.  However, I'd appreciate if it'd be approved and committed before 
then, as soon the ARM stuff is okayed.


Thank you,

--
Evandro Menezes



[SPARC] Minor cleanup

2015-11-20 Thread Eric Botcazou
This just moves the VIS3 mulx patterns to a more appropriate place.

Tested on SPARC/Solaris, applied on the mainline.


2015-11-20  Eric Botcazou  

* config/sparc/sparc.md (umulxhi_vis): Move around.
(*umulxhi_sp64): Likewise.
(umulxhi_v8plus): Likewise.
(xmulx_vis): Likewise.
(*xmulx_sp64): Likewise.
(xmulx_v8plus): Likewise.
(xmulxhi_vis): Likewise.
(*xmulxhi_sp64): Likewise.
(xmulxhi_v8plus): Likewise.

-- 
Eric BotcazouIndex: config/sparc/sparc.md
===
--- config/sparc/sparc.md	(revision 230589)
+++ config/sparc/sparc.md	(working copy)
@@ -609,7 +609,7 @@ (define_insn "*cmptf_fp"
   return "fcmpq\t%1, %2";
 }
   [(set_attr "type" "fpcmp")])
-
+
 ;; Next come the scc insns.
 
 ;; Note that the boolean result (operand 0) takes on DImode
@@ -647,8 +647,6 @@ (define_expand "cstore4"
   "TARGET_FPU"
   { if (emit_scc_insn (operands)) DONE; else FAIL; })
 
-
-
 ;; Seq_special[_xxx] and sne_special[_xxx] clobber the CC reg, because they
 ;; generate addcc/subcc instructions.
 
@@ -1137,7 +1135,7 @@ (define_split
 			 (match_dup 0)))]
   "")
 
-
+
 ;; These control RTL generation for conditional jump insns
 
 (define_expand "cbranchcc4"
@@ -1318,7 +1316,6 @@ (define_insn "*cbcond_sp64"
 
 ;; There are no 32 bit brreg insns.
 
-;; XXX
 (define_insn "*normal_int_branch_sp64"
   [(set (pc)
 	(if_then_else (match_operator 0 "v9_register_compare_operator"
@@ -1335,7 +1332,6 @@ (define_insn "*normal_int_branch_sp64"
   [(set_attr "type" "branch")
(set_attr "branch_type" "reg")])
 
-;; XXX
 (define_insn "*inverted_int_branch_sp64"
   [(set (pc)
 	(if_then_else (match_operator 0 "v9_register_compare_operator"
@@ -2730,8 +2726,6 @@ (define_expand "movcc"
   DONE;
 })
 
-;; Conditional move define_insns
-
 (define_insn "*mov_cc_v9"
   [(set (match_operand:I 0 "register_operand" "=r")
 	(if_then_else:I (match_operator 1 "comparison_operator"
@@ -2896,7 +2890,7 @@ (define_insn_and_split "*movtf_cc_reg_sp
 }
   [(set_attr "length" "2")])
 
-
+
 ;; Zero-extension instructions
 
 ;; These patterns originally accepted general_operands, however, slightly
@@ -4043,7 +4037,6 @@ (define_insn "*muldi3_sp64"
   [(set_attr "type" "imul")])
 
 ;; V8plus wide multiply.
-;; XXX
 (define_insn "muldi3_v8plus"
   [(set (match_operand:DI 0 "register_operand" "=r,h")
 	(mult:DI (match_operand:DI 1 "arith_operand" "%r,0")
@@ -4094,7 +4087,6 @@ (define_expand "mulsidi3"
 
 ;; V9 puts the 64-bit product in a 64-bit register.  Only out or global
 ;; registers can hold 64-bit values in the V8plus environment.
-;; XXX
 (define_insn "mulsidi3_v8plus"
   [(set (match_operand:DI 0 "register_operand" "=h,r")
 	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r,r"))
@@ -4107,7 +4099,6 @@ (define_insn "mulsidi3_v8plus"
   [(set_attr "type" "multi")
(set_attr "length" "2,3")])
 
-;; XXX
 (define_insn "const_mulsidi3_v8plus"
   [(set (match_operand:DI 0 "register_operand" "=h,r")
 	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r,r"))
@@ -4120,7 +4111,6 @@ (define_insn "const_mulsidi3_v8plus"
   [(set_attr "type" "multi")
(set_attr "length" "2,3")])
 
-;; XXX
 (define_insn "*mulsidi3_sp32"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
@@ -4148,7 +4138,6 @@ (define_insn "*mulsidi3_sp64"
 
 ;; Extra pattern, because sign_extend of a constant isn't valid.
 
-;; XXX
 (define_insn "const_mulsidi3_sp32"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
@@ -4203,7 +4192,6 @@ (define_expand "smulsi3_highpart"
 }
 })
 
-;; XXX
 (define_insn "smulsi3_highpart_v8plus"
   [(set (match_operand:SI 0 "register_operand" "=h,r")
 	(truncate:SI
@@ -4219,7 +4207,6 @@ (define_insn "smulsi3_highpart_v8plus"
(set_attr "length" "2")])
 
 ;; The combiner changes TRUNCATE in the previous pattern to SUBREG.
-;; XXX
 (define_insn ""
   [(set (match_operand:SI 0 "register_operand" "=h,r")
 	(subreg:SI
@@ -4236,7 +4223,6 @@ (define_insn ""
   [(set_attr "type" "multi")
(set_attr "length" "2")])
 
-;; XXX
 (define_insn "const_smulsi3_highpart_v8plus"
   [(set (match_operand:SI 0 "register_operand" "=h,r")
 	(truncate:SI
@@ -4251,7 +4237,6 @@ (define_insn "const_smulsi3_highpart_v8p
   [(set_attr "type" "multi")
(set_attr "length" "2")])
 
-;; XXX
 (define_insn "*smulsi3_highpart_sp32"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(truncate:SI
@@ -4263,7 +4248,6 @@ (define_insn "*smulsi3_highpart_sp32"
   [(set_attr "type" "multi")
(set_attr "length" "2")])
 
-;; XXX
 (define_insn "const_smulsi3_highpart"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(truncate:SI
@@ -4301,7 +4285,6 @@ (define_expand "umulsidi3"
 }
 })
 
-;; XXX
 (define_insn "umulsidi3_v8plus"
   [(set (match_operand:DI 0 "register_operan

Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-20 Thread Jeff Law

On 11/20/2015 02:19 AM, Nick Clifton wrote:

Hi Jeff,


The code there would solve this problem, but the approach is is overly
cautious, since it disables the optimization for all extensions that
increase the number of hard registers used.  Some of these will be
viable candidates, provided that the extra hard registers are no used.
(This is certainly true for the RL78, where the (patched) optimization
does improve code, even though the widening does use extra registers).

Nick -- can you pass along your testcode?


Sure - this is for the RL78 toolchain.  In theory the problem is
generic, but I have not tested other toolchains.

Compile the gcc.c-torture/execute/pr42833.c or
gcc.c-torture/execute/strct-stdarg-1.c tests at -O1 or higher with -free
also specified on the command line.  Without -free these tests pass.
With -free they fail.
So this looks like a pretty fundamental failing in ree.c and AFAICT has 
been there since day 1.  Thankfully, I don't think it's likely to 
trigger all that often.


It's useful to ponder two cases.  One where we're going to add a copy 
and one where we don't.  I added the copy case during the 4.9 cycle as 
an extension of existing code.  It only fires in easy to analyze 
circumstances -- when the original set and the extension are in the same 
basic block, but have different destination registers.


In the copy case.  The code checks that the destination of the 
*extension* is neither used nor set between the original set and the 
extension.  So in the case of multi-word hard reg, we'll verify that the 
entire multi-word hard reg survives from the original insn to the 
extension.  That avoids this problem in cases where a copy is needed.


However, in the case where no copy is needed, no such checks are made. 
In fact the checks could be fairly painful as the original set and the 
extension can be in different blocks with arbitrary flow between them.


I would hazard a guess that the authors simply didn't consider the 
multi-hard reg case.  Essentially if the original set reached an 
extension, then obviously the original set got there unharmed and the 
extended destination should reach as well -- except that doesn't apply 
to multi-word hard regs.


Given the expense in checking all the potential paths between the 
original set and the extension, I think just disallowing this case ought 
to be is the best way to go.  Ideally we'll filter out that case before 
we get to combine_set_extension.  But if we can't, it's easy enough to 
test there.


Jeff


libgo patch committed: use correct tool dir with gccgo

2015-11-20 Thread Ian Lance Taylor
This patch from Lynn Boger fixes the go tool shipped with gccgo to use
the correct tool directory.  It also fixes the 'go tool' output to
only list known tools.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline and GCC 5 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230657)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-dfa74d975884f363c74d6a66a37b1703093fdba6
+d52835c9376985f92f35c32af5f1808239981536
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/pkg.go
===
--- libgo/go/cmd/go/pkg.go  (revision 230463)
+++ libgo/go/cmd/go/pkg.go  (working copy)
@@ -785,7 +785,11 @@ func (p *Package) load(stk *importStack,
if goTools[p.ImportPath] == toTool {
// This is for 'go tool'.
// Override all the usual logic and force it into the 
tool directory.
-   p.target = filepath.Join(gorootPkg, "tool", full)
+   if buildContext.Compiler == "gccgo" {
+   p.target = filepath.Join(runtime.GCCGOTOOLDIR, 
elem)
+   } else {
+   p.target = filepath.Join(gorootPkg, "tool", 
full)
+   }
}
if p.target != "" && buildContext.GOOS == "windows" {
p.target += ".exe"
Index: libgo/go/cmd/go/tool.go
===
--- libgo/go/cmd/go/tool.go (revision 230463)
+++ libgo/go/cmd/go/tool.go (working copy)
@@ -39,6 +39,12 @@ var (
toolN bool
 )
 
+// List of go tools found in the gccgo tool directory.
+// Other binaries could be in the same directory so don't
+// show those with the 'go tool' command.
+
+var gccgoTools = []string{"cgo", "fix", "cover", "godoc", "vet"}
+
 func init() {
cmdTool.Flag.BoolVar(&toolN, "n", false, "")
 }
@@ -146,6 +152,21 @@ func listTools() {
if toolIsWindows && strings.HasSuffix(name, 
toolWindowsExtension) {
name = name[:len(name)-len(toolWindowsExtension)]
}
-   fmt.Println(name)
+
+   // The tool directory used by gccgo will have other binaries
+   // in additions to go tools.  Only display go tools for this 
list.
+
+   if buildContext.Compiler == "gccgo" {
+   for _, tool := range gccgoTools {
+   if tool == name {
+   fmt.Println(name)
+   }
+   }
+   } else {
+
+   // Not gccgo, list all the tools found in this dir
+
+   fmt.Println(name)
+   }
}
 }


Re: RFA: PATCH to match.pd for c++/68385

2015-11-20 Thread Jason Merrill

On 11/20/2015 02:58 PM, Jason Merrill wrote:

OK if testing passes?


Which it did.

Jason



[PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-20 Thread David Malcolm
The attached patch fixes some more places in c/c-parser.c where the
src_range field of a c_expr was being left uninitialized, this time for
various Objective C constructs.

The source ranges are verified using the same unit-testing plugin used
for C expressions.  This leads to a wart, which is that it means having
a .m test file lurking below gcc.dg/plugin.  A workaround would be to
create an objc.dg/plugin subdirectory, with a new plugin.exp for testing
Objective C plugins, and having it reference the existing plugin below
gcc.dg/plugin.  That seemed like excessive complication, so I went for
the simpler approach of simply putting the .m file below gcc.dg/plugin.

I manually verified that this fixes the specific valgrind warnings
reported in the PR, and visually inspected the code for
objective-c-specific instances of uninitialized c_expr src_range values.

Bootstrapped®rtested on x86_64-pc-linux-gnu; adds 15 PASS results to
gcc.sum.

OK for trunk?

>From afdae8b15f71164d0d05e790078519b38bd674a4 Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Fri, 20 Nov 2015 11:12:47 -0500
Subject: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

gcc/c/ChangeLog:
	PR objc/68438
	* c-parser.c (c_parser_postfix_expression): Set up source ranges
	for various Objective-C constructs: Class.name syntax,
	@selector(), @protocol, @encode(), and [] message syntax.

gcc/testsuite/ChangeLog:
	PR objc/68438
	* gcc.dg/plugin/diagnostic-test-expressions-2.m: New test file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above file.
---
 gcc/c/c-parser.c   | 17 +++-
 .../gcc.dg/plugin/diagnostic-test-expressions-2.m  | 94 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp |  3 +-
 3 files changed, 110 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-2.m

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7b10764..18e9957 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7338,10 +7338,13 @@ c_parser_postfix_expression (c_parser *parser)
 		expr.value = error_mark_node;
 		break;
 	  }
-	component = c_parser_peek_token (parser)->value;
+	c_token *component_tok = c_parser_peek_token (parser);
+	component = component_tok->value;
+	location_t end_loc = component_tok->get_finish ();
 	c_parser_consume_token (parser);
 	expr.value = objc_build_class_component_ref (class_name, 
 			 component);
+	set_c_expr_source_range (&expr, loc, end_loc);
 	break;
 	  }
 	default:
@@ -7816,9 +7819,11 @@ c_parser_postfix_expression (c_parser *parser)
 	}
 	  {
 	tree sel = c_parser_objc_selector_arg (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_selector_expr (loc, sel);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_PROTOCOL:
@@ -7839,9 +7844,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  {
 	tree id = c_parser_peek_token (parser)->value;
 	c_parser_consume_token (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_protocol_expr (id);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_ENCODE:
@@ -7860,11 +7867,13 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
 	  break;
 	}
-	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
- "expected %<)%>");
 	  {
+	location_t close_loc = c_parser_peek_token (parser)->location;
+	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+ "expected %<)%>");
 	tree type = groktypename (t1, NULL, NULL);
 	expr.value = objc_build_encode_expr (type);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_GENERIC:
@@ -7907,9 +7916,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_consume_token (parser);
 	  receiver = c_parser_objc_receiver (parser);
 	  args = c_parser_objc_message_args (parser);
+	  location_t close_loc = c_parser_peek_token (parser)->location;
 	  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
  "expected %<]%>");
 	  expr.value = objc_build_message_expr (receiver, args);
+	  set_c_expr_source_range (&expr, loc, close_loc);
 	  break;
 	}
   /* Else fall through to report error.  */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-2.m b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-2.m
new file mode 100644
index 000..ed7aca3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-2.m
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This file is similar to diagnostic

[PATCH] Fix debug info related ICE when inlining back fnsplit function (PR debug/66432)

2015-11-20 Thread Jakub Jelinek
Hi!

This patch fixes ICE where a parameter mentioned in a debug source bind
has its VLA type mistakenly remapped and that leads to inconsistent type
between the PARM_DECL and SSA_NAMEs derived from it.

The patch Tom posted for this can't work, because we assume that the
s=> value as well as debug_decl_args decl in that case is DECL_ORIGIN
of the PARM_DECL.  But, in this case we are replacing the DECL_ORIGIN
PARM_DECL immediately with a DEBUG_EXPR_DECL anyway, so there is no
point remapping the var we don't own (it could be in a different function
etc.).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5.3?

2015-11-20  Jakub Jelinek  

PR debug/66432
* tree-inline.c (copy_debug_stmt): If
gimple_debug_source_bind_get_value is DECL_ORIGIN of a PARM_DECL
in decl_debug_args, don't call remap_gimple_op_r on it.

* gcc.dg/debug/pr66432.c: New test.

--- gcc/tree-inline.c.jj2015-11-14 19:36:03.0 +0100
+++ gcc/tree-inline.c   2015-11-20 17:17:00.632082622 +0100
@@ -2864,8 +2864,6 @@ copy_debug_stmt (gdebug *stmt, copy_body
   else if (gimple_debug_source_bind_p (stmt))
 {
   gimple_debug_source_bind_set_var (stmt, t);
-  walk_tree (gimple_debug_source_bind_get_value_ptr (stmt),
-remap_gimple_op_r, &wi, NULL);
   /* When inlining and source bind refers to one of the optimized
 away parameters, change the source bind into normal debug bind
 referring to the corresponding DEBUG_EXPR_DECL that should have
@@ -2889,7 +2887,10 @@ copy_debug_stmt (gdebug *stmt, copy_body
break;
  }
}
-   }  
+   }
+  if (gimple_debug_source_bind_p (stmt))
+   walk_tree (gimple_debug_source_bind_get_value_ptr (stmt),
+  remap_gimple_op_r, &wi, NULL);
 }
 
   processing_debug_stmt = 0;
--- gcc/testsuite/gcc.dg/debug/pr66432.c.jj 2015-11-20 17:40:44.589171083 
+0100
+++ gcc/testsuite/gcc.dg/debug/pr66432.c2015-11-20 17:38:48.0 
+0100
@@ -0,0 +1,19 @@
+/* PR debug/66432 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+extern void baz (const char *, const char *) __attribute__ ((__noreturn__));
+
+void
+foo (int x, int y[x][x])
+{
+  if (x < 2)
+baz ("", "");
+}
+
+void
+bar (void)
+{
+  int z[2][2] = { { 1, 2 }, { 3, 4 } };
+  foo (2, z);
+}

Jakub


Re: [PATCH] (Partial) Implementation of simplificaiton of CSHIFT

2015-11-20 Thread Steve Kargl
On Thu, Nov 19, 2015 at 04:58:36PM -0800, Steve Kargl wrote:
> 
> 2015-11-19  Steven G. Kargl  
> 
>   * intrinsic.h: Prototype for gfc_simplify_cshift
>   * intrinsic.c (add_functions): Use gfc_simplify_cshift.
>   * simplify.c (gfc_simplify_cshift): Implement simplification of CSHIFT.
>   (gfc_simplify_spread): Remove a FIXME and add error condition.
>  
> 2015-11-19  Steven G. Kargl  
> 
>   * gfortran.dg/simplify_cshift_1.f90: New test.
> 

I've attached an updated patch.  The changes consists of
1) better/more comments
2) re-organize code to reduce copying of the array.
3) add optimization for a left/right shift that 
   returns the original array.
4) Don't leak memory.

-- 
Steve
Index: gcc/fortran/intrinsic.c
===
--- gcc/fortran/intrinsic.c	(revision 230585)
+++ gcc/fortran/intrinsic.c	(working copy)
@@ -1659,9 +1659,11 @@ add_functions (void)
 
   make_generic ("count", GFC_ISYM_COUNT, GFC_STD_F95);
 
-  add_sym_3 ("cshift", GFC_ISYM_CSHIFT, CLASS_TRANSFORMATIONAL, ACTUAL_NO, BT_REAL, dr, GFC_STD_F95,
-	 gfc_check_cshift, NULL, gfc_resolve_cshift,
-	 ar, BT_REAL, dr, REQUIRED, sh, BT_INTEGER, di, REQUIRED,
+  add_sym_3 ("cshift", GFC_ISYM_CSHIFT, CLASS_TRANSFORMATIONAL, ACTUAL_NO,
+	 BT_REAL, dr, GFC_STD_F95,
+	 gfc_check_cshift, gfc_simplify_cshift, gfc_resolve_cshift,
+	 ar, BT_REAL, dr, REQUIRED,
+	 sh, BT_INTEGER, di, REQUIRED,
 	 dm, BT_INTEGER, ii, OPTIONAL);
 
   make_generic ("cshift", GFC_ISYM_CSHIFT, GFC_STD_F95);
Index: gcc/fortran/intrinsic.h
===
--- gcc/fortran/intrinsic.h	(revision 230585)
+++ gcc/fortran/intrinsic.h	(working copy)
@@ -271,6 +271,7 @@ gfc_expr *gfc_simplify_conjg (gfc_expr *
 gfc_expr *gfc_simplify_cos (gfc_expr *);
 gfc_expr *gfc_simplify_cosh (gfc_expr *);
 gfc_expr *gfc_simplify_count (gfc_expr *, gfc_expr *, gfc_expr *);
+gfc_expr *gfc_simplify_cshift (gfc_expr *, gfc_expr *, gfc_expr *);
 gfc_expr *gfc_simplify_dcmplx (gfc_expr *, gfc_expr *);
 gfc_expr *gfc_simplify_dble (gfc_expr *);
 gfc_expr *gfc_simplify_digits (gfc_expr *);
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c	(revision 230585)
+++ gcc/fortran/simplify.c	(working copy)
@@ -1789,6 +1789,99 @@ gfc_simplify_count (gfc_expr *mask, gfc_
 
 
 gfc_expr *
+gfc_simplify_cshift (gfc_expr *array, gfc_expr *shift, gfc_expr *dim)
+{
+  int dm;
+  gfc_expr *a;
+
+  /* DIM is only useful for rank > 1, but deal with it here as one can
+ set DIM = 1 for rank = 1.  */
+  if (dim)
+{
+  if (!gfc_is_constant_expr (dim))
+	  return NULL;
+  dm = mpz_get_si (dim->value.integer);
+}
+  else
+dm = 1;
+
+  /* Copy array into 'a', simplify it, and then test for a constant array.
+ Unexpected expr_type cause an ICE.   */
+  switch (array->expr_type)
+{
+  case EXPR_VARIABLE:
+  case EXPR_ARRAY:
+	a = gfc_copy_expr (array);
+	gfc_simplify_expr (a, 0);
+	if (!is_constant_array_expr (a))
+	  {
+	gfc_free_expr (a);
+	return NULL;
+	  }
+	break;
+  default:
+	gcc_unreachable ();
+}
+
+  if (a->rank == 1)
+{
+  gfc_constructor *ca, *cr;
+  gfc_expr *result;
+  mpz_t size;
+  int i, j, shft, sz;
+
+  if (!gfc_is_constant_expr (shift))
+	{
+	  gfc_free_expr (a);
+	  return NULL;
+	}
+
+  shft = mpz_get_si (shift->value.integer);
+
+  /*  Case (i):  If ARRAY has rank one, element i of the result is
+	  ARRAY (1 + MODULO (i + SHIFT ­ 1, SIZE (ARRAY))).  */
+
+  mpz_init (size);
+  gfc_array_size (a, &size);
+  sz = mpz_get_si (size);
+  mpz_clear (size);
+
+  /* Special case: rank 1 array with no shift or a complete shift to
+	 the original order!  */
+  if (shft == 0 || shft == sz || shft == 1 - sz)
+	return a;
+
+  /* Adjust shft to deal with right or left shifts. */
+  shft = shft < 0 ? 1 - shft : shft;
+
+  result = gfc_copy_expr (a);
+  cr = gfc_constructor_first (result->value.constructor);
+  for (i = 0; i < sz; i++, cr = gfc_constructor_next (cr))
+	{
+	  j = (i + shft) % sz;
+	  ca = gfc_constructor_first (a->value.constructor);
+	  while (j-- > 0)
+	ca = gfc_constructor_next (ca);
+	  cr->expr = gfc_copy_expr (ca->expr);
+	}
+
+  gfc_free_expr (a);
+  return result;
+}
+  else
+{
+  /* FIXME: Deal with rank > 1 arrays.  For now, don't leak memory
+	 and exit with an error message.  */
+  gfc_free_expr (a);
+  gfc_error ("Simplification of CSHIFT with an array with rank > 1 "
+	 "no yet support");
+}
+
+  return NULL;
+}
+
+
+gfc_expr *
 gfc_simplify_dcmplx (gfc_expr *x, gfc_expr *y)
 {
   return simplify_cmplx ("DCMPLX", x, y, gfc_default_double_kind);
@@ -6089,10 +6182,11 @@ gfc_simplify_spread (gfc_expr *source, g
 	}
 }
   else
-/* FIXME: Returning here avoids a regression in array_simplify_

[PATCH] Fix up reduction-1{1,2} testcases (PR middle-end/68221)

2015-11-20 Thread Jakub Jelinek
Hi!

If C/C++ array section reductions have non-zero (positive) bias, it is
implemented by declaring a smaller private array and subtracting the bias
from the start of the private array (because valid code may only dereference
elements from bias onwards).  But, this isn't something that is kosher in
C/C++ pointer arithmetics and the alias oracle seems to get upset on that.
So, the following patch fixes that by performing the subtraction on integral
type instead of p+ -bias.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2015-11-20  Jakub Jelinek  

PR middle-end/68221
* omp-low.c (lower_rec_input_clauses): If C/C++ array reduction
has non-zero bias, subtract it in integer type instead of
pointer plus of negated bias.

* testsuite/libgomp.c/reduction-11.c: Remove xfail.
* testsuite/libgomp.c/reduction-12.c: Likewise.
* testsuite/libgomp.c++/reduction-11.C: Likewise.
* testsuite/libgomp.c++/reduction-12.C: Likewise.

--- gcc/omp-low.c.jj2015-11-20 12:56:17.0 +0100
+++ gcc/omp-low.c   2015-11-20 13:44:29.080374051 +0100
@@ -,11 +,13 @@ lower_rec_input_clauses (tree clauses, g
 
  if (!integer_zerop (bias))
{
- bias = fold_convert_loc (clause_loc, sizetype, bias);
- bias = fold_build1_loc (clause_loc, NEGATE_EXPR,
- sizetype, bias);
- x = fold_build2_loc (clause_loc, POINTER_PLUS_EXPR,
-  TREE_TYPE (x), x, bias);
+ bias = fold_convert_loc (clause_loc, pointer_sized_int_node,
+  bias);
+ yb = fold_convert_loc (clause_loc, pointer_sized_int_node,
+x);
+ yb = fold_build2_loc (clause_loc, MINUS_EXPR,
+   pointer_sized_int_node, yb, bias);
+ x = fold_convert_loc (clause_loc, TREE_TYPE (x), yb);
  yb = create_tmp_var (ptype, name);
  gimplify_assign (yb, x, ilist);
  x = yb;
--- libgomp/testsuite/libgomp.c/reduction-11.c.jj   2015-11-05 
16:03:53.0 +0100
+++ libgomp/testsuite/libgomp.c/reduction-11.c  2015-11-20 13:38:24.448520879 
+0100
@@ -1,4 +1,4 @@
-/* { dg-do run { xfail *-*-* } } */
+/* { dg-do run } */
 
 char z[10] = { 0 };
 
--- libgomp/testsuite/libgomp.c/reduction-12.c.jj   2015-11-05 
16:03:53.0 +0100
+++ libgomp/testsuite/libgomp.c/reduction-12.c  2015-11-20 13:38:34.565378078 
+0100
@@ -1,4 +1,4 @@
-/* { dg-do run { xfail *-*-* } } */
+/* { dg-do run } */
 
 struct A { int t; };
 struct B { char t; };
--- libgomp/testsuite/libgomp.c++/reduction-11.C.jj 2015-11-05 
16:03:53.0 +0100
+++ libgomp/testsuite/libgomp.c++/reduction-11.C2015-11-20 
13:37:53.921951766 +0100
@@ -1,4 +1,4 @@
-// { dg-do run { xfail *-*-* } }
+// { dg-do run }
 
 char z[10] = { 0 };
 
--- libgomp/testsuite/libgomp.c++/reduction-12.C.jj 2015-11-05 
16:03:53.0 +0100
+++ libgomp/testsuite/libgomp.c++/reduction-12.C2015-11-20 
13:38:03.983809741 +0100
@@ -1,4 +1,4 @@
-// { dg-do run { xfail *-*-* } }
+// { dg-do run }
 
 template 
 struct A

Jakub


[PATCH] Fix GC ICE during simd clone creation (PR middle-end/68339)

2015-11-20 Thread Jakub Jelinek
Hi!

node->get_body () can run various IPA passes and ggc_collect in them, so
it is undesirable to hold pointers to GC memory in automatic vars over it.
While I could store those vars (clone_info, clone and id) into special GTY
vars just to avoid collecting them, it seems easier to call node->get_body
() earlier.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
and 5 branch.

2015-11-20  Jakub Jelinek  

PR middle-end/68339
* omp-low.c (expand_simd_clones): Call node->get_body () before
allocating stuff in GC.

* gcc.dg/vect/pr68339.c: New test.

--- gcc/omp-low.c.jj2015-11-18 11:19:19.0 +0100
+++ gcc/omp-low.c   2015-11-20 12:56:17.075193601 +0100
@@ -18319,6 +18319,10 @@ expand_simd_clones (struct cgraph_node *
   && TYPE_ARG_TYPES (TREE_TYPE (node->decl)) == NULL_TREE)
 return;
 
+  /* Call this before creating clone_info, as it might ggc_collect.  */
+  if (node->definition && node->has_gimple_body_p ())
+node->get_body ();
+
   do
 {
   /* Start with parsing the "omp declare simd" attribute(s).  */
--- gcc/testsuite/gcc.dg/vect/pr68339.c.jj  2015-11-20 13:10:47.756905395 
+0100
+++ gcc/testsuite/gcc.dg/vect/pr68339.c 2015-11-20 13:08:13.0 +0100
@@ -0,0 +1,17 @@
+/* PR middle-end/68339 */
+/* { dg-do compile } */
+/* { dg-options "--param ggc-min-heapsize=0 --param ggc-min-expand=0 
-fopenmp-simd" } */
+
+#pragma omp declare simd notinbranch
+int
+f1 (int x)
+{
+  return x;
+}
+
+#pragma omp declare simd notinbranch
+int
+f2 (int x)
+{
+  return x;
+}

Jakub


RFA: PATCH to match.pd for c++/68385

2015-11-20 Thread Jason Merrill
In this bug, we hit the (A & sign-bit) != 0 -> A < 0 transformation. 
Because of delayed folding, the operands aren't fully folded yet, so we 
have NOP_EXPRs around INTEGER_CSTs, and so calling wi::only_sign_bit_p 
ICEs.  We've been seeing several similar bugs, where code calls 
integer_zerop and therefore assumes that they have an INTEGER_CST, but 
in fact integer_zerop does STRIP_NOPS.


This patch changes the pattern to only match if the operand is actually 
an INTEGER_CST.  Alternatively we could call tree_strip_nop_conversions 
on the operand, but I would expect that to have issues when the 
conversion changes the signedness of the type.


OK if testing passes?
commit e7b45ed6775c88c6d48c5863738ba0db2e38fc5e
Author: Jason Merrill 
Date:   Fri Nov 20 14:40:35 2015 -0500

	PR c++/68385

	* match.pd: Don't assume that integer_pow2p implies INTEGER_CST.

diff --git a/gcc/match.pd b/gcc/match.pd
index e86cc8b..1981ae7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2232,6 +2232,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& (TYPE_PRECISION (TREE_TYPE (@0))
 	   == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0
&& element_precision (@2) >= element_precision (@0)
+   && TREE_CODE (@1) == INTEGER_CST
&& wi::only_sign_bit_p (@1, element_precision (@0)))
(with { tree stype = signed_type_for (TREE_TYPE (@0)); }
 (ncmp (convert:stype @0) { build_zero_cst (stype); })


Re: [PATCH, VECTOR ABI] Add __attribute__((__simd__)) to GCC.

2015-11-20 Thread Jeff Law

On 11/20/2015 05:15 AM, Kyrill Tkachov wrote:

Hi Kirill,

On 18/11/15 14:11, Kirill Yukhin wrote:

Hello Andreas, Devid.

On 18 Nov 10:45, Andreas Schwab wrote:

Kirill Yukhin  writes:


diff --git a/gcc/testsuite/c-c++-common/attr-simd.c
b/gcc/testsuite/c-c++-common/attr-simd.c
new file mode 100644
index 000..b4eda34
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-simd.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-optimized" } */
+
+__attribute__((__simd__))
+extern
+int simd_attr (void)
+{
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "simd_attr\[ \\t\]simdclone|vector"
"optimized" } } */

On ia64:

FAIL: c-c++-common/attr-simd.c  -Wc++-compat   scan-tree-dump
optimized "simd_attr[ \\t]simdclone|vector"
FAIL: c-c++-common/attr-simd.c  -Wc++-compat   scan-tree-dump
optimized "simd_attr2[ \\t]simdclone|vector"

$ grep simd_attr attr-simd.c.194t.optimized
;; Function simd_attr (simd_attr, funcdef_no=0, decl_uid=1389,
cgraph_uid=0, symbol_order=0)
simd_attr ()
;; Function simd_attr2 (simd_attr2, funcdef_no=1, decl_uid=1392,
cgraph_uid=1, symbol_order=1)
simd_attr2 ()

As far as vABI is supported on x86_64/i?86 only, I am going to enable
mentioned `scan-tree-dump' only
for these targets. This should cure both IA64 and Power.

Concerning attr-simd-3.c. It is known issue: PR68158.
And I believe this test should work everywhere as far as PR is resolved.
I'll put xfail into the test.
Which will lead to (in g++.log):
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__'
attribute does not apply to types\
  [-Wattributes]^M
output is:
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__'
attribute does not apply to types\
  [-Wattributes]^M

XFAIL: c-c++-common/attr-simd-3.c  -std=gnu++98 PR68158 (test for
errors, line 5)
FAIL: c-c++-common/attr-simd-3.c  -std=gnu++98 (test for excess errors)
Excess errors:
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__'
attribute does not apply to types\
  [-Wattributes]

Patch in the bottom.

gcc/tessuite/
* c-c++-common/attr-simd-3.c: Put xfail (PR68158) on dg-error.


This test fails on bare-metal targets that don't support -fcilkplus or
-pthread.
Would you consider moving them to the cilkplus testing directory or
adding an appropriate
effective target check?
I think an effective-target-check is the most appropriate.  The whole 
point here is to make that attribute independent of Cilk support.


jeff



Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-11-20 Thread Ilya Verbin
On Wed, Dec 10, 2014 at 01:48:21 +0300, Ilya Verbin wrote:
> On 09 Dec 14:59, Richard Biener wrote:
> > On Mon, 8 Dec 2014, Ilya Verbin wrote:
> > > Unfortunately, this fix was not general enough.
> > > There might be cases when mixed object files get into lto-wrapper, ie 
> > > some of
> > > them contain only LTO sections, some contain only offload sections, and 
> > > some
> > > contain both.  But when lto-wrapper will pass all these files to 
> > > recompilation,
> > > the compiler might crash (it depends on the order of input files), since 
> > > in
> > > read_cgraph_and_symbols it expects that *all* input files contain IR 
> > > section of
> > > given type.
> > > This patch splits input objects from argv into lto_argv and offload_argv, 
> > > so
> > > that all files in arrays contain corresponding IR.
> > > Similarly, in lto-plugin, it was bad idea to add objects, which contain 
> > > offload
> > > IR without LTO, to claimed_files, since this may corrupt a resolution 
> > > file.
> > > 
> > > Tested on various combinations of files with/without -flto and 
> > > with/without
> > > offload, using trunk ld and gold, also tested on ld without plugin 
> > > support.
> > > Bootstrap and make check passed on x86_64-linux and i686-linux.  Ok for 
> > > trunk?
> > 
> > Did you check that bootstrap-lto still works?  Ok if so.
> 
> Yes, bootstrap-lto passed.
> Committed revision 218543.

I don't know how I missed this a year ago, but mixing of LTO objects with
offloading-without-LTO objects still doesn't work :(
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463 filed about that.
Any thoughts how to fix this?

Thanks,
  -- Ilya


[commit] [patch] Python Pretty Printers get disabled on libstdc++ reload by GDB (PR libstdc++/68448)

2015-11-20 Thread Jan Kratochvil
On Fri, 20 Nov 2015 18:40:46 +0100, Jonathan Wakely wrote:
> The patch is OK for trunk and gcc-5-branch.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68448
trunk:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=230669
5.x:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=230670


Jan


Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-20 Thread Jeff Law

On 11/20/2015 07:08 AM, Bernd Schmidt wrote:


BZ27313 is marked as fixed by the introduction of the tree cselim pass,
thus the problem won't even be seen at RTL level.

Cool.


I'm undecided on whether cs-elim is safe wrt the store speculation vs
locks concerns raised in the thread discussing Ian's
noce_can_store_speculate_p, but that's not something we have to consider
to solve the problem at hand.

I don't think cs-elim is safe WRT locks and such in multi-threaded code.

   In particular it replaces this:

 bb0:
   if (cond) goto bb2; else goto bb1;
 bb1:
   *p = RHS;
 bb2:

   with

 bb0:
   if (cond) goto bb1; else goto bb2;
 bb1:
   condtmp' = *p;
 bb2:
   condtmp = PHI 
   *p = condtmp;


If *p is a shared memory location, then there may be another writer.  If 
that writer happens to store something in that location after the load 
of *p, but before the store to *p, then that store will get lost in the 
transformed pseudo code.


That seems to introduce a data race.  Presumably one would call the 
original code ill-formed WRT the C11/C++11 memory model since the shared 
location is not marked as such.


I'm willing to consider this an independent problem.





As far as I can tell hmmer and the 27313 testcase are unaffected at -O2
(if anything, hmmer was very slightly faster afterwards). The run wasn't
super-scientific, but I wouldn't have expected anything else given the
existence of cs-elim.
Cool.  It doesn't have to be super-scientific.  Good to see it didn't 
harm anything -- thanks for going the extra mile on this one.




Ok to do the above, removing all the bits made unnecessary (including
memory_must_be_modified_in_insn_p in alias.c)?

Yup.  Zap it.
jeff



Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-20 Thread H.J. Lu
On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
 wrote:
> On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:
>> Empty record should be returned and passed the same way in C and C++.
>> This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
>> defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is defined
>> to is_really_empty_class, which returns true for C++ empty classes.  For
>> LTO, we stream out a bit to indicate if a record is empty and we store
>> it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
>> changed to set bitsize to 0 for empty records.  Middle-end and x86
>> backend are updated to ignore empty records for parameter passing and
>> function value return.  Other targets may need similar changes.
>
> Please avoid a new langhook for this and instead claim a bit in 
> tree_type_common
> like for example restrict_flag (double-check it is unused for non-pointers).

There is no bit in tree_type_common I can overload.  restrict_flag is
checked for non-pointers to issue an error when it is used on non-pointers:

/export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
   typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
qualifiers cannot" "" }

Should I add a bit to tree_type_common?  It may crease the size
of tree_type_common by 4 bytes since there is unused bits in
tree_type_common.

> I don't like that you need to modify targets - those checks should be done
> in the caller (which may just use a new wrapper with the logic and then
> dispatching to the actual hook).

I can do that.

> Why do you need do adjust get_ref_base_and_extent?

get_ref_base_and_extent is changed to set bitsize to 0 for empty records
so that when ref_maybe_used_by_call_p_1 calls get_ref_base_and_extent to
get 0 as the maximum size on empty record.  Otherwise, find_tail_calls
won't perform tail call optimization for functions with empty record
parameters, as in

struct dummy { };
struct true_type { struct dummy i; };

extern true_type y;
extern void xxx (true_type c);

void
yyy (void)
{
  xxx (y);
}

-- 
H.J.


Merge from trunk to gccgo branch

2015-11-20 Thread Ian Lance Taylor
I merged trunk revision 230657 to the gccgo branch.

Ian


Re: [PATCH] Add clang-format config to contrib folder

2015-11-20 Thread Jeff Law

On 11/20/2015 04:33 AM, Martin Liška wrote:

On 11/19/2015 06:43 PM, Pedro Alves wrote:

On 11/19/2015 12:54 PM, Martin Liška wrote:

  ContinuationIndentWidth: 2
-ForEachMacros: 
['_FOR_EACH','_FOR_EACH_1','FOR_EACH_AGGR_INIT_EXPR_ARG','FOR_EACH_ALIAS','FOR_EACH_ALLOCNO','FOR_EACH_ALLOCNO_OBJECT','FOR_EACH_ARTIFICIAL_DEF','FOR_EACH_ARTIFICIAL_USE','FOR_EACH_BB_FN','FOR_EACH_BB_REVERSE_FN','FOR_EACH_BIT_IN_MINMAX_SET','FOR_EACH_CALL_EXPR_ARG','FOR_EACH_CLONE','FOR_EACH_CONST_CALL_EXPR_ARG','FOR_EACH_CONSTRUCTOR_ELT','FOR_EACH_CONSTRUCTOR_VALUE','FOR_EACH_COPY','FOR_EACH_DEF','FOR_EACH_DEFINED_FUNCTION','FOR_EACH_DEFINED_SYMBOL','FOR_EACH_DEFINED_VARIABLE','FOR_EACH_DEP','FOR_EACH_EDGE','FOR_EACH_EXPR','FOR_EACH_EXPR_1','FOR_EACH_FUNCTION','FOR_EACH_FUNCTION_WITH_GIMPLE_BODY','FOR_EACH_HASH_TABLE_ELEMENT','FOR_EACH_IMM_USE_FAST','FOR_EACH_IMM_USE_ON_STMT','FOR_EACH_IMM_USE_STMT','FOR_EACH_INSN','FOR_EACH_INSN_1','FOR_EACH_INSN_DEF','FOR_EACH_INSN_EQ_USE','FOR_EACH_INSN_INFO_DEF','FOR_EACH_INSN_INFO_EQ_USE','FOR_EACH_INSN_INFO_MW','FOR_EACH_INSN_INFO_USE','FOR_EACH_INSN_USE','FOR_EACH_LOCAL_DECL','FOR_EACH_LOOP','FOR_EACH_LOOP_FN','F!

O!

  R!

  _EACH_OBJE
CT','FOR_EACH_OBJECT_CONFLICT','FOR_EACH_PHI_ARG','FOR_EACH_PHI_OR_STMT_DEF','FOR_EACH_PHI_OR_STMT_USE','FOR_EACH_PREF','FOR_EACH_SCALAR','FOR_EACH_SSA_DEF_OPERAND','FOR_EACH_SSA_TREE_OPERAND','FOR_EACH_SSA_USE_OPERAND','FOR_EACH_STATIC_INITIALIZER','FOR_EACH_SUBRTX','FOR_EACH_SUBRTX_PTR','FOR_EACH_SUBRTX_VAR','FOR_EACH_SUCC','FOR_EACH_SUCC_1','FOR_EACH_SYMBOL','FOR_EACH_VARIABLE','FOR_EACH_VEC_ELT','FOR_EACH_VEC_ELT_FROM','FOR_EACH_VEC_ELT_REVERSE','FOR_EACH_VEC_SAFE_ELT','FOR_EACH_VEC_SAFE_ELT_REVERSE','_FOR_EACH_X','_FOR_EACH_X_1','FOREACH_FUNCTION_ARGS','FOREACH_FUNCTION_ARGS_PTR']

+ForEachMacros: 
['FOR_ALL_BB_FN','FOR_ALL_EH_REGION','FOR_ALL_EH_REGION_AT','FOR_ALL_EH_REGION_FN','FOR_ALL_INHERITED_FIELDS','FOR_ALL_PREDICATES','FOR_BB_BETWEEN','FOR_BB_INSNS','FOR_BB_INSNS_REVERSE','FOR_BB_INSNS_REVERSE_SAFE','FOR_BB_INSNS_SAFE','FOR_BODY','FOR_COND','FOR_EACH_AGGR_INIT_EXPR_ARG','FOR_EACH_ALIAS','FOR_EACH_ALLOCNO','FOR_EACH_ALLOCNO_OBJECT','FOR_EACH_ARTIFICIAL_DEF','FOR_EACH_ARTIFICIAL_USE','FOR_EACH_BB_FN','FOR_EACH_BB_REVERSE_FN','FOR_EACH_BIT_IN_MINMAX_SET','FOR_EACH_CALL_EXPR_ARG','FOR_EACH_CLONE','FOR_EACH_CONST_CALL_EXPR_ARG','FOR_EACH_CONSTRUCTOR_ELT','FOR_EACH_CONSTRUCTOR_VALUE','FOR_EACH_COPY','FOR_EACH_DEF','FOR_EACH_DEFINED_FUNCTION','FOR_EACH_DEFINED_SYMBOL','FOR_EACH_DEFINED_VARIABLE','FOR_EACH_DEP','FOR_EACH_EDGE','FOR_EACH_EXPR','FOR_EACH_EXPR_1','FOR_EACH_FUNCTION','FOREACH_FUNCTION_ARGS','FOREACH_FUNCTION_ARGS_PTR','FOR_EACH_FUNCTION_WITH_GIMPLE_BODY','FOR_EACH_HASH_TABLE_ELEMENT','FOR_EACH_IMM_USE_FAST','FOR_EACH_IMM_USE_ON_STMT','!

F!

  O!

  R_EACH_IMM
_USE_STMT','FOR_EACH_INSN','FOR_EACH_INSN_1','FOR_EACH_INSN_DEF','FOR_EACH_INSN_EQ_USE','FOR_EACH_INSN_INFO_DEF','FOR_EACH_INSN_INFO_EQ_USE','FOR_EACH_INSN_INFO_MW','FOR_EACH_INSN_INFO_USE','FOR_EACH_INSN_USE','FOR_EACH_LOCAL_DECL','FOR_EACH_LOOP','FOR_EACH_LOOP_FN','FOR_EACH_OBJECT','FOR_EACH_OBJECT_CONFLICT','FOR_EACH_PHI_ARG','FOR_EACH_PHI_OR_STMT_DEF','FOR_EACH_PHI_OR_STMT_USE','FOR_EACH_PREF','FOR_EACH_SCALAR','FOR_EACH_SSA_DEF_OPERAND','FOR_EACH_SSA_TREE_OPERAND','FOR_EACH_SSA_USE_OPERAND','FOR_EACH_STATIC_INITIALIZER','FOR_EACH_SUBRTX','FOR_EACH_SUBRTX_PTR','FOR_EACH_SUBRTX_VAR','FOR_EACH_SUCC','FOR_EACH_SUCC_1','FOR_EACH_SYMBOL','FOR_EACH_VARIABLE','FOR_EACH_VEC_ELT','FOR_EACH_VEC_ELT_FROM','FOR_EACH_VEC_ELT_REVERSE','FOR_EACH_VEC_SAFE_ELT','FOR_EACH_VEC_SAFE_ELT_REVERSE','FOR_EXPR','FOR_INIT_STMT','FOR_SCOPE']

  IndentCaseLabels: false


I don't know the tool's syntax here, but it's usually much better to
keep entries like these one per line (or at least a few only).  Then
changes to the list are _much_ easier to review and maintain, like:

...
  ForEachMacros: [
  'FOR_ALL_BB_FN',
  'FOR_ALL_EH_REGION',
-'FOR_ALL_EH_REGION_AT',
-'FOR_ALL_EH_REGION',
+'FOR_ALL_EH_WHATNOT,
+'FOR_ALL_EH_WHATNOT_AT,
  'FOR_ALL_INHERITED_FIELDS',
  'FOR_ALL_PREDICATES',
  'FOR_BB_BETWEEN',
  'FOR_BB_INSNS',
...

vs a single-line hunk that's basically unintelligible.

Thanks,
Pedro Alves



Hi Pedro.

Fully agree with you, there's suggested patch.
Hope I can install the patch for trunk?
Yes.  In fact, I think that as long as you're moving towards coercing 
clang-format to handle GNU style that you should have a free reign to 
make changes to this config file.


jeff


Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Manuel López-Ibáñez
On 20 November 2015 at 17:42, Jeff Law  wrote:
> So we have to detangle the operand shortening from warning detection. Kai's
> idea was to first make the shortening code "pure" in the sense that it would
> have no side effects other than to generate the warnings.  Canonicalization
> and other transformations would still occur internally, but not be reflected
> in the IL.
>
> That was the overall plan and he posted a patch for that.  But that patch
> didn't do the due diligence to verify that once the shortening code was made
> "pure" that we didn't regress on the quality of the code we generated.

I thought that the original plan was to make the warning code also use
match.pd. That is, that all folding, including FE folding, will be
match.pd-based. Has this changed?

Cheers,

Manuel.


Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Jeff Law

On 11/19/2015 12:02 PM, Manuel López-Ibáñez wrote:

On 19 November 2015 at 17:54, Jeff Law  wrote:

But there were a couple of patches from you some time ago, for
example: http://permalink.gmane.org/gmane.comp.gcc.patches/343476

What happened with those?


On hold pending fixing the type-limits warning placement.  Essentially that
has to be untangled first.


Could you elaborate on this? Or point me to some previous email
thread? (I don't have enough free time to follow the mailing list
anymore, sorry).
I don't have the thread handy, but essentially the operand shortening 
code has warning bits inside it.  As a result pulling out functionality 
(such as operand canonicalization) and putting into match.pd results in 
missing a warning -- ie a regression.


So we have to detangle the operand shortening from warning detection. 
Kai's idea was to first make the shortening code "pure" in the sense 
that it would have no side effects other than to generate the warnings. 
 Canonicalization and other transformations would still occur 
internally, but not be reflected in the IL.


That was the overall plan and he posted a patch for that.  But that 
patch didn't do the due diligence to verify that once the shortening 
code was made "pure" that we didn't regress on the quality of the code 
we generated.


jeff



Re: [patch] Python Pretty Printers get disabled on libstdc++ reload by GDB (PR libstdc++/68448)

2015-11-20 Thread Jonathan Wakely

On 20/11/15 18:28 +0100, Jan Kratochvil wrote:

On Thu, 19 Nov 2015 21:21:51 +0100, Jan Kratochvil wrote:

* python/hook.in: Call register_libstdcxx_printers.
* python/libstdcxx/v6/__init__.py: Wrap it to
register_libstdcxx_printers.

[...]

-import libstdcxx.v6
+# Call a function as a plain import would not execute body of the included file
+# on repeated reloads of this object file.
+from libstdcxx.v6 import register_libstdcxx_printers
+register_libstdcxx_printers(gdb.current_objfile())


Jonathan Wakely mentioned:
[libstdc++] Refactor python/hook.in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02518.html
From: Siva Chandra 
Message-ID: 

https://gcc.gnu.org/r215726

That change introduced this regression.  This patch does not undo it, goal of
the patch by Siva Chandra - minimizing hook.in - stays achieved.


OK, thanks for checking, Jan.

The patch is OK for trunk and gcc-5-branch.



Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address

2015-11-20 Thread Richard Earnshaw
On 20/11/15 08:31, Bin.Cheng wrote:
> On Thu, Nov 19, 2015 at 10:32 AM, Bin.Cheng  wrote:
>> On Tue, Nov 17, 2015 at 6:08 PM, James Greenhalgh
>>  wrote:
>>> On Tue, Nov 17, 2015 at 05:21:01PM +0800, Bin Cheng wrote:
 Hi,
 GIMPLE IVO needs to call backend interface to calculate costs for addr
 expressions like below:
FORM1: "r73 + r74 + 16380"
FORM2: "r73 << 2 + r74 + 16380"

 They are invalid address expression on AArch64, so will be legitimized by
 aarch64_legitimize_address.  Below are what we got from that function:

 For FORM1, the address expression is legitimized into below insn sequence
 and rtx:
r84:DI=r73:DI+r74:DI
r85:DI=r84:DI+0x3000
r83:DI=r85:DI
"r83 + 4092"

 For FORM2, the address expression is legitimized into below insn sequence
 and rtx:
r108:DI=r73:DI<<0x2
r109:DI=r108:DI+r74:DI
r110:DI=r109:DI+0x3000
r107:DI=r110:DI
"r107 + 4092"

 So the costs computed are 12/16 respectively.  The high cost prevents IVO
 from choosing right candidates.  Besides cost computation, I also think the
 legitmization is bad in terms of code generation.
 The root cause in aarch64_legitimize_address can be described by it's
 comment:
/* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
   where mask is selected by alignment and size of the offset.
   We try to pick as large a range for the offset as possible to
   maximize the chance of a CSE.  However, for aligned addresses
   we limit the range to 4k so that structures with different sized
   elements are likely to use the same base.  */
 I think the split of CONST is intended for REG+CONST where the const offset
 is not in the range of AArch64's addressing modes.  Unfortunately, it
 doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<>>> when the CONST are in the range of addressing modes.  As a result, these 
 two
 cases fallthrough this logic, resulting in sub-optimal results.

 It's obvious we can do below legitimization:
 FORM1:
r83:DI=r73:DI+r74:DI
"r83 + 16380"
 FORM2:
r107:DI=0x3ffc
r106:DI=r74:DI+r107:DI
   REG_EQUAL r74:DI+0x3ffc
"r106 + r73 << 2"

 This patch handles these two cases as described.
>>>
>>> Thanks for the description, it made the patch very easy to review. I only
>>> have a style comment.
>>>
 Bootstrap & test on AArch64 along with other patch.  Is it OK?

 2015-11-04  Bin Cheng  
   Jiong Wang  

   * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
   address expressions like REG+REG+CONST and REG+NON_REG+CONST.
>>>
 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 5c8604f..47875ac 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  
 */, machine_mode mode)
  {
HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
HOST_WIDE_INT base_offset;
 +  rtx op0 = XEXP (x,0);
 +
 +  if (GET_CODE (op0) == PLUS)
 + {
 +   rtx op0_ = XEXP (op0, 0);
 +   rtx op1_ = XEXP (op0, 1);
>>>
>>> I don't see this trailing _ on a variable name in many places in the source
>>> tree (mostly in the Go frontend), and certainly not in the aarch64 backend.
>>> Can we pick a different name for op0_ and op1_?
>>>
 +
 +   /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
 +  reach here, the 'CONST' may be valid in which case we should
 +  not split.  */
 +   if (REG_P (op0_) && REG_P (op1_))
 + {
 +   machine_mode addr_mode = GET_MODE (op0);
 +   rtx addr = gen_reg_rtx (addr_mode);
 +
 +   rtx ret = plus_constant (addr_mode, addr, offset);
 +   if (aarch64_legitimate_address_hook_p (mode, ret, false))
 + {
 +   emit_insn (gen_adddi3 (addr, op0_, op1_));
 +   return ret;
 + }
 + }
 +   /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
 +  will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
 +  we split it into Y=REG+CONST, Y+NON_REG.  */
 +   else if (REG_P (op0_) || REG_P (op1_))
 + {
 +   machine_mode addr_mode = GET_MODE (op0);
 +   rtx addr = gen_reg_rtx (addr_mode);
 +
 +   /* Switch to make sure that register is in op0_.  */
 +   if (REG_P (op1_))
 + std::swap (op0_, op1_);
 +
 +   rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
 +   if (aarch64_legitimate_address_hook_p (mode, ret, false))
 +

Re: [Patch, vrp] Allow VRP type conversion folding only for widenings upto word mode

2015-11-20 Thread Jeff Law

On 11/20/2015 10:04 AM, Senthil Kumar Selvaraj wrote:

On Thu, Nov 19, 2015 at 10:31:41AM -0700, Jeff Law wrote:

On 11/18/2015 11:20 PM, Senthil Kumar Selvaraj wrote:

On Wed, Nov 18, 2015 at 09:36:21AM +0100, Richard Biener wrote:


Otherwise ok.


See modified patch below. If you think vrp98.c is unnecessary, feel free
to dump it :).

If ok, could you commit it for me please? I don't have commit access.

Regards
Senthil

gcc/ChangeLog
2015-11-19  Senthil Kumar Selvaraj  

* tree.h (desired_pro_or_demotion_p): New function.
* tree-vrp.c (simplify_cond_using_ranges): Call it.

gcc/testsuite/ChangeLog
2015-11-19  Senthil Kumar Selvaraj  

* gcc.dg/tree-ssa/vrp98.c: New testcase.
* gcc.target/avr/uint8-single-reg.c: New testcase.

I went ahead and committed this as-is.

I do think the vrp98 testcase is useful as it verifies that VRP is doing
what we want in a target independent way.  It's a good complement to the AVR
specific testcase.


I see the same problem on gcc-5-branch as well. Would it be ok to
backport the fix to that branch as well?
That's a call for the release managers.  I typically don't backport 
anything expect ICE or incorrect code generation fixes as I tend to be 
very conservative on what goes onto a release branch.


Jakub, Richi or Joseph would need to ack into a release branch.

jeff



Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-20 Thread Jeff Law

On 11/20/2015 02:19 AM, Nick Clifton wrote:

Hi Jeff,


The code there would solve this problem, but the approach is is overly
cautious, since it disables the optimization for all extensions that
increase the number of hard registers used.  Some of these will be
viable candidates, provided that the extra hard registers are no used.
(This is certainly true for the RL78, where the (patched) optimization
does improve code, even though the widening does use extra registers).

Nick -- can you pass along your testcode?


Sure - this is for the RL78 toolchain.  In theory the problem is
generic, but I have not tested other toolchains.
Right.  It just really helps me to have something I can poke at -- it's 
just the type of learner I am.  I tried to trip the test in that #if 
several times last year and never came up with anything.  So naturally 
if we can trip it with the rl78 I want to dig into it.





Compile the gcc.c-torture/execute/pr42833.c or
gcc.c-torture/execute/strct-stdarg-1.c tests at -O1 or higher with -free
also specified on the command line.  Without -free these tests pass.
With -free they fail.

Perfect.  Building rl78-elf cross bits as I type...

jeff


Re: [patch] Python Pretty Printers get disabled on libstdc++ reload by GDB (PR libstdc++/68448)

2015-11-20 Thread Jan Kratochvil
On Thu, 19 Nov 2015 21:21:51 +0100, Jan Kratochvil wrote:
>   * python/hook.in: Call register_libstdcxx_printers.
>   * python/libstdcxx/v6/__init__.py: Wrap it to
>   register_libstdcxx_printers.
[...]
> -import libstdcxx.v6
> +# Call a function as a plain import would not execute body of the included 
> file
> +# on repeated reloads of this object file.
> +from libstdcxx.v6 import register_libstdcxx_printers
> +register_libstdcxx_printers(gdb.current_objfile())

Jonathan Wakely mentioned:
[libstdc++] Refactor python/hook.in
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02518.html
From: Siva Chandra 
Message-ID: 

https://gcc.gnu.org/r215726

That change introduced this regression.  This patch does not undo it, goal of
the patch by Siva Chandra - minimizing hook.in - stays achieved.


Jan


Re: [PATCH] Fix PR68067

2015-11-20 Thread Alan Lawrence
On 6 November 2015 at 10:39, Richard Biener  wrote:
>> ../spec2000/benchspec/CINT2000/254.gap/src/polynom.c:358:11: error: location
>> references block not in block tree
>> l1_279 = PHI <1(28), l1_299(33)>
>
> ^^^
>
> this is the error to look at!  It means that the GC heap will be corrupted
> quite easily.
>

This looked very similar to PR68117 - the invalid phi arg, and block
not in  block-tree, even if not the invalid tree code - and as the
posters there were having success with valgrind, whereas I wasn't, I
watched and waited. First observation is that it triggers the asserts
you suggested in comment 27
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D68117#c27). Indeed, it
fails those asserts, even after the patch in comment 25 (committed as
r230594) to tree-ssa.c (delete_tree_ssa), and the patch in comment#35
to function.c (set_cfun), and the patch in comment#30 (committed as
r230424) to cfgexpand.c (pass_expand::execute).

The patch in comment#29 (which replaces the asserts in comment#27 with
empties), however, fixes the problem - although I can't rule out, that
that's just by changing the memory allocation pattern.

Moreover, if I take those patches and rebase onto a recent trunk (onto
which the delete_tree_ssa and pass_expand::execute patches have
already been committed), i.e. just adding the assertions from
comment#27 and the call in function.c (set_cfun) - the assertions are
still failing on my testcase, whereas the original (assertionless)
failure was very erratic, and had since disappeared/been hidden on
trunk. Indeed those same assertions break in a few other places (even
in a --disable-bootstrap build after gcc/xgcc is built), so I feel I
have a good chance of producing a reasonable assertion-breaking
testcase.

So I have to ask, how sure are you that those assertions are(/should
be!) "correct"? :)

--Alan


Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Manuel López-Ibáñez
On 20 November 2015 at 16:10, Martin Sebor  wrote:
>>> Hmm, it looks like using expansion_point_if_in_system_header might avoid
>>> the first issue you mention.
>>
>>
>> Thus.
>
>
> Great, thanks! (I'll have to remember the trick for my own use!)

I added this to  https://gcc.gnu.org/wiki/DiagnosticsGuidelines under
"Locations" for future reference. I hope others would do the same in
the future, so the info is kept up-to-date.

Cheers,

Manuel.


Re: [PATCH 4/4][AArch64] Add cost model for Exynos M1

2015-11-20 Thread James Greenhalgh
On Thu, Nov 19, 2015 at 04:06:17PM -0600, Evandro Menezes wrote:
> On 11/05/2015 06:09 PM, Evandro Menezes wrote:
> >2015-10-25  Evandro Menezes 
> >
> >   gcc/
> >
> >   * config/aarch64/aarch64-cores.def: Use the Exynos M1 cost model.
> >   * config/aarch64/aarch64.c (exynosm1_addrcost_table): New
> >variable.
> >   (exynosm1_regmove_cost): Likewise.
> >   (exynosm1_vector_cost): Likewise.
> >   (exynosm1_tunings): Likewise.
> >   * config/arm/aarch-cost-tables.h (exynosm1_extra_costs): Likewise.
> >   * config/arm/arm.c (arm_exynos_m1_tune): Likewise.
> >
> >This patch adds the cost model for Exynos M1.  This patch depends
> >on a couple of previous patches though,
> >https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00505.html and
> >https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00538.html
> >
> >Please, commit if it's alright.
> 
> Ping.

This is OK from an AArch64 perspective. Please wait for an OK from an
ARM reviewer.

Thanks,
James



Re: [PATCH 3b/4][AArch64] Add scheduling model for Exynos M1

2015-11-20 Thread James Greenhalgh
On Tue, Nov 10, 2015 at 11:54:00AM -0600, Evandro Menezes wrote:
>2015-11-10  Evandro Menezes 
> 
>gcc/
> 
>* config/aarch64/aarch64-cores.def: Use the Exynos M1 sched model.
>* config/aarch64/aarch64.md: Include "exynos-m1.md".
>* config/arm/arm-cores.def: Use the Exynos M1 sched model.
>* config/arm/arm.md: Include "exynos-m1.md".
>* config/arm/arm-tune.md: Regenerated.
>* config/arm/exynos-m1.md: New file.
> 
> This patch adds the scheduling model for Exynos M1.  It depends on
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01257.html
> 
> Bootstrapped on arm-unknown-linux-gnueabihf, aarch64-unknown-linux-gnu.
> 
> Please, commit if it's alright.


> From 0b7b6d597e5877c78c4d88e0d4491858555a5364 Mon Sep 17 00:00:00 2001
> From: Evandro Menezes 
> Date: Mon, 9 Nov 2015 17:18:52 -0600
> Subject: [PATCH 2/2] [AArch64] Add scheduling model for Exynos M1
> 
> gcc/
>   * config/aarch64/aarch64-cores.def: Use the Exynos M1 sched model.
>   * config/aarch64/aarch64.md: Include "exynos-m1.md".

These changes are fine.

>   * config/arm/arm-cores.def: Use the Exynos M1 sched model.
>   * config/arm/arm.md: Include "exynos-m1.md".
>   * config/arm/arm-tune.md: Regenerated.

These changes need an ack from an ARM reviewer.

>   * config/arm/exynos-m1.md: New file.

I have a few comments on this model.

> +;; The Exynos M1 core is modeled as a triple issue pipeline that has
> +;; the following functional units.
> +
> +(define_automaton "exynos_m1_gp")
> +(define_automaton "exynos_m1_ls")
> +(define_automaton "exynos_m1_fp")
> +
> +;; 1.  Two pipelines for simple integer operations: A, B
> +;; 2.  One pipeline for simple or complex integer operations: C
> +
> +(define_cpu_unit "em1_xa, em1_xb, em1_xc" "exynos_m1_gp")
> +
> +(define_reservation "em1_alu" "(em1_xa | em1_xb | em1_xc)")
> +(define_reservation "em1_c" "em1_xc")

Is this extra reservation useful, can we not just use em1_xc directly?

> +;; 3.  Two asymmetric pipelines for Neon and FP operations: F0, F1
> +
> +(define_cpu_unit "em1_f0, em1_f1" "exynos_m1_fp")
> +
> +(define_reservation "em1_fmac" "em1_f0")
> +(define_reservation "em1_fcvt" "em1_f0")
> +(define_reservation "em1_nalu" "(em1_f0 | em1_f1)")
> +(define_reservation "em1_nalu0" "em1_f0")
> +(define_reservation "em1_nalu1" "em1_f1")
> +(define_reservation "em1_nmisc" "em1_f0")
> +(define_reservation "em1_ncrypt" "em1_f0")
> +(define_reservation "em1_fadd" "em1_f1")
> +(define_reservation "em1_fvar" "em1_f1")
> +(define_reservation "em1_fst" "em1_f1")

Same comment here, does this not just obfuscate the interaction between
instruction classes in the description. I'm not against doing it this way
if you prefer, but it would seem to reduce readability to me. I think there
is also an argument that this increases readability, so it is your choice.

> +
> +;; 4.  One pipeline for branch operations: BX
> +
> +(define_cpu_unit "em1_bx" "exynos_m1_gp")
> +
> +(define_reservation "em1_br" "em1_bx")
> +

And again?

> +;; 5.  One AGU for loads: L
> +;; One AGU for stores and one pipeline for stores: S, SD
> +
> +(define_cpu_unit "em1_lx" "exynos_m1_ls")
> +(define_cpu_unit "em1_sx, em1_sd" "exynos_m1_ls")
> +
> +(define_reservation "em1_ld" "em1_lx")
> +(define_reservation "em1_st" "(em1_sx + em1_sd)")
> +
> +;; Common occurrences
> +(define_reservation "em1_sfst" "(em1_fst + em1_st)")
> +(define_reservation "em1_lfst" "(em1_fst + em1_ld)")
> +
> +;; Branches
> +;;
> +;; No latency as there is no result
> +;; TODO: Unconditional branches use no units;
> +;; conditional branches add the BX unit;
> +;; indirect branches add the C unit.
> +(define_insn_reservation "exynos_m1_branch" 0
> +  (and (eq_attr "tune" "exynosm1")
> +   (eq_attr "type" "branch"))
> +  "em1_br")
> +
> +(define_insn_reservation "exynos_m1_call" 1
> +  (and (eq_attr "tune" "exynosm1")
> +   (eq_attr "type" "call"))
> +  "em1_alu")
> +
> +;; Basic ALU
> +;;
> +;; Simple ALU without shift, non-predicated
> +(define_insn_reservation "exynos_m1_alu" 1
> +  (and (eq_attr "tune" "exynosm1")
> +   (and (not (eq_attr "predicated" "yes"))

(and (eq_attr "predicated" "no")) ?

Likewise throughout the file? Again this is your choice.

This is OK from the AArch64 side, let me know if you plan to change any
of the above, otherwise I'll commit it (or someone else can commit it)
after I see an OK from an ARM reviewer.

Thanks,
James



Re: [Patch, vrp] Allow VRP type conversion folding only for widenings upto word mode

2015-11-20 Thread Senthil Kumar Selvaraj
On Thu, Nov 19, 2015 at 10:31:41AM -0700, Jeff Law wrote:
> On 11/18/2015 11:20 PM, Senthil Kumar Selvaraj wrote:
> >On Wed, Nov 18, 2015 at 09:36:21AM +0100, Richard Biener wrote:
> >>
> >>Otherwise ok.
> >
> >See modified patch below. If you think vrp98.c is unnecessary, feel free
> >to dump it :).
> >
> >If ok, could you commit it for me please? I don't have commit access.
> >
> >Regards
> >Senthil
> >
> >gcc/ChangeLog
> >2015-11-19  Senthil Kumar Selvaraj  
> >
> > * tree.h (desired_pro_or_demotion_p): New function.
> > * tree-vrp.c (simplify_cond_using_ranges): Call it.
> >
> >gcc/testsuite/ChangeLog
> >2015-11-19  Senthil Kumar Selvaraj  
> >
> > * gcc.dg/tree-ssa/vrp98.c: New testcase.
> > * gcc.target/avr/uint8-single-reg.c: New testcase.
> I went ahead and committed this as-is.
> 
> I do think the vrp98 testcase is useful as it verifies that VRP is doing
> what we want in a target independent way.  It's a good complement to the AVR
> specific testcase.

I see the same problem on gcc-5-branch as well. Would it be ok to
backport the fix to that branch as well?

Regards
Senthil
> 
> Thanks,
> Jeff
> 


Re: [PATCH][GCC][ARM] Disable neon testing for armv7-m

2015-11-20 Thread Andre Vieira

Hi Kyrill
On 20/11/15 11:51, Kyrill Tkachov wrote:

Hi Andre,

On 18/11/15 09:44, Andre Vieira wrote:

On 17/11/15 10:10, James Greenhalgh wrote:

On Mon, Nov 16, 2015 at 01:15:32PM +, Andre Vieira wrote:

On 16/11/15 12:07, James Greenhalgh wrote:

On Mon, Nov 16, 2015 at 10:49:11AM +, Andre Vieira wrote:

Hi,

   This patch changes the target support mechanism to make it
recognize any ARM 'M' profile as a non-neon supporting target. The
current check only tests for armv6 architectures and earlier, and
does not account for armv7-m.

   This is correct because there is no 'M' profile that supports neon
and the current test is not sufficient to exclude armv7-m.

   Tested by running regressions for this testcase for various ARM
targets.

   Is this OK to commit?

   Thanks,
   Andre Vieira

gcc/testsuite/ChangeLog:
2015-11-06  Andre Vieira 

 * gcc/testsuite/lib/target-supports.exp
   (check_effective_target_arm_neon_ok_nocache): Added check
   for M profile.



 From 2c53bb9ba3236919ecf137a4887abf26d4f7fda2 Mon Sep 17 00:00:00
2001
From: Andre Simoes Dias Vieira 
Date: Fri, 13 Nov 2015 11:16:34 +
Subject: [PATCH] Disable neon testing for armv7-m

---
  gcc/testsuite/lib/target-supports.exp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp
b/gcc/testsuite/lib/target-supports.exp
index
75d506829221e3d02d454631c4bd2acd1a8cedf2..8097a4621b088a93d58d09571cf7aa27b8d5fba6
100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2854,7 +2854,7 @@ proc
check_effective_target_arm_neon_ok_nocache { } {
  int dummy;
  /* Avoid the case where a test adds -mfpu=neon, but the
toolchain is
 configured for -mcpu=arm926ej-s, for example.  */
-#if __ARM_ARCH < 7
+#if __ARM_ARCH < 7 || __ARM_ARCH_PROFILE == 'M'
  #error Architecture too old for NEON.


Could you fix this #error message while you're here?

Why we can't change this test to look for the __ARM_NEON macro from
ACLE:

#if __ARM_NEON < 1
   #error NEON is not enabled
#endif

Thanks,
James



There is a check for this already:
'check_effective_target_arm_neon'. I think the idea behind
arm_neon_ok is to check whether the hardware would support neon,
whereas arm_neon is to check whether neon was enabled, i.e.
-mfpu=neon was used or a mcpu was passed that has neon enabled by
default.

The comments for 'check_effective_target_arm_neon_ok_nocache'
highlight this, though maybe the comments for
check_effective_target_arm_neon could be better.

# Return 1 if this is an ARM target supporting -mfpu=neon
# -mfloat-abi=softfp or equivalent options.  Some multilibs may be
# incompatible with these options.  Also set et_arm_neon_flags to the
# best options to add.

proc check_effective_target_arm_neon_ok_nocache
...
/* Avoid the case where a test adds -mfpu=neon, but the toolchain is
configured for -mcpu=arm926ej-s, for example.  */
...


and

# Return 1 if this is a ARM target with NEON enabled.

proc check_effective_target_arm_neon


OK, got it - sorry for my mistake, I had the two procs confused.

I'd still like to see the error message fixed "Architecture too old
for NEON."
is not an accurate description of the problem.

Thanks,
James



This OK?



This is ok,
I've committed for you with the slightly tweaked ChangeLog entry:
2015-11-20  Andre Vieira  

 * lib/target-supports.exp
 (check_effective_target_arm_neon_ok_nocache): Add check
 for M profile.

as r230653.

Thanks,
Kyrill



Cheers,
Andre




Thank you. Would there be any objections to backporting this to 
gcc-5-branch? I checked, it applies cleanly and its a simple enough way 
of preventing a lot of FAILS for armv7-m.


Best Regards,
Andre



Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-20 Thread Tom de Vries

On 20/11/15 14:29, Richard Biener wrote:

I agree it's somewhat of an odd behavior but all passes should
either be placed in a sub-pipeline with an outer
loop_optimizer_init()/finalize () call or call both themselves.


Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks 
the loop pipeline.


We could use the style used in pass_slp_vectorize::execute:
...
pass_slp_vectorize::execute (function *fun)
{
  basic_block bb;

  bool in_loop_pipeline = scev_initialized_p ();
  if (!in_loop_pipeline)
{
  loop_optimizer_init (LOOPS_NORMAL);
  scev_initialize ();
}

  ...

  if (!in_loop_pipeline)
{
  scev_finalize ();
  loop_optimizer_finalize ();
}
...

Although that doesn't strike me as particularly clean.

Thanks,
- Tom


Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-20 Thread James Greenhalgh
On Fri, Nov 20, 2015 at 09:55:29AM -0600, Evandro Menezes wrote:
> On 11/20/2015 06:27 AM, James Greenhalgh wrote:
> >On Thu, Nov 12, 2015 at 11:32:36AM -0600, Evandro Menezes wrote:
> >>On 11/12/2015 09:39 AM, Evandro Menezes wrote:
> >>2015-11-12  Evandro Menezes 
> >>
> >>[AArch64] Add attribute for compatibility with ARM pipeline models
> >>
> >>gcc/
> >>
> >>* config/aarch64/aarch64.md (predicated): Copy attribute from
> >>"arm.md".
> >>* config/arm/arm.md (predicated): Added description.
> >>
> >>Please, commit if it's alright.
> >The AArch64 part of this is OK.
> >
> >> From 3fa6a2bca8f3d2992b4607cff0afcc2d9caa96f4 Mon Sep 17 00:00:00 2001
> >>From: Evandro Menezes 
> >>Date: Mon, 9 Nov 2015 17:11:16 -0600
> >>Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with ARM
> >>  pipeline models
> >>
> >>gcc/
> >>* config/aarch64/aarch64.md (predicated): Copy attribute from "arm.md".
> >>* config/arm/arm.md (predicated): Added description.
> >>---
> >>  gcc/config/aarch64/aarch64.md | 4 
> >>  gcc/config/arm/arm.md | 3 +++
> >>  2 files changed, 7 insertions(+)
> >>
> >>diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> >>index 1586256..d46f837 100644
> >>--- a/gcc/config/aarch64/aarch64.md
> >>+++ b/gcc/config/aarch64/aarch64.md
> >>@@ -195,6 +195,10 @@
> >>  ;; 1 :=: yes
> >>  (define_attr "far_branch" "" (const_int 0))
> >>+;; Strictly for compatibility with AArch32 in pipeline models, since 
> >>AArch64 has
> >>+;; no predicated insns.
> >>+(define_attr "predicated" "yes,no" (const_string "no"))
> >>+
> >>  ;; ---
> >>  ;; Pipeline descriptions and scheduling
> >>  ;; ---
> >>diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> >>index 73c3088..6bda491 100644
> >>--- a/gcc/config/arm/arm.md
> >>+++ b/gcc/config/arm/arm.md
> >>@@ -105,6 +105,9 @@
> >>  (define_attr "fpu" "none,vfp"
> >>(const (symbol_ref "arm_fpu_attr")))
> >>+; Predicated means that the insn form is conditionally executed based on a
> >>+; predicate.  We default to 'no' because no Thumb patterns match this rule
> >>+; and not all ARM insns do.
> >s/is conditionally executed/can be conditionally executed/ in the first
> >sentence. Otherwise, this looks OK to me but I can't approve the ARM part,
> >so you'll need to wait for a review from someone who can.
> >
> >>  (define_attr "predicated" "yes,no" (const_string "no"))
> >>  ; LENGTH of an instruction (in bytes)
> >>-- 
> >>2.1.0.243.g30d45f7
> 
> Actually, that would then be the existing description for the
> "predicable" attribute.  I think that this proposed description is
> correct for the "predicated" attribute.

Right, understood, that will teach me to pattern match up to pred and stop
reading the word.

This is fine, I've committed it on your behalf as r230666

Thanks,
James



Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Martin Sebor

Hmm, it looks like using expansion_point_if_in_system_header might avoid
the first issue you mention.


Thus.


Great, thanks! (I'll have to remember the trick for my own use!)

Martin


[PATCH, PR68460] Always call free_stmt_vec_info_vec in gather_scalar_reductions

2015-11-20 Thread Tom de Vries

[ was: Re: [PATCH] Fix parloops gimple_uid usage ]

On 09/10/15 23:09, Tom de Vries wrote:

@@ -2392,6 +2397,9 @@ gather_scalar_reductions (loop_p loop, 
reduction_info_table_type *reduction_list
loop_vec_info simple_inner_loop_info = NULL;
bool allow_double_reduc = true;

+  if (!stmt_vec_info_vec.exists ())
+init_stmt_vec_info_vec ();
+
simple_loop_info = vect_analyze_loop_form (loop);
if (simple_loop_info == NULL)
  return;
@@ -2453,9 +2461,16 @@ gather_scalar_reductions (loop_p loop, 
reduction_info_table_type *reduction_list
destroy_loop_vec_info (simple_loop_info, true);
destroy_loop_vec_info (simple_inner_loop_info, true);

+  /* Release the claim on gimple_uid.  */
+  free_stmt_vec_info_vec ();
+


With the src/libgomp/testsuite/libgomp.c/pr46886.c testcase, compiled in 
addition with -ftree-vectorize, I ran into an ICE:

...
src/libgomp/testsuite/libgomp.c/pr46886.c:8:5: internal compiler error: 
in init_stmt_vec_info_vec, at tree-vect-stmts.c:8250

 int foo (void)
 ^~~

0x1196082 init_stmt_vec_info_vec()
src/gcc/tree-vect-stmts.c:8250
0x11c3ed4 vectorize_loops()
src/gcc/tree-vectorizer.c:510
0x10a7ea5 execute
src/gcc/tree-ssa-loop.c:276
...

The ICE is caused by the fact that init_stmt_vec_info_vec is called at 
the start of vectorize_loops, while stmt_vec_info_vec is not empty. I 
traced this back to gather_scalar_reduction, where we call 
init_stmt_vec_info_vec, but we skip free_stmt_vec_info_vec if we take 
the early-out for simple_loop_info == NULL.


This patch fixes the ICE by making sure we always call 
free_stmt_vec_info_vec in gather_scalar_reduction.


Passes cc1/f951 rebuild and autopar testing.

OK for stage3 trunk if bootstrap and regtest succeeds?

Thanks,
- Tom
Always call free_stmt_vec_info_vec in gather_scalar_reductions

2015-11-20  Tom de Vries  

	PR tree-optimization/68460
	* tree-parloops.c (gather_scalar_reductions): Also call
	free_stmt_vec_info_vec if simple_loop_info == NULL.

	* gcc.dg/autopar/pr68460.c: New test.

---
 gcc/testsuite/gcc.dg/autopar/pr68460.c | 35 ++
 gcc/tree-parloops.c|  6 +-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/autopar/pr68460.c b/gcc/testsuite/gcc.dg/autopar/pr68460.c
new file mode 100644
index 000..0c00065
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr68460.c
@@ -0,0 +1,35 @@
+/* { dg-do "compile" } */
+/* { dg-options "-O -ftree-parallelize-loops=2 -ftree-vectorize -fno-tree-ch -fno-tree-dominator-opts" } */
+
+void abort (void);
+
+int d[1024], e[1024];
+
+int
+foo (void)
+{
+  int s = 0;
+  int i;
+
+  for (i = 0; i < 1024; i++)
+s += d[i] - e[i];
+
+  return s;
+}
+
+int
+main ()
+{
+  int i;
+
+  for (i = 0; i < 1024; i++)
+{
+  d[i] = i * 2;
+  e[i] = i;
+}
+
+  if (foo () != 1023 * 1024 / 2)
+abort ();
+
+  return 0;
+}
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 8d7912d..85cc78d 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2433,7 +2433,7 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
 
   simple_loop_info = vect_analyze_loop_form (loop);
   if (simple_loop_info == NULL)
-return;
+goto gather_done;
 
   for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next (&gsi))
 {
@@ -2492,9 +2492,13 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
   destroy_loop_vec_info (simple_loop_info, true);
   destroy_loop_vec_info (simple_inner_loop_info, true);
 
+ gather_done:
   /* Release the claim on gimple_uid.  */
   free_stmt_vec_info_vec ();
 
+  if (reduction_list->elements () == 0)
+return;
+
   /* As gimple_uid is used by the vectorizer in between vect_analyze_loop_form
  and free_stmt_vec_info_vec, we can set gimple_uid of reduc_phi stmts only
  now.  */


Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-20 Thread Evandro Menezes

On 11/20/2015 08:34 AM, Kyrill Tkachov wrote:


On 20/11/15 12:27, James Greenhalgh wrote:

On Thu, Nov 12, 2015 at 11:32:36AM -0600, Evandro Menezes wrote:

On 11/12/2015 09:39 AM, Evandro Menezes wrote:
2015-11-12  Evandro Menezes 

[AArch64] Add attribute for compatibility with ARM pipeline models

gcc/

* config/aarch64/aarch64.md (predicated): Copy attribute from
"arm.md".
* config/arm/arm.md (predicated): Added description.


The arm part is ok too. It's just a comment.
In the ChangeLog entry for arm.md I'd say "Add description."

Thanks,
Kyrill


Please, commit if it's alright.

The AArch64 part of this is OK.


 From 3fa6a2bca8f3d2992b4607cff0afcc2d9caa96f4 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 9 Nov 2015 17:11:16 -0600
Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with ARM
  pipeline models

gcc/
* config/aarch64/aarch64.md (predicated): Copy attribute from 
"arm.md".

* config/arm/arm.md (predicated): Added description.
---
  gcc/config/aarch64/aarch64.md | 4 
  gcc/config/arm/arm.md | 3 +++
  2 files changed, 7 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md 
b/gcc/config/aarch64/aarch64.md

index 1586256..d46f837 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -195,6 +195,10 @@
  ;; 1 :=: yes
  (define_attr "far_branch" "" (const_int 0))
  +;; Strictly for compatibility with AArch32 in pipeline models, 
since AArch64 has

+;; no predicated insns.
+(define_attr "predicated" "yes,no" (const_string "no"))
+
  ;; 
---

  ;; Pipeline descriptions and scheduling
  ;; 
---

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 73c3088..6bda491 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -105,6 +105,9 @@
  (define_attr "fpu" "none,vfp"
(const (symbol_ref "arm_fpu_attr")))
  +; Predicated means that the insn form is conditionally executed 
based on a
+; predicate.  We default to 'no' because no Thumb patterns match 
this rule

+; and not all ARM insns do.

s/is conditionally executed/can be conditionally executed/ in the first
sentence. Otherwise, this looks OK to me but I can't approve the ARM 
part,

so you'll need to wait for a review from someone who can.

Thanks,
James


  (define_attr "predicated" "yes,no" (const_string "no"))
; LENGTH of an instruction (in bytes)
--
2.1.0.243.g30d45f7


Can you please commit it with this change in the Changelog?

Thank you,

--
Evandro Menezes



Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-20 Thread Evandro Menezes

On 11/20/2015 06:27 AM, James Greenhalgh wrote:

On Thu, Nov 12, 2015 at 11:32:36AM -0600, Evandro Menezes wrote:

On 11/12/2015 09:39 AM, Evandro Menezes wrote:
2015-11-12  Evandro Menezes 

[AArch64] Add attribute for compatibility with ARM pipeline models

gcc/

* config/aarch64/aarch64.md (predicated): Copy attribute from
"arm.md".
* config/arm/arm.md (predicated): Added description.

Please, commit if it's alright.

The AArch64 part of this is OK.


 From 3fa6a2bca8f3d2992b4607cff0afcc2d9caa96f4 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 9 Nov 2015 17:11:16 -0600
Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with ARM
  pipeline models

gcc/
* config/aarch64/aarch64.md (predicated): Copy attribute from "arm.md".
* config/arm/arm.md (predicated): Added description.
---
  gcc/config/aarch64/aarch64.md | 4 
  gcc/config/arm/arm.md | 3 +++
  2 files changed, 7 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1586256..d46f837 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -195,6 +195,10 @@
  ;; 1 :=: yes
  (define_attr "far_branch" "" (const_int 0))
  
+;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 has

+;; no predicated insns.
+(define_attr "predicated" "yes,no" (const_string "no"))
+
  ;; ---
  ;; Pipeline descriptions and scheduling
  ;; ---
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 73c3088..6bda491 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -105,6 +105,9 @@
  (define_attr "fpu" "none,vfp"
(const (symbol_ref "arm_fpu_attr")))
  
+; Predicated means that the insn form is conditionally executed based on a

+; predicate.  We default to 'no' because no Thumb patterns match this rule
+; and not all ARM insns do.

s/is conditionally executed/can be conditionally executed/ in the first
sentence. Otherwise, this looks OK to me but I can't approve the ARM part,
so you'll need to wait for a review from someone who can.


  (define_attr "predicated" "yes,no" (const_string "no"))
  
  ; LENGTH of an instruction (in bytes)

--
2.1.0.243.g30d45f7


Actually, that would then be the existing description for the 
"predicable" attribute.  I think that this proposed description is 
correct for the "predicated" attribute.


Thank you,

--
Evandro Menezes



Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-20 Thread Jason Merrill

On 11/19/2015 05:16 PM, Jason Merrill wrote:

On 11/19/2015 02:44 PM, Martin Sebor wrote:

On 11/18/2015 09:26 PM, Jason Merrill wrote:

The rs6000 target was hitting a bootstrap failure due to
-Werror=type-limits.  Since warn_tautological_cmp and other warnings
avoid warning if one of the operands comes from a macro, I thought it
would make sense to do that here as well.


The also disables the warning for functions that are shadowed by
macros such as C atomic_load et al. For example, in the program
below. Is that an acceptable compromise or is there a way to avoid
this?


I think it's an acceptable compromise, but see below.


At the same time, the change doesn't suppress the warning in other
cases where I would have expected it to suppress it based on your
description. For instance here:

   unsigned short bar (unsigned short x)
   {
   #define X x

 if (x > 0x)


Yes, this is missed because the front end doesn't remember the location
of the use of x; that's one of many location tracking issues.  David
Malcolm is working on this stuff.


I noticed there is code elsewhere in c-common.c that avoids
issuing the same warning for system headers (that's the code
that responsible for issuing the warning for the second test
case above).


Hmm, it looks like using expansion_point_if_in_system_header might avoid
the first issue you mention.


Thus.


commit fb71bf4de520cc4bd11eefb57e50748b4679996f
Author: Jason Merrill 
Date:   Thu Nov 19 15:21:47 2015 -0500

	* c-common.c (shorten_compare): But look through macros from
	system headers.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 068a0bc..fe0a235 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4651,8 +4651,10 @@ shorten_compare (location_t loc, tree *op0_ptr, tree *op1_ptr,
 	}
 
   if (TREE_CODE (primop0) != INTEGER_CST
-	  /* Don't warn if it's from a macro.  */
-	  && !from_macro_expansion_at (EXPR_LOCATION (primop0)))
+	  /* Don't warn if it's from a (non-system) macro.  */
+	  && !(from_macro_expansion_at
+	   (expansion_point_location_if_in_system_header
+		(EXPR_LOCATION (primop0)
 	{
 	  if (val == truthvalue_false_node)
 	warning_at (loc, OPT_Wtype_limits,
diff --git a/gcc/testsuite/gcc.dg/Wtype-limits2.c b/gcc/testsuite/gcc.dg/Wtype-limits2.c
new file mode 100644
index 000..92151aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wtype-limits2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Wtype-limits" } */
+/* { dg-require-effective-target sync_char_short } */
+
+#include 
+
+unsigned foo (unsigned char *x)
+{
+  if (atomic_load (x) > 1000) /* { dg-warning "comparison is always false due to limited range of data type" } */
+return 0;
+  return 1;
+}


Re: [PATCH][ARM] Do not expand movmisalign pattern if not in 32-bit mode

2015-11-20 Thread Ramana Radhakrishnan
On 11/11/15 16:10, Kyrill Tkachov wrote:
> Hi all,
> 
> The attached testcase ICEs when compiled with -march=armv6k -mthumb -Os or 
> any march
> for which -mthumb gives Thumb1:
>  error: unrecognizable insn:
>  }
>  ^
> (insn 13 12 14 5 (set (reg:SI 116 [ x ])
> (unspec:SI [
> (mem:SI (reg/v/f:SI 112 [ s ]) [0 MEM[(unsigned char 
> *)s_1(D)]+0 S4 A8])
> ] UNSPEC_UNALIGNED_LOAD)) besttry.c:9 -1
>  (nil))
> 
> The problem is that the expands a movmisalign pattern but the resulting 
> unaligned loads don't
> match any define_insn because they are gated on unaligned_access && 
> TARGET_32BIT.
> The unaligned_access expander is gated only on unaligned_access.
> 
> This small patch fixes the issue by turning off unaligned_access if 
> TARGET_32BIT is not true.
> We can then remove TARGET_32BIT from the unaligned load/store patterns 
> conditions as a cleanup.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> 
> Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> 2015-11-11  Kyrylo Tkachov  
> 
> * config/arm/arm.c (arm_option_override): Require TARGET_32BIT
> for unaligned_access.
> * config/arm/arm.md (unaligned_loadsi): Remove redundant TARGET_32BIT
> from matching condition.
> (unaligned_loadhis): Likewise.
> (unaligned_loadhiu): Likewise.
> (unaligned_storesi): Likewise.
> (unaligned_storehi): Likewise.
> 
> 2015-11-11  Kyrylo Tkachov  
> 
> * gcc.target/arm/armv6-unaligned-load-ice.c: New test.


This means we don't have unaligned access for some cores in Thumb1 for armv6. 
I'd rather not have the ICE instead of trying to find testing coverage on such 
cores now.

OK.


regards
Ramana


Re: [PATCH][ARM] PR 68149 Fix ICE in unaligned_loaddi split

2015-11-20 Thread Ramana Radhakrishnan
On 10/11/15 17:32, Kyrill Tkachov wrote:
> Hi all,
> 
> This ICE in this PR occurs when we're trying to split unaligned_loaddi into 
> two SImode unaligned loads.
> The problem is in the addressing mode.  When reload was picking the 
> addressing mode we accepted an offset of
> -256 because the mode in the pattern is advertised as DImode and that was 
> accepted by the legitimate address
> hooks because they thought it was a NEON load (DImode is in 
> VALID_NEON_DREG_MODE). However, the splitter wants
> to generate two normal SImode unaligned loads using that address, for which 
> -256 is not valid, so we ICE
> in gen_lowpart.
> 
> The only way unaligned_loaddi could be generated was through the 
> gen_movmem_ldrd_strd expansion that implements
> a memmove using LDRD and STRD sequences. If the memmove source is not aligned 
> we can't use LDRDs so the code
> generates unaligned_loaddi patterns and expects them to be split into two 
> normal loads after reload. Similarly
> for unaligned store destinations.
> 
> This patch just explicitly generates the two unaligned SImode loads or stores 
> when appropriate inside
> gen_movmem_ldrd_strd.  This makes the unaligned_loaddi and unaligned_storedi 
> patterns unused, so we can remove them.
> 
> This patch fixes the ICe in gcc.target/aarch64/advsimd-intrinsics/vldX.c seen 
> with
> -mthumb -mcpu=cortex-a15 -mfpu=neon-vfpv4 -mfloat-abi=hard -mfp16-format=ieee
> so no new testcase is added.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> 
> Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> 2015-11-10  Kyrylo Tkachov  
> 
> PR target/68149
> * config/arm/arm.md (unaligned_loaddi): Delete.
> (unaligned_storedi): Likewise.
> * config/arm/arm.c (gen_movmem_ldrd_strd): Don't generate
> unaligned DImode memory ops.  Instead perform two back-to-back
> unalgined SImode ops.


s/unalgined/unaligned.


Ok.


Ramana


Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when copmputing VF

2015-11-20 Thread Ilya Enkovich
On 20 Nov 14:31, Ilya Enkovich wrote:
> 2015-11-20 14:28 GMT+03:00 Richard Biener :
> > On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich  
> > wrote:
> >> 2015-11-18 16:44 GMT+03:00 Richard Biener :
> >>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich  
> >>> wrote:
>  Hi,
> 
>  When we compute vectypes we skip non-relevant phi nodes.  But we process 
>  non-relevant alive statements and thus may need vectype of non-relevant 
>  live phi node to compute mask vectype.  This patch enables vectype 
>  computation for live phi nodes.  Botostrapped and regtested on 
>  x86_64-unknown-linux-gnu.  OK for trunk?
> >>>
> >>> Hmm.  What breaks if you instead skip all !relevant stmts and not
> >>> compute vectype for life but not relevant ones?  We won't ever
> >>> "vectorize" !relevant ones, that is, we don't need their vector type.
> >>
> >> I tried it and got regression in SLP.  It expected non-null vectype
> >> for non-releveant but live statement. Regression was in
> >> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90
> >
> > Because somebody put a vector type check before
> >
> >   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > return false;
> >
> > @@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
> >tree mask_type;
> >tree mask;
> >
> > +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > +return false;
> > +
> >if (!VECTOR_BOOLEAN_TYPE_P (vectype))
> >  return false;
> >
> > @@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
> >  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
> >
> >gcc_assert (ncopies >= 1);
> > -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > -return false;
> >
> >if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
> >&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
> >
> > fixes this particular fallout for me.
> 
> I'll try it.

With this fix it works fine, thanks!  Bootstrapped and regtested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Ilya
--
gcc/

2015-11-20  Ilya Enkovich  
Richard Biener  

* tree-vect-loop.c (vect_determine_vectorization_factor): Don't
compute vectype for non-relevant mask producers.
* gcc/tree-vect-stmts.c (vectorizable_comparison): Check stmt
relevance earlier.

gcc/testsuite/

2015-11-20  Ilya Enkovich  

* gcc.dg/pr68327.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr68327.c b/gcc/testsuite/gcc.dg/pr68327.c
new file mode 100644
index 000..c3e6a94
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68327.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, d;
+char b, c;
+
+void
+fn1 ()
+{
+  int i = 0;
+  for (; i < 1; i++)
+d = 1;
+  for (; b; b++)
+a = 1 && (d & b);
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 80937ec..592372d 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -439,7 +439,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 compute a factor.  */
  if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
{
- mask_producers.safe_push (stmt_info);
+ if (STMT_VINFO_RELEVANT_P (stmt_info))
+   mask_producers.safe_push (stmt_info);
  bool_result = true;
 
  if (gimple_code (stmt) == GIMPLE_ASSIGN
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 0f64aaf..3723b26 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7546,6 +7546,9 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
   tree mask_type;
   tree mask;
 
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+return false;
+
   if (!VECTOR_BOOLEAN_TYPE_P (vectype))
 return false;
 
@@ -7558,9 +7561,6 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
 ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
 
   gcc_assert (ncopies >= 1);
-  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
-return false;
-
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
   && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
   && reduc_def))


[Patch, fortran] PRs 68237 & 66762 - submodule problems

2015-11-20 Thread Paul Richard Thomas
Dear All,

I have committed as 'obvious'  revision 230661 to fix 2/3 submodule
problems. In the case of the third, PR68243, I believe gfortran is
behaving correctly and I am awaiting confirmation from the reporter.

Thanks to Dominique for regtesting the part of the patch that fixes PR66762.

I have bootstrapped and regtested both part on FC21/x86_64.

Cheers

Paul

PS I have just noticed that I attributed the wrong part of the patch
to Steve. I will correct the ChangeLog in just a moment.


2015-11-20  Paul Thomas  

PR fortran/68237
* decl.c (gfc_match_submod_proc): Test the interface symbol
before accessing its attributes.

2015-11-20  Steven G. Kargl  

PR fortran/66762
(gfc_get_symbol_decl): Test for attr.used_in_submodule as well
as attr.use_assoc (twice).
(gfc_create_module_variable): Ditto.

2015-11-20  Paul Thomas  

PR fortran/68237
* gfortran.dg/submodule_12.f90: New test

PR fortran/66762
* gfortran.dg/submodule_6.f90: Add compile option -flto.


Re: [Committed] S/390: Add bswaphi2 pattern

2015-11-20 Thread Andreas Krebbel
On 11/20/2015 01:23 PM, Richard Henderson wrote:
> On 11/20/2015 12:52 PM, Andreas Krebbel wrote:
>> +(define_insn "bswaphi2"
>> +  [(set (match_operand:HI 0   "register_operand" "=d")
>> +(bswap:HI (match_operand:HI 1 "memory_operand"   "RT")))]
>> +  "TARGET_CPU_ZARCH"
>> +  "lrvh\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "op_type" "RXY")
>> +   (set_attr "z10prop" "z10_super")])
> 
> Surely it's better to arrange so that you can use STRVH as well.
> And providing a fallback for the reg-reg case (e.g. LRVR+SRL).
> 
> Although I suppose I don't see support for STRV in bswap32/64 either...

Right, I totally forgot about the stores. I'll have a look.

We even have a mem-mem variant (mvcin). But I found it rather difficult to use 
since the pattern
would have to make sure that source and destination do not overlap.

-Andreas-



Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-20 Thread Kyrill Tkachov


On 20/11/15 12:27, James Greenhalgh wrote:

On Thu, Nov 12, 2015 at 11:32:36AM -0600, Evandro Menezes wrote:

On 11/12/2015 09:39 AM, Evandro Menezes wrote:
2015-11-12  Evandro Menezes 

[AArch64] Add attribute for compatibility with ARM pipeline models

gcc/

* config/aarch64/aarch64.md (predicated): Copy attribute from
"arm.md".
* config/arm/arm.md (predicated): Added description.


The arm part is ok too. It's just a comment.
In the ChangeLog entry for arm.md I'd say "Add description."

Thanks,
Kyrill


Please, commit if it's alright.

The AArch64 part of this is OK.


 From 3fa6a2bca8f3d2992b4607cff0afcc2d9caa96f4 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 9 Nov 2015 17:11:16 -0600
Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with ARM
  pipeline models

gcc/
* config/aarch64/aarch64.md (predicated): Copy attribute from "arm.md".
* config/arm/arm.md (predicated): Added description.
---
  gcc/config/aarch64/aarch64.md | 4 
  gcc/config/arm/arm.md | 3 +++
  2 files changed, 7 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1586256..d46f837 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -195,6 +195,10 @@
  ;; 1 :=: yes
  (define_attr "far_branch" "" (const_int 0))
  
+;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 has

+;; no predicated insns.
+(define_attr "predicated" "yes,no" (const_string "no"))
+
  ;; ---
  ;; Pipeline descriptions and scheduling
  ;; ---
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 73c3088..6bda491 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -105,6 +105,9 @@
  (define_attr "fpu" "none,vfp"
(const (symbol_ref "arm_fpu_attr")))
  
+; Predicated means that the insn form is conditionally executed based on a

+; predicate.  We default to 'no' because no Thumb patterns match this rule
+; and not all ARM insns do.

s/is conditionally executed/can be conditionally executed/ in the first
sentence. Otherwise, this looks OK to me but I can't approve the ARM part,
so you'll need to wait for a review from someone who can.

Thanks,
James


  (define_attr "predicated" "yes,no" (const_string "no"))
  
  ; LENGTH of an instruction (in bytes)

--
2.1.0.243.g30d45f7





[ptx] overrride anchor hook

2015-11-20 Thread Nathan Sidwell
Jim discovered that he needed to override the anchoring hook when using a PPC 
host-side compiler, but didn't figure out why this was needed.  Digging into it, 
I discovered that flag_section_anchors is cleared in toplev.c by the command 
line option machinery, if there are no anchor target hooks.  However, that's 
done in the host compiler and the LTO machinery simply copies the value over 
into the LTO compiler.   Usually that's fine, as the LTO compiler's for the same 
target architecture.  Except when it's an offload compiler.  The flags are not 
resanitized (perhaps that should be done?)


Anyway, that led to the PTX accelerator compiler trying to do section anchory 
things.  Fixed by overriding TARGET_USE_ANCHORS_FOR_SYMBOL_P to say 'no'.


I guess at the next instance of an offload compiler seeing a 'surprising' 
combination of optimization flags, we'll need a resanitize hook?  Jakub?


nathan
2015-11-20  Nathan Sidwell  
	James Norris  

	* config/nvptx/nvptx.c (nvptx_use_anchors_for_symbol_p): New.
	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Override.

Index: config/nvptx/nvptx.c
===
--- config/nvptx/nvptx.c	(revision 230657)
+++ config/nvptx/nvptx.c	(working copy)
@@ -3895,6 +3895,19 @@ nvptx_cannot_copy_insn_p (rtx_insn *insn
   return false;
 }
 }
+
+/* Section anchors do not work.  Initialization for flag_section_anchor
+   probes the existence of the anchoring target hooks and prevents
+   anchoring if they don't exist.  However, we may be being used with
+   a host-side compiler that does support anchoring, and hence see
+   the anchor flag set (as it's not recalculated).  So provide an
+   implementation denying anchoring.  */
+
+static bool
+nvptx_use_anchors_for_symbol_p (const_rtx ARG_UNUSED (a))
+{
+  return false;
+}
 
 /* Record a symbol for mkoffload to enter into the mapping table.  */
 
@@ -4914,6 +4927,9 @@ nvptx_goacc_reduction (gcall *call)
 #undef TARGET_CANNOT_COPY_INSN_P
 #define TARGET_CANNOT_COPY_INSN_P nvptx_cannot_copy_insn_p
 
+#undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
+#define TARGET_USE_ANCHORS_FOR_SYMBOL_P nvptx_use_anchors_for_symbol_p
+
 #undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS nvptx_init_builtins
 #undef TARGET_EXPAND_BUILTIN


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-20 Thread Ilya Enkovich
On 20 Nov 14:54, Richard Biener wrote:
> On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich  wrote:
> > On 19 Nov 18:19, Richard Biener wrote:
> >> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
> >>  wrote:
> >> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> >> It causes two problems when used with Pointer Bounds Checker.
> >> >> The first problem is that we may copy pointers as integer data
> >> >> and thus loose bounds.  The second problem is that if we inline
> >> >> memcpy, we also have to inline bounds copy and this may result
> >> >> in a huge amount of code and significant compilation time growth.
> >> >> This patch disables folding for functions we want to instrument.
> >> >>
> >> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> >> and regtested on x86_64-unknown-linux-gnu.
> >> >
> >> >Can't see anything wrong with it. Ok.
> >>
> >> But for small sizes this can have a huge impact on optimization.  Which is 
> >> why we have the code in the first place.  I'd make the check less broad, 
> >> for example inlining copies of size less than a pointer shouldn't be 
> >> affected.
> >
> > Right.  We also may inline in case we know no pointers are copied.  Below 
> > is a version with extended condition and a couple more tests.  Bootstrapped 
> > and regtested on x86_64-unknown-linux-gnu.  Does it OK for trunk and 
> > gcc-5-branch?
> >
> >>
> >> Richard.
> >>
> >> >
> >> >Bernd
> >>
> >>
> >
> > Thanks,
> > Ilya
> > --
> > gcc/
> >
> > 2015-11-20  Ilya Enkovich  
> >
> > * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
> > fold call if we are going to instrument it and it may
> > copy pointers.
> >
> > gcc/testsuite/
> >
> > 2015-11-20  Ilya Enkovich  
> >
> > * gcc.target/i386/mpx/pr68337-1.c: New test.
> > * gcc.target/i386/mpx/pr68337-2.c: New test.
> > * gcc.target/i386/mpx/pr68337-3.c: New test.
> >
> >
> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> > index 1ab20d1..dd9f80b 100644
> > --- a/gcc/gimple-fold.c
> > +++ b/gcc/gimple-fold.c
> > @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gomp-constants.h"
> >  #include "optabs-query.h"
> >  #include "omp-low.h"
> > +#include "tree-chkp.h"
> > +#include "ipa-chkp.h"
> >
> >
> >  /* Return true when DECL can be referenced from current unit.
> > @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator 
> > *gsi,
> >unsigned int src_align, dest_align;
> >tree off0;
> >
> > +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> > +pointers as wide integer) and also may result in huge function
> > +size because of inlined bounds copy.  Thus don't inline for
> > +functions we want to instrument in case pointers are copied.  */
> > +  if (flag_check_pointer_bounds
> > + && chkp_instrumentable_p (cfun->decl)
> > + /* Even if data may contain pointers we can inline if copy
> > +less than a pointer size.  */
> > + && (!tree_fits_uhwi_p (len)
> > + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
> 
> || tree_to_uhwi (len) >= POINTER_SIZE_UNITS
> 
> > + /* Check data type for pointers.  */
> > + && (!TREE_TYPE (src)
> > + || !TREE_TYPE (TREE_TYPE (src))
> > + || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
> > + || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)
> 
> I don't think you can in any way rely on the pointer type of the src argument
> as all pointer conversions are useless and memcpy and friends take void *
> anyway.

This check is looking for cases when we have type information indicating
no pointers are copied.  In case of 'void *' we have to assume pointers
are copied and inlining is undesired.  Test pr68337-2.c checks pointer
type allows to enable inlining.  Looks like this check misses
|| !COMPLETE_TYPE_P(TREE_TYPE (TREE_TYPE (src)))?

> 
> Note that you also disable memmove to memcpy simplification with this
> early check.

Doesn't matter for MPX which uses the same implementation for both cases.

> 
> Where is pointer transfer handled for MPX?  I suppose it's not done
> transparently
> for all memory move instructions but explicitely by instrumented block copy
> routines in libmpx?  In which case how does that identify pointers vs.
> non-pointers?

It is handled by instrumentation pass.  Compiler checks type of stored data to
find pointer stores.  Each pointer store is instrumented with bndstx call.

MPX versions of memcpy, memmove etc. don't make any assumptions about
type of copied data and just copy whole chunk of bounds metadata corresponding
to copied block.

Thanks,
Ilya

> 
> Richard.
> 


[Patch] sync top level configure with binutils-gdb

2015-11-20 Thread Tristan Gingold
This patch was pushed on binutils-gdb repo, so I also commit it on gcc.

Tristan.


2015-11-20  Tristan Gingold  

Sync with binutils-gdb:
2015-11-20  Tristan Gingold  

* configure.ac: Add aarch64-*-darwin* and arm-*-darwin*.
* configure: Regenerate.

Index: configure
===
--- configure   (revision 230657)
+++ configure   (working copy)
@@ -3686,6 +3686,14 @@
 case "${target}" in
   *-*-chorusos)
 ;;
+  aarch64-*-darwin*)
+noconfigdirs="$noconfigdirs ld gas gdb gprof"
+noconfigdirs="$noconfigdirs sim target-rda"
+;;
+  arm-*-darwin*)
+noconfigdirs="$noconfigdirs ld gas gdb gprof"
+noconfigdirs="$noconfigdirs sim target-rda"
+;;
   powerpc-*-darwin*)
 noconfigdirs="$noconfigdirs ld gas gdb gprof"
 noconfigdirs="$noconfigdirs sim target-rda"
Index: configure.ac
===
--- configure.ac(revision 230657)
+++ configure.ac(working copy)
@@ -1023,6 +1023,14 @@
 case "${target}" in
   *-*-chorusos)
 ;;
+  aarch64-*-darwin*)
+noconfigdirs="$noconfigdirs ld gas gdb gprof"
+noconfigdirs="$noconfigdirs sim target-rda"
+;;
+  arm-*-darwin*)
+noconfigdirs="$noconfigdirs ld gas gdb gprof"
+noconfigdirs="$noconfigdirs sim target-rda"
+;;
   powerpc-*-darwin*)
 noconfigdirs="$noconfigdirs ld gas gdb gprof"
 noconfigdirs="$noconfigdirs sim target-rda"



Re: [PATCH] PR tree-optimization/68413 : Only check for integer cond reduction on analysis stage

2015-11-20 Thread Alan Hayward


On 20/11/2015 13:47, "Richard Biener"  wrote:

>On Fri, Nov 20, 2015 at 1:33 PM, Alan Hayward 
>wrote:
>>
>>
>>On 20/11/2015 11:00, "Richard Biener"  wrote:
>>
>>>On Fri, Nov 20, 2015 at 10:24 AM, Alan Hayward 
>>>wrote:
When vectorising a integer induction condition reduction,
is_nonwrapping_integer_induction ends up with different values for base
during the analysis and build phases. In the first it is an
INTEGER_CST,
in the second the loop has been vectorised out and the base is now a
variable.

This results in the analysis and build stage detecting the
STMT_VINFO_VEC_REDUCTION_TYPE as different types.

The easiest way to fix this is to only check for integer induction
conditions on the analysis stage.
>>>
>>>I don't like this.  For the evolution part we have added
>>>STMT_VINFO_LOOP_PHI_EVOLUTION_PART.  If you now need
>>>the original initial value as well then just save it.
>>>
>>>Or if you really want to go with the hack then please do not call
>>>is_nonwrapping_integer_induction with vec_stmt != NULL but
>>>initialize cond_expr_is_nonwrapping_integer_induction from
>>>STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
>>>
>>>The hack also lacks a comment.
>>>
>>
>>Ok. I've gone for a combination of both:
>>
>>I now cache the base in STMT_VINFO_LOOP_PHI_EVOLUTION_BASE.
>>
>>I've removed the vec_stmt != NULL checks.
>>
>>I've moved the call to is_nonwrapping_integer_induction until after the
>>vect_is_simple_reduction check. I never liked that I had
>>is_nonwrapping_integer_induction early in the function, and think this
>>looks better.
>
>It looks better but the comment for loop_phi_evolution_base is wrong.
>The value is _not_ "correct" after prologue peeling (unless you
>update it there to a non-constant expr).  It is conservatively "correct"
>for is_nonwrapping_integer_induction though.  Which is why I'd
>probably rename it to STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED
>to reflect this (and similar to STMT_VINFO_NITERS_UNCHANGED).
>
>Ok with that change.

Updated as requested and submitted.

Alan.



analysisonlycondcheck3.patch
Description: Binary data


Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-20 Thread Bernd Schmidt

On 11/19/2015 12:49 AM, Jeff Law wrote:

On 11/18/2015 12:16 PM, Bernd Schmidt wrote:

I don't think so, actually. One safe option would be to rip it out and
just stop transforming this case, but let's start by looking at the code
just a bit further down, calling noce_can_store_speculate. This was
added later than the code we're discussing, and it tries to verify that
the same memory location will unconditionally be written to at a point
later than the one we're trying to convert

And if we dig into that thread, Ian suggests this isn't terribly
important anyway.

However, if we go even further upthread, we find an assertion from
Michael that this is critical for 456.hmmer and references BZ 27313.


Cc'd in case he has additional input.


https://gcc.gnu.org/ml/gcc-patches/2007-08/msg01978.html

Sadly, no testcase was included.


BZ27313 is marked as fixed by the introduction of the tree cselim pass, 
thus the problem won't even be seen at RTL level.
I'm undecided on whether cs-elim is safe wrt the store speculation vs 
locks concerns raised in the thread discussing Ian's 
noce_can_store_speculate_p, but that's not something we have to consider 
to solve the problem at hand.



So if it weren't for the assertion that it's critical for hmmr, I'd be
convinced that just ripping out was the right thing to do.

Can you look at 27313 and hmmr and see if there's an impact.  Just maybe
the critical stuff for those is handled by the tree if converter and we
can just rip out the clearly incorrect RTL bits without regressing
anything performance-wise.  If there is an impact, then I think we have
to look at either improving the tree bits (so we can remove the rtl
bits) or we have to do real dataflow analysis in the rtl bits.


So I made this change:

   if (!set_b && MEM_P (orig_x))
-{
-  /* Disallow the "if (...) x = a;" form (implicit "else x = x;")
-for optimizations if writing to x may trap or fault,
-i.e. it's a memory other than a static var or a stack slot,
-is misaligned on strict aligned machines or is read-only.  If
-x is a read-only memory, then the program is valid only if we
-avoid the store into it.  If there are stores on both the
-THEN and ELSE arms, then we can go ahead with the conversion;
-either the program is broken, or the condition is always
-false such that the other memory is selected.  */
-  if (noce_mem_write_may_trap_or_fault_p (orig_x))
-   return FALSE;
-
-  /* Avoid store speculation: given "if (...) x = a" where x is a
-MEM, we only want to do the store if x is always set
-somewhere in the function.  This avoids cases like
-  if (pthread_mutex_trylock(mutex))
-++global_variable;
-where we only want global_variable to be changed if the mutex
-is held.  FIXME: This should ideally be expressed directly in
-RTL somehow.  */
-  if (!noce_can_store_speculate_p (test_bb, orig_x))
-   return FALSE;
-}
+return FALSE;

As far as I can tell hmmer and the 27313 testcase are unaffected at -O2 
(if anything, hmmer was very slightly faster afterwards). The run wasn't 
super-scientific, but I wouldn't have expected anything else given the 
existence of cs-elim.


Ok to do the above, removing all the bits made unnecessary (including 
memory_must_be_modified_in_insn_p in alias.c)?



Bernd


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-20 Thread Richard Biener
On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich  wrote:
> On 19 Nov 18:19, Richard Biener wrote:
>> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
>>  wrote:
>> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
>> >> Currently we fold all memcpy/memmove calls with a known data size.
>> >> It causes two problems when used with Pointer Bounds Checker.
>> >> The first problem is that we may copy pointers as integer data
>> >> and thus loose bounds.  The second problem is that if we inline
>> >> memcpy, we also have to inline bounds copy and this may result
>> >> in a huge amount of code and significant compilation time growth.
>> >> This patch disables folding for functions we want to instrument.
>> >>
>> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
>> >> and regtested on x86_64-unknown-linux-gnu.
>> >
>> >Can't see anything wrong with it. Ok.
>>
>> But for small sizes this can have a huge impact on optimization.  Which is 
>> why we have the code in the first place.  I'd make the check less broad, for 
>> example inlining copies of size less than a pointer shouldn't be affected.
>
> Right.  We also may inline in case we know no pointers are copied.  Below is 
> a version with extended condition and a couple more tests.  Bootstrapped and 
> regtested on x86_64-unknown-linux-gnu.  Does it OK for trunk and gcc-5-branch?
>
>>
>> Richard.
>>
>> >
>> >Bernd
>>
>>
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-11-20  Ilya Enkovich  
>
> * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
> fold call if we are going to instrument it and it may
> copy pointers.
>
> gcc/testsuite/
>
> 2015-11-20  Ilya Enkovich  
>
> * gcc.target/i386/mpx/pr68337-1.c: New test.
> * gcc.target/i386/mpx/pr68337-2.c: New test.
> * gcc.target/i386/mpx/pr68337-3.c: New test.
>
>
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index 1ab20d1..dd9f80b 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gomp-constants.h"
>  #include "optabs-query.h"
>  #include "omp-low.h"
> +#include "tree-chkp.h"
> +#include "ipa-chkp.h"
>
>
>  /* Return true when DECL can be referenced from current unit.
> @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
>unsigned int src_align, dest_align;
>tree off0;
>
> +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> +pointers as wide integer) and also may result in huge function
> +size because of inlined bounds copy.  Thus don't inline for
> +functions we want to instrument in case pointers are copied.  */
> +  if (flag_check_pointer_bounds
> + && chkp_instrumentable_p (cfun->decl)
> + /* Even if data may contain pointers we can inline if copy
> +less than a pointer size.  */
> + && (!tree_fits_uhwi_p (len)
> + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)

|| tree_to_uhwi (len) >= POINTER_SIZE_UNITS

> + /* Check data type for pointers.  */
> + && (!TREE_TYPE (src)
> + || !TREE_TYPE (TREE_TYPE (src))
> + || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
> + || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)

I don't think you can in any way rely on the pointer type of the src argument
as all pointer conversions are useless and memcpy and friends take void *
anyway.

Note that you also disable memmove to memcpy simplification with this
early check.

Where is pointer transfer handled for MPX?  I suppose it's not done
transparently
for all memory move instructions but explicitely by instrumented block copy
routines in libmpx?  In which case how does that identify pointers vs.
non-pointers?

Richard.

> +   return false;
> +
>/* Build accesses at offset zero with a ref-all character type.  */
>off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
>  ptr_mode, true), 0);
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c 
> b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
> new file mode 100644
> index 000..3f8d79d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +#include "mpx-check.h"
> +
> +#define N 2
> +
> +extern void abort ();
> +
> +static int
> +mpx_test (int argc, const char **argv)
> +{
> +  char ** src = (char **)malloc (sizeof (char *) * N);
> +  char ** dst = (char **)malloc (sizeof (char *) * N);
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
> +
> +  __builtin_memcpy(dst, src, sizeof (char *) * N);
> +
> +  for (i = 0; i < N; i++)
> +{
> +  char *p = dst[i];
> +  if (p != argv[0] + i
> + || __bnd_get_ptr_lbound (p) != p
> + || __bnd_

Re: [PATCH] PR tree-optimization/68413 : Only check for integer cond reduction on analysis stage

2015-11-20 Thread Richard Biener
On Fri, Nov 20, 2015 at 1:33 PM, Alan Hayward  wrote:
>
>
> On 20/11/2015 11:00, "Richard Biener"  wrote:
>
>>On Fri, Nov 20, 2015 at 10:24 AM, Alan Hayward 
>>wrote:
>>> When vectorising a integer induction condition reduction,
>>> is_nonwrapping_integer_induction ends up with different values for base
>>> during the analysis and build phases. In the first it is an INTEGER_CST,
>>> in the second the loop has been vectorised out and the base is now a
>>> variable.
>>>
>>> This results in the analysis and build stage detecting the
>>> STMT_VINFO_VEC_REDUCTION_TYPE as different types.
>>>
>>> The easiest way to fix this is to only check for integer induction
>>> conditions on the analysis stage.
>>
>>I don't like this.  For the evolution part we have added
>>STMT_VINFO_LOOP_PHI_EVOLUTION_PART.  If you now need
>>the original initial value as well then just save it.
>>
>>Or if you really want to go with the hack then please do not call
>>is_nonwrapping_integer_induction with vec_stmt != NULL but
>>initialize cond_expr_is_nonwrapping_integer_induction from
>>STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
>>
>>The hack also lacks a comment.
>>
>
> Ok. I've gone for a combination of both:
>
> I now cache the base in STMT_VINFO_LOOP_PHI_EVOLUTION_BASE.
>
> I've removed the vec_stmt != NULL checks.
>
> I've moved the call to is_nonwrapping_integer_induction until after the
> vect_is_simple_reduction check. I never liked that I had
> is_nonwrapping_integer_induction early in the function, and think this
> looks better.

It looks better but the comment for loop_phi_evolution_base is wrong.
The value is _not_ "correct" after prologue peeling (unless you
update it there to a non-constant expr).  It is conservatively "correct"
for is_nonwrapping_integer_induction though.  Which is why I'd
probably rename it to STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED
to reflect this (and similar to STMT_VINFO_NITERS_UNCHANGED).

Ok with that change.

Thanks,
Richard.

>
> Alan.
>


Re: [PATCH, VECTOR ABI] Add __attribute__((__simd__)) to GCC.

2015-11-20 Thread Kirill Yukhin
Hello Kyrill,
On 20 Nov 12:15, Kyrill Tkachov wrote:
> >gcc/tessuite/
> > * c-c++-common/attr-simd-3.c: Put xfail (PR68158) on dg-error.
> 
> This test fails on bare-metal targets that don't support -fcilkplus or 
> -pthread.
> Would you consider moving them to the cilkplus testing directory or adding an 
> appropriate
> effective target check?
I think so. I'll commit change in the bottom.

gcc/testsuite/
* c-c++-common/attr-simd-3.c: Require Cilk Plus in effective target.

> 
> Thanks,
> Kyrill

--
Thanks, K

diff --git a/gcc/testsuite/c-c++-common/attr-simd-3.c 
b/gcc/testsuite/c-c++-common/attr-simd-3.c
index 35dd4c0..c7533f0 100644
--- a/gcc/testsuite/c-c++-common/attr-simd-3.c
+++ b/gcc/testsuite/c-c++-common/attr-simd-3.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target cilkplus } */
 /* { dg-do compile } */
 /* { dg-options "-fcilkplus" } */
 /* { dg-prune-output "undeclared here \\(not in a function\\)|\[^\n\r\]* was 
not declared in this scope" } */


Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-20 Thread Richard Biener
On Fri, 20 Nov 2015, Tom de Vries wrote:

> On 20/11/15 11:37, Richard Biener wrote:
> >I'd rather make loop_optimizer_init do nothing
> > if requested flags are already set and no fixup is needed
> 
> > Thus sth like
> > 
> > Index: gcc/loop-init.c
> > ===
> > --- gcc/loop-init.c (revision 230649)
> > +++ gcc/loop-init.c (working copy)
> > @@ -103,7 +103,11 @@ loop_optimizer_init (unsigned flags)
> > calculate_dominance_info (CDI_DOMINATORS);
> > 
> > if (!needs_fixup)
> > -   checking_verify_loop_structure ();
> > +   {
> > + checking_verify_loop_structure ();
> > + if (loops_state_satisfies_p (flags))
> > +   goto out;
> 
> What about flags that are present in the loops state, but not requested in
> flags? Should we try to clear those flags?

No, I don't think so, that would break in-loop-pipeline LIM, dropping
loop-closed SSA for example.

I agree it's somewhat of an odd behavior but all passes should
either be placed in a sub-pipeline with an outer 
loop_optimizer_init()/finalize () call or call both themselves.

Richard.

> Thanks,
> - Tom
> 
> > +   }
> > 
> > /* Clear all flags.  */
> > if (recorded_exits)
> > @@ -122,11 +126,12 @@ loop_optimizer_init (unsigned flags)
> > /* Apply flags to loops.  */
> > apply_loop_flags (flags);
> > 
> > +  checking_verify_loop_structure ();
> > +
> > +out:
> > /* Dump loops.  */
> > flow_loops_dump (dump_file, NULL, 1);
> > 
> > -  checking_verify_loop_structure ();
> > -
> > timevar_pop (TV_LOOP_INIT);
> >   }
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-20 Thread Tom de Vries

On 20/11/15 11:37, Richard Biener wrote:

   I'd rather make loop_optimizer_init do nothing
if requested flags are already set and no fixup is needed



Thus sth like

Index: gcc/loop-init.c
===
--- gcc/loop-init.c (revision 230649)
+++ gcc/loop-init.c (working copy)
@@ -103,7 +103,11 @@ loop_optimizer_init (unsigned flags)
calculate_dominance_info (CDI_DOMINATORS);

if (!needs_fixup)
-   checking_verify_loop_structure ();
+   {
+ checking_verify_loop_structure ();
+ if (loops_state_satisfies_p (flags))
+   goto out;


What about flags that are present in the loops state, but not requested 
in flags? Should we try to clear those flags?


Thanks,
- Tom


+   }

/* Clear all flags.  */
if (recorded_exits)
@@ -122,11 +126,12 @@ loop_optimizer_init (unsigned flags)
/* Apply flags to loops.  */
apply_loop_flags (flags);

+  checking_verify_loop_structure ();
+
+out:
/* Dump loops.  */
flow_loops_dump (dump_file, NULL, 1);

-  checking_verify_loop_structure ();
-
timevar_pop (TV_LOOP_INIT);
  }




Go patch committed: add receiver type to specific type function name

2015-11-20 Thread Ian Lance Taylor
This patch to the Go frontend fixes the case where two different
methods on different types with the same method name both define a
type internally with the same name where the type requires a specific
type hash or equality function.  Before this patch those functions
would get the same, causing a compilation error.  This patch gives
them different names using the same approach as is done for the type
descriptor.  I sent out a test case to the master Go testsuite in
https://golang.org/cl/17081.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline and GCC 5 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230463)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-e3aef41ce0c5be81e2589e60d9cb0db1516e9e2d
+dfa74d975884f363c74d6a66a37b1703093fdba6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 230463)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -1769,7 +1769,16 @@ Type::specific_type_functions(Gogo* gogo
   const Named_object* in_function = name->in_function(&index);
   if (in_function != NULL)
{
- base_name += '$' + Gogo::unpack_hidden_name(in_function->name());
+ base_name.append(1, '$');
+ const Typed_identifier* rcvr =
+   in_function->func_value()->type()->receiver();
+ if (rcvr != NULL)
+   {
+ Named_type* rcvr_type = rcvr->type()->deref()->named_type();
+ base_name.append(Gogo::unpack_hidden_name(rcvr_type->name()));
+ base_name.append(1, '$');
+   }
+ base_name.append(Gogo::unpack_hidden_name(in_function->name()));
  if (index > 0)
{
  char buf[30];


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-20 Thread Ilya Enkovich
On 19 Nov 18:19, Richard Biener wrote:
> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
>  wrote:
> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> It causes two problems when used with Pointer Bounds Checker.
> >> The first problem is that we may copy pointers as integer data
> >> and thus loose bounds.  The second problem is that if we inline
> >> memcpy, we also have to inline bounds copy and this may result
> >> in a huge amount of code and significant compilation time growth.
> >> This patch disables folding for functions we want to instrument.
> >>
> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> and regtested on x86_64-unknown-linux-gnu.
> >
> >Can't see anything wrong with it. Ok.
> 
> But for small sizes this can have a huge impact on optimization.  Which is 
> why we have the code in the first place.  I'd make the check less broad, for 
> example inlining copies of size less than a pointer shouldn't be affected.

Right.  We also may inline in case we know no pointers are copied.  Below is a 
version with extended condition and a couple more tests.  Bootstrapped and 
regtested on x86_64-unknown-linux-gnu.  Does it OK for trunk and gcc-5-branch?

> 
> Richard.
> 
> >
> >Bernd
> 
> 

Thanks,
Ilya
--
gcc/

2015-11-20  Ilya Enkovich  

* gimple-fold.c (gimple_fold_builtin_memory_op): Don't
fold call if we are going to instrument it and it may
copy pointers.

gcc/testsuite/

2015-11-20  Ilya Enkovich  

* gcc.target/i386/mpx/pr68337-1.c: New test.
* gcc.target/i386/mpx/pr68337-2.c: New test.
* gcc.target/i386/mpx/pr68337-3.c: New test.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1ab20d1..dd9f80b 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "optabs-query.h"
 #include "omp-low.h"
+#include "tree-chkp.h"
+#include "ipa-chkp.h"
 
 
 /* Return true when DECL can be referenced from current unit.
@@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   unsigned int src_align, dest_align;
   tree off0;
 
+  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
+pointers as wide integer) and also may result in huge function
+size because of inlined bounds copy.  Thus don't inline for
+functions we want to instrument in case pointers are copied.  */
+  if (flag_check_pointer_bounds
+ && chkp_instrumentable_p (cfun->decl)
+ /* Even if data may contain pointers we can inline if copy
+less than a pointer size.  */
+ && (!tree_fits_uhwi_p (len)
+ || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
+ /* Check data type for pointers.  */
+ && (!TREE_TYPE (src)
+ || !TREE_TYPE (TREE_TYPE (src))
+ || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
+ || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)
+   return false;
+
   /* Build accesses at offset zero with a ref-all character type.  */
   off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
 ptr_mode, true), 0);
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
new file mode 100644
index 000..3f8d79d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include "mpx-check.h"
+
+#define N 2
+
+extern void abort ();
+
+static int
+mpx_test (int argc, const char **argv)
+{
+  char ** src = (char **)malloc (sizeof (char *) * N);
+  char ** dst = (char **)malloc (sizeof (char *) * N);
+  int i;
+
+  for (i = 0; i < N; i++)
+src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
+
+  __builtin_memcpy(dst, src, sizeof (char *) * N);
+
+  for (i = 0; i < N; i++)
+{
+  char *p = dst[i];
+  if (p != argv[0] + i
+ || __bnd_get_ptr_lbound (p) != p
+ || __bnd_get_ptr_ubound (p) != p + i)
+   abort ();
+}
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
new file mode 100644
index 000..16736b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+/* { dg-final { scan-assembler-not "memcpy" } } */
+
+void
+test1 (char *dst, char *src)
+{
+  __builtin_memcpy (dst, src, sizeof (char *) * 2);
+}
+
+void
+test2 (void *dst, void *src)
+{
+  __builtin_memcpy (dst, src, sizeof (char *) / 2);
+}
+
+struct s
+{
+  int a;
+  int b;
+};
+
+void
+test3 (struct s *dst, struct s *src)
+{
+  __builtin_memcpy (dst, src, sizeof (struct s));
+}
diff --git a/gcc/testsuite/gc

Re: GCC 5.3 Status Report (2015-11-20)

2015-11-20 Thread David Edelsohn
On Fri, Nov 20, 2015 at 7:53 AM, Richard Biener  wrote:
>
> Status
> ==
>
> We plan to do a GCC 5.3 release candidate at the end of next week
> followed by the actual release a week after that.
>
> So now is the time to look at your regression bugs in bugzilla and
> do some backporting for things already fixed on trunk.

I'm still waiting for approval of the libtool change to support AIX
TLS symbols (PR 68192).  There has been no response on Libtool patches
mailing list.

I again request permission to apply the patches to GCC trunk and 5-branch.

Thanks, David


[committed, trivial] Fix typo and trailing whitespace in dump-file strings in parloops

2015-11-20 Thread Tom de Vries

[ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ]

On 18/11/15 17:22, Bernhard Reutner-Fischer wrote:

Bonus points for fixing the dump_file to parse in:


>Parloops will fail because:
>...
>phi is n_2 = PHI 
>arg of phi to exit: value n_4(D) used outside loop
>checking if it a part of reduction pattern:

s/it a/it is/



This patch fixes a typo and trailing whitespace in dump-file strings in 
parloops.


Build for c and fortran, tested -fdump-tree-parloops testcases.

Committed to trunk as trivial.

Thanks,
- Tom
Fix typo and trailing whitespace in dump-file strings in parloops

2015-11-19  Tom de Vries  

	* tree-parloops.c (build_new_reduction): Fix trailing whitespace in
	dump-file string.
	(try_create_reduction_list): Same.  Fix typo in dump-file string.

---
 gcc/tree-parloops.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 8d7912d..aca2370 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2383,7 +2383,7 @@ build_new_reduction (reduction_info_table_type *reduction_list,
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file,
-	   "Detected reduction. reduction stmt is: \n");
+	   "Detected reduction. reduction stmt is:\n");
   print_gimple_stmt (dump_file, reduc_stmt, 0, 0);
   fprintf (dump_file, "\n");
 }
@@ -2564,7 +2564,7 @@ try_create_reduction_list (loop_p loop,
 	  print_generic_expr (dump_file, val, 0);
 	  fprintf (dump_file, " used outside loop\n");
 	  fprintf (dump_file,
-		   "  checking if it a part of reduction pattern:  \n");
+		   "  checking if it is part of reduction pattern:\n");
 	}
 	  if (reduction_list->elements () == 0)
 	{


GCC 5.3 Status Report (2015-11-20)

2015-11-20 Thread Richard Biener

Status
==

We plan to do a GCC 5.3 release candidate at the end of next week
followed by the actual release a week after that.

So now is the time to look at your regression bugs in bugzilla and
do some backporting for things already fixed on trunk.


Quality Data


Priority  #   Change from last report
---   ---
P10
P2  121+  30
P3   20-   8
P4   87+   2
P5   32+   2
---   ---
Total P1-P3 141+  22
Total   260+  24


Previous Report
===

https://gcc.gnu.org/ml/gcc/2015-07/msg00197.html



Re: [PATCH] PR tree-optimization/68413 : Only check for integer cond reduction on analysis stage

2015-11-20 Thread Alan Hayward


On 20/11/2015 11:00, "Richard Biener"  wrote:

>On Fri, Nov 20, 2015 at 10:24 AM, Alan Hayward 
>wrote:
>> When vectorising a integer induction condition reduction,
>> is_nonwrapping_integer_induction ends up with different values for base
>> during the analysis and build phases. In the first it is an INTEGER_CST,
>> in the second the loop has been vectorised out and the base is now a
>> variable.
>>
>> This results in the analysis and build stage detecting the
>> STMT_VINFO_VEC_REDUCTION_TYPE as different types.
>>
>> The easiest way to fix this is to only check for integer induction
>> conditions on the analysis stage.
>
>I don't like this.  For the evolution part we have added
>STMT_VINFO_LOOP_PHI_EVOLUTION_PART.  If you now need
>the original initial value as well then just save it.
>
>Or if you really want to go with the hack then please do not call
>is_nonwrapping_integer_induction with vec_stmt != NULL but
>initialize cond_expr_is_nonwrapping_integer_induction from
>STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
>
>The hack also lacks a comment.
>

Ok. I've gone for a combination of both:

I now cache the base in STMT_VINFO_LOOP_PHI_EVOLUTION_BASE.

I've removed the vec_stmt != NULL checks.

I've moved the call to is_nonwrapping_integer_induction until after the
vect_is_simple_reduction check. I never liked that I had
is_nonwrapping_integer_induction early in the function, and think this
looks better.


Alan.



analysisonlycondcheck2.patch
Description: Binary data


Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-20 Thread James Greenhalgh
On Thu, Nov 12, 2015 at 11:32:36AM -0600, Evandro Menezes wrote:
> On 11/12/2015 09:39 AM, Evandro Menezes wrote:
>2015-11-12  Evandro Menezes 
> 
>[AArch64] Add attribute for compatibility with ARM pipeline models
> 
>gcc/
> 
>* config/aarch64/aarch64.md (predicated): Copy attribute from
>"arm.md".
>* config/arm/arm.md (predicated): Added description.
> 
> Please, commit if it's alright.

The AArch64 part of this is OK.

> From 3fa6a2bca8f3d2992b4607cff0afcc2d9caa96f4 Mon Sep 17 00:00:00 2001
> From: Evandro Menezes 
> Date: Mon, 9 Nov 2015 17:11:16 -0600
> Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with ARM
>  pipeline models
> 
> gcc/
>   * config/aarch64/aarch64.md (predicated): Copy attribute from "arm.md".
>   * config/arm/arm.md (predicated): Added description.
> ---
>  gcc/config/aarch64/aarch64.md | 4 
>  gcc/config/arm/arm.md | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 1586256..d46f837 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -195,6 +195,10 @@
>  ;; 1 :=: yes
>  (define_attr "far_branch" "" (const_int 0))
>  
> +;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 
> has
> +;; no predicated insns.
> +(define_attr "predicated" "yes,no" (const_string "no"))
> +
>  ;; ---
>  ;; Pipeline descriptions and scheduling
>  ;; ---
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 73c3088..6bda491 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -105,6 +105,9 @@
>  (define_attr "fpu" "none,vfp"
>(const (symbol_ref "arm_fpu_attr")))
>  
> +; Predicated means that the insn form is conditionally executed based on a
> +; predicate.  We default to 'no' because no Thumb patterns match this rule
> +; and not all ARM insns do.

s/is conditionally executed/can be conditionally executed/ in the first
sentence. Otherwise, this looks OK to me but I can't approve the ARM part,
so you'll need to wait for a review from someone who can.

Thanks,
James

>  (define_attr "predicated" "yes,no" (const_string "no"))
>  
>  ; LENGTH of an instruction (in bytes)
> -- 
> 2.1.0.243.g30d45f7
> 



Re: [RFC] [Patch] PR67326 - relax trap assumption by looking at similar DRS

2015-11-20 Thread Jakub Jelinek
On Fri, Nov 20, 2015 at 12:02:07PM +, Kumar, Venkataramanan wrote:
> Also. 
> (1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-common. 
> Sometimes vectorization flags also triggers if conversion.
> (2) Also hashing base DRs for writes only. 

Let me comment just on formatting, will leave actual review to Richard.
> 
> gcc/ChangeLog
> 2015-11-19  Venkataramanan  

Surname missing.
>   
>   PR tree-optimization/67326
>   * tree-if-conv.c  (offset_DR_map): Define.

Extraneous space.

>   (struct ifc_dr): Add new tree base_predicate field.

All ChangeLog lines should be indented by a single tab.

>   (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash offsets, 
> DR pairs

Too long line.

>   and hash base ref,  DR pairs  for write type DRs.

Extraneous spaces.

>   (ifcvt_memrefs_wont_trap):  Guard checks with 
> -ftree-loop-if-convert-stores flag. 

Too long line and extraneous spaces.

>Check for similar DR that are accessed unconditionally.
>(if_convertible_loop_p_1):  Initialize and delete offset hash maps

Extraneous space, missing full stop at the end.
> 
> gcc/testsuite/ChangeLog
> 2015-11-19  Venkataramanan  
>   * gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.

Extraneous space.

> +  if (DR_IS_WRITE (a))
>  {
> -  IFC_DR (a)->predicate = ca;
> -  *base_master_dr = a;
> +  base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);

Missing space before &exist2);

> +  offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
> +  if (!exist3)
> + *offset_master_dr = a;
> +
> +  if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
> + DR_RW_UNCONDITIONALLY (*offset_master_dr)
> + = DR_RW_UNCONDITIONALLY (*master_dr);

Wrong indentation, the = should be below the first underscore in
DR_RW_UNCONDITIONALLY.

> -  if (DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1)
> +  if ((base_master_dr
> +&& DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1))

Extraneous () pair.

> +  else if (DR_OFFSET (a))
> +{
> +  offset_dr = offset_DR_map->get (DR_OFFSET (a));
> +  if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)

Likewise.

Jakub


Re: [Committed] S/390: Add bswaphi2 pattern

2015-11-20 Thread Richard Henderson

On 11/20/2015 12:52 PM, Andreas Krebbel wrote:

+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0   "register_operand" "=d")
+   (bswap:HI (match_operand:HI 1 "memory_operand"   "RT")))]
+  "TARGET_CPU_ZARCH"
+  "lrvh\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "op_type" "RXY")
+   (set_attr "z10prop" "z10_super")])


Surely it's better to arrange so that you can use STRVH as well.
And providing a fallback for the reg-reg case (e.g. LRVR+SRL).

Although I suppose I don't see support for STRV in bswap32/64 either...


r~


Re: [PATCH, VECTOR ABI] Add __attribute__((__simd__)) to GCC.

2015-11-20 Thread Kyrill Tkachov

Hi Kirill,

On 18/11/15 14:11, Kirill Yukhin wrote:

Hello Andreas, Devid.

On 18 Nov 10:45, Andreas Schwab wrote:

Kirill Yukhin  writes:


diff --git a/gcc/testsuite/c-c++-common/attr-simd.c 
b/gcc/testsuite/c-c++-common/attr-simd.c
new file mode 100644
index 000..b4eda34
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-simd.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-optimized" } */
+
+__attribute__((__simd__))
+extern
+int simd_attr (void)
+{
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "simd_attr\[ \\t\]simdclone|vector" "optimized" 
} } */

On ia64:

FAIL: c-c++-common/attr-simd.c  -Wc++-compat   scan-tree-dump optimized "simd_attr[ 
\\t]simdclone|vector"
FAIL: c-c++-common/attr-simd.c  -Wc++-compat   scan-tree-dump optimized "simd_attr2[ 
\\t]simdclone|vector"

$ grep simd_attr attr-simd.c.194t.optimized
;; Function simd_attr (simd_attr, funcdef_no=0, decl_uid=1389, cgraph_uid=0, 
symbol_order=0)
simd_attr ()
;; Function simd_attr2 (simd_attr2, funcdef_no=1, decl_uid=1392, cgraph_uid=1, 
symbol_order=1)
simd_attr2 ()

As far as vABI is supported on x86_64/i?86 only, I am going to enable mentioned 
`scan-tree-dump' only
for these targets. This should cure both IA64 and Power.

Concerning attr-simd-3.c. It is known issue: PR68158.
And I believe this test should work everywhere as far as PR is resolved.
I'll put xfail into the test.
Which will lead to (in g++.log):
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__' attribute 
does not apply to types\
  [-Wattributes]^M
output is:
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__' attribute 
does not apply to types\
  [-Wattributes]^M

XFAIL: c-c++-common/attr-simd-3.c  -std=gnu++98 PR68158 (test for errors, line 
5)
FAIL: c-c++-common/attr-simd-3.c  -std=gnu++98 (test for excess errors)
Excess errors:
gcc/testsuite/c-c++-common/attr-simd-3.c:5:48: warning: '__simd__' attribute 
does not apply to types\
  [-Wattributes]

Patch in the bottom.

gcc/tessuite/
* c-c++-common/attr-simd-3.c: Put xfail (PR68158) on dg-error.


This test fails on bare-metal targets that don't support -fcilkplus or -pthread.
Would you consider moving them to the cilkplus testing directory or adding an 
appropriate
effective target check?

Thanks,
Kyrill


* c-c++-common/attr-simd.c: Limit scan of dump to x86_64/i?86.


Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

diff --git a/gcc/testsuite/c-c++-common/attr-simd-3.c 
b/gcc/testsuite/c-c++-common/attr-simd-3.c
index 2bbdf04..35dd4c0 100644
--- a/gcc/testsuite/c-c++-common/attr-simd-3.c
+++ b/gcc/testsuite/c-c++-common/attr-simd-3.c
@@ -2,4 +2,4 @@
  /* { dg-options "-fcilkplus" } */
  /* { dg-prune-output "undeclared here \\(not in a function\\)|\[^\n\r\]* was not 
declared in this scope" } */

-void f () __attribute__((__simd__, __vector__)); /* { dg-error "in the same 
function marked as a Cilk Plus" } */
+void f () __attribute__((__simd__, __vector__)); /* { dg-error "in the same function marked 
as a Cilk Plus" "PR68158" { xfail c++ } } */
diff --git a/gcc/testsuite/c-c++-common/attr-simd.c 
b/gcc/testsuite/c-c++-common/attr-simd.c
index 61974e3..7674588 100644
--- a/gcc/testsuite/c-c++-common/attr-simd.c
+++ b/gcc/testsuite/c-c++-common/attr-simd.c
@@ -11,7 +11,7 @@ int simd_attr (void)
return 0;
  }

-/* { dg-final { scan-tree-dump "simd_attr\[ \\t\]simdclone|vector" "optimized" 
} } */
+/* { dg-final { scan-tree-dump "simd_attr\[ \\t\]simdclone|vector" "optimized" 
{ target { i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVbN4_simd_attr:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVbM4_simd_attr:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVcN4_simd_attr:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */
@@ -29,7 +29,7 @@ int simd_attr2 (void)
return 0;
  }

-/* { dg-final { scan-tree-dump "simd_attr2\[ \\t\]simdclone|vector" 
"optimized" } } */
+/* { dg-final { scan-tree-dump "simd_attr2\[ \\t\]simdclone|vector" 
"optimized" { target { i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVbN4_simd_attr2:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVbM4_simd_attr2:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */
  /* { dg-final { scan-assembler-times "_ZGVcN4_simd_attr2:" 1 { target { 
i?86-*-* x86_64-*-* } } } } */





[RFC] [Patch] PR67326 - relax trap assumption by looking at similar DRS

2015-11-20 Thread Kumar, Venkataramanan
Hi Richard,

As per Jakub suggestion in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67326, 
the below patch fixes the regression in tree if conversion.
Basically allowing if conversion to happen for a candidate DR, if we find 
similar DR with same dimensions  and that DR will not trap.

To find similar DRs using hash table to hashing the offset and DR pairs.  
Also reusing  read/written information that was stored for reference tree.  

Also. 
(1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-common. 
Sometimes vectorization flags also triggers if conversion.
(2) Also hashing base DRs for writes only. 

gcc/ChangeLog
2015-11-19  Venkataramanan  

PR tree-optimization/67326
* tree-if-conv.c  (offset_DR_map): Define.
(struct ifc_dr): Add new tree base_predicate field.
(hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash offsets, 
DR pairs
and hash base ref,  DR pairs  for write type DRs.
(ifcvt_memrefs_wont_trap):  Guard checks with 
-ftree-loop-if-convert-stores flag. 
   Check for similar DR that are accessed unconditionally.
   (if_convertible_loop_p_1):  Initialize and delete offset hash maps

gcc/testsuite/ChangeLog
2015-11-19  Venkataramanan  
* gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.

Regstrapped on x86_64, Ok for trunk?

Regards,
Venkat.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c
new file mode 100644
index 000..5ca11cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-details -fno-common 
-ftree-loop-if-convert-stores" } */
+
+float v0[1024];
+float v1[1024];
+float v2[1024];
+float v3[1024];
+
+void condAssign1 () {
+  for (int i=0; i<1024; ++i)
+v0[i] = (v2[i]>v1[i]) ? v2[i]*v3[i] : v0[i];
+}
+
+void condAssign2 () {
+  for (int i=0; i<1024; ++i)
+v0[i] = (v2[i]>v1[i]) ? v2[i]*v1[i] : v0[i];
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 2 "ifcvt" } } */
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 01065cb..1b4d220 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -124,6 +124,9 @@ static hash_map 
*ref_DR_map;
 /* Hash table to store base reference, DR pairs.  */
 static hash_map *baseref_DR_map;
 
+/* Hash table to store DR offset, DR pairs.  */
+static hash_map *offset_DR_map;
+
 /* Structure used to predicate basic blocks.  This is attached to the
->aux field of the BBs in the loop to be if-converted.  */
 struct bb_predicate {
@@ -589,6 +592,8 @@ struct ifc_dr {
   int rw_unconditionally;
 
   tree predicate;
+
+  tree base_predicate;
 };
 
 #define IFC_DR(DR) ((struct ifc_dr *) (DR)->aux)
@@ -609,11 +614,12 @@ static void
 hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
 {
 
-  data_reference_p *master_dr, *base_master_dr;
+  data_reference_p *master_dr, *base_master_dr, *offset_master_dr;
   tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  tree offset = DR_OFFSET (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
-  bool exist1, exist2;
+  bool exist1, exist2, exist3;
 
   while (TREE_CODE (ref) == COMPONENT_REF
 || TREE_CODE (ref) == IMAGPART_EXPR
@@ -636,22 +642,34 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info 
(data_reference_p a)
   if (is_true_predicate (IFC_DR (*master_dr)->predicate))
 DR_RW_UNCONDITIONALLY (*master_dr) = 1;
 
-  base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);
-
-  if (!exist2)
+  if (DR_IS_WRITE (a))
 {
-  IFC_DR (a)->predicate = ca;
-  *base_master_dr = a;
+  base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);
+  if (!exist2)
+   {
+ IFC_DR (a)->base_predicate = ca;
+ *base_master_dr = a;
+   }
+  else
+   IFC_DR (*base_master_dr)->base_predicate
+ = fold_or_predicates
+   (EXPR_LOCATION (IFC_DR (*base_master_dr)->base_predicate),
+ca, IFC_DR (*base_master_dr)->base_predicate);
+
+  if (is_true_predicate (IFC_DR (*base_master_dr)->base_predicate))
+   DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) = 1;
 }
-  else
-IFC_DR (*base_master_dr)->predicate
-   = fold_or_predicates
-   (EXPR_LOCATION (IFC_DR (*base_master_dr)->predicate),
-ca, IFC_DR (*base_master_dr)->predicate);
 
-  if (DR_IS_WRITE (a)
-  && (is_true_predicate (IFC_DR (*base_master_dr)->predicate)))
-DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) = 1;
+  if (offset)
+{
+  offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
+  if (!exist3)
+   *offset_master_dr = a;
+
+  if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
+   DR_RW_UNCONDITIONALLY (*offset_master_dr)
+   = DR_RW_UNCONDITIONALLY (*master_dr);
+}
 }
 
 /* Return true when the memory references of STMT won't trap in the
@@ -682,7 

Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2015-11-20 Thread James Greenhalgh
On Thu, Nov 19, 2015 at 04:04:41PM -0600, Evandro Menezes wrote:
> On 11/05/2015 02:51 PM, Evandro Menezes wrote:
> >2015-11-05  Evandro Menezes 
> >
> >   gcc/
> >
> >   * config/aarch64/aarch64.c (aarch64_override_options_internal):
> >   Increase loop peeling limit.
> >
> >This patch increases the limit for the number of peeled insns.
> >With this change, I noticed no major regression in either
> >Geekbench v3 or SPEC CPU2000 while some benchmarks, typically FP
> >ones, improved significantly.
> >
> >I tested this tuning on Exynos M1 and on A57.  ThunderX seems to
> >benefit from this tuning too.  However, I'd appreciate comments
> >from other stakeholders.
> 
> Ping.

I'd like to leave this for a call from the port maintainers. I can see why
this leads to more opportunities for vectorization, but I'm concerned about
the wider impact on code size. Certainly I wouldn't expect this to be our
default at -O2 and below.

My gut feeling is that this doesn't really belong in the back-end (there are
presumably good reasons why the default for this parameter across GCC has
fluctuated from 400 to 100 to 200 over recent years), but as I say, I'd
like Marcus or Richard to make the call as to whether or not we take this
patch.

For now, I'd drop it from the series (it stands alone anyway).

Thanks,
James



[Committed] S/390: Add bswaphi2 pattern

2015-11-20 Thread Andreas Krebbel
Hi,

the 16 bit bswap instruction support was missing in the back-end so
far.  Added with the attached patch.

Bootstrapped on s390 and s390x. No regressions.

Bye,

-Andreas-

gcc/testsuite/ChangeLog:

2015-11-20  Andreas Krebbel  

* gcc.target/s390/bswap-1.c: New test.


gcc/ChangeLog:

2015-11-20  Andreas Krebbel  

* config/s390/s390.md ("bswaphi2"): New pattern.

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index ea65c74..280a772 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10439,6 +10439,8 @@
 ; Byte swap instructions
 ;
 
+; FIXME: There is also mvcin but we cannot use it since src and target
+; may overlap.
 (define_insn "bswap2"
   [(set (match_operand:GPR 0"register_operand" "=d, d")
(bswap:GPR (match_operand:GPR 1 "nonimmediate_operand" " d,RT")))]
@@ -10450,6 +10452,14 @@
(set_attr "op_type" "RRE,RXY")
(set_attr "z10prop" "z10_super")])
 
+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0   "register_operand" "=d")
+   (bswap:HI (match_operand:HI 1 "memory_operand"   "RT")))]
+  "TARGET_CPU_ZARCH"
+  "lrvh\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "op_type" "RXY")
+   (set_attr "z10prop" "z10_super")])
 
 ;
 ; Population count instruction
diff --git a/gcc/testsuite/gcc.target/s390/bswap-1.c 
b/gcc/testsuite/gcc.target/s390/bswap-1.c
new file mode 100644
index 000..e1f113a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/bswap-1.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z900 -mzarch" } */
+
+#include 
+
+uint64_t u64;
+uint32_t u32;
+uint16_t u16;
+
+uint64_t
+foo64a (uint64_t a)
+{
+  return __builtin_bswap64 (a);
+}
+/* { dg-final { scan-assembler-times "lrvgr\t%r2,%r2" 1 { target lp64 } } } */
+
+uint64_t
+foo64b ()
+{
+  return __builtin_bswap64 (u64);
+}
+/* { dg-final { scan-assembler-times "lrvg\t%r2,0\\(%r\[0-9\]*\\)" 1 { target 
lp64 } } } */
+
+uint32_t
+foo32 ()
+{
+  return __builtin_bswap32 (u32);
+}
+/* { dg-final { scan-assembler-times "lrv\t%r2,0\\(%r\[0-9\]*\\)" 1 } } */
+
+uint16_t
+foo16 ()
+{
+  return __builtin_bswap16 (u16);
+}
+/* { dg-final { scan-assembler-times "lrvh\t%r2,0\\(%r\[0-9\]*\\)" 1 } } */



Re: [PATCH][GCC][ARM] Disable neon testing for armv7-m

2015-11-20 Thread Kyrill Tkachov

Hi Andre,

On 18/11/15 09:44, Andre Vieira wrote:

On 17/11/15 10:10, James Greenhalgh wrote:

On Mon, Nov 16, 2015 at 01:15:32PM +, Andre Vieira wrote:

On 16/11/15 12:07, James Greenhalgh wrote:

On Mon, Nov 16, 2015 at 10:49:11AM +, Andre Vieira wrote:

Hi,

   This patch changes the target support mechanism to make it
recognize any ARM 'M' profile as a non-neon supporting target. The
current check only tests for armv6 architectures and earlier, and
does not account for armv7-m.

   This is correct because there is no 'M' profile that supports neon
and the current test is not sufficient to exclude armv7-m.

   Tested by running regressions for this testcase for various ARM targets.

   Is this OK to commit?

   Thanks,
   Andre Vieira

gcc/testsuite/ChangeLog:
2015-11-06  Andre Vieira 

 * gcc/testsuite/lib/target-supports.exp
   (check_effective_target_arm_neon_ok_nocache): Added check
   for M profile.



 From 2c53bb9ba3236919ecf137a4887abf26d4f7fda2 Mon Sep 17 00:00:00 2001
From: Andre Simoes Dias Vieira 
Date: Fri, 13 Nov 2015 11:16:34 +
Subject: [PATCH] Disable neon testing for armv7-m

---
  gcc/testsuite/lib/target-supports.exp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 
75d506829221e3d02d454631c4bd2acd1a8cedf2..8097a4621b088a93d58d09571cf7aa27b8d5fba6
 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2854,7 +2854,7 @@ proc check_effective_target_arm_neon_ok_nocache { } {
  int dummy;
  /* Avoid the case where a test adds -mfpu=neon, but the toolchain is
 configured for -mcpu=arm926ej-s, for example.  */
-#if __ARM_ARCH < 7
+#if __ARM_ARCH < 7 || __ARM_ARCH_PROFILE == 'M'
  #error Architecture too old for NEON.


Could you fix this #error message while you're here?

Why we can't change this test to look for the __ARM_NEON macro from ACLE:

#if __ARM_NEON < 1
   #error NEON is not enabled
#endif

Thanks,
James



There is a check for this already:
'check_effective_target_arm_neon'. I think the idea behind
arm_neon_ok is to check whether the hardware would support neon,
whereas arm_neon is to check whether neon was enabled, i.e.
-mfpu=neon was used or a mcpu was passed that has neon enabled by
default.

The comments for 'check_effective_target_arm_neon_ok_nocache'
highlight this, though maybe the comments for
check_effective_target_arm_neon could be better.

# Return 1 if this is an ARM target supporting -mfpu=neon
# -mfloat-abi=softfp or equivalent options.  Some multilibs may be
# incompatible with these options.  Also set et_arm_neon_flags to the
# best options to add.

proc check_effective_target_arm_neon_ok_nocache
...
/* Avoid the case where a test adds -mfpu=neon, but the toolchain is
configured for -mcpu=arm926ej-s, for example.  */
...


and

# Return 1 if this is a ARM target with NEON enabled.

proc check_effective_target_arm_neon


OK, got it - sorry for my mistake, I had the two procs confused.

I'd still like to see the error message fixed "Architecture too old for NEON."
is not an accurate description of the problem.

Thanks,
James



This OK?



This is ok,
I've committed for you with the slightly tweaked ChangeLog entry:
2015-11-20  Andre Vieira  

* lib/target-supports.exp
(check_effective_target_arm_neon_ok_nocache): Add check
for M profile.

as r230653.

Thanks,
Kyrill



Cheers,
Andre




Re: [PATCH] Add clang-format config to contrib folder

2015-11-20 Thread Pedro Alves
On 11/20/2015 11:33 AM, Martin Liška wrote:

> Hi Pedro.

Hi Martin.

> Fully agree with you, there's suggested patch.
> Hope I can install the patch for trunk?

I'd call it obvious.  :-)

Thanks,
Pedro Alves



Re: [PATCH] Add clang-format config to contrib folder

2015-11-20 Thread Martin Liška
On 11/19/2015 06:43 PM, Pedro Alves wrote:
> On 11/19/2015 12:54 PM, Martin Liška wrote:
>>  ContinuationIndentWidth: 2
>> -ForEachMacros: 
>> ['_FOR_EACH','_FOR_EACH_1','FOR_EACH_AGGR_INIT_EXPR_ARG','FOR_EACH_ALIAS','FOR_EACH_ALLOCNO','FOR_EACH_ALLOCNO_OBJECT','FOR_EACH_ARTIFICIAL_DEF','FOR_EACH_ARTIFICIAL_USE','FOR_EACH_BB_FN','FOR_EACH_BB_REVERSE_FN','FOR_EACH_BIT_IN_MINMAX_SET','FOR_EACH_CALL_EXPR_ARG','FOR_EACH_CLONE','FOR_EACH_CONST_CALL_EXPR_ARG','FOR_EACH_CONSTRUCTOR_ELT','FOR_EACH_CONSTRUCTOR_VALUE','FOR_EACH_COPY','FOR_EACH_DEF','FOR_EACH_DEFINED_FUNCTION','FOR_EACH_DEFINED_SYMBOL','FOR_EACH_DEFINED_VARIABLE','FOR_EACH_DEP','FOR_EACH_EDGE','FOR_EACH_EXPR','FOR_EACH_EXPR_1','FOR_EACH_FUNCTION','FOR_EACH_FUNCTION_WITH_GIMPLE_BODY','FOR_EACH_HASH_TABLE_ELEMENT','FOR_EACH_IMM_USE_FAST','FOR_EACH_IMM_USE_ON_STMT','FOR_EACH_IMM_USE_STMT','FOR_EACH_INSN','FOR_EACH_INSN_1','FOR_EACH_INSN_DEF','FOR_EACH_INSN_EQ_USE','FOR_EACH_INSN_INFO_DEF','FOR_EACH_INSN_INFO_EQ_USE','FOR_EACH_INSN_INFO_MW','FOR_EACH_INSN_INFO_USE','FOR_EACH_INSN_USE','FOR_EACH_LOCAL_DECL','FOR_EACH_LOOP','FOR_EACH_LOOP_FN','FOR!
>  _EACH_OBJE
> CT','FOR_EACH_OBJECT_CONFLICT','FOR_EACH_PHI_ARG','FOR_EACH_PHI_OR_STMT_DEF','FOR_EACH_PHI_OR_STMT_USE','FOR_EACH_PREF','FOR_EACH_SCALAR','FOR_EACH_SSA_DEF_OPERAND','FOR_EACH_SSA_TREE_OPERAND','FOR_EACH_SSA_USE_OPERAND','FOR_EACH_STATIC_INITIALIZER','FOR_EACH_SUBRTX','FOR_EACH_SUBRTX_PTR','FOR_EACH_SUBRTX_VAR','FOR_EACH_SUCC','FOR_EACH_SUCC_1','FOR_EACH_SYMBOL','FOR_EACH_VARIABLE','FOR_EACH_VEC_ELT','FOR_EACH_VEC_ELT_FROM','FOR_EACH_VEC_ELT_REVERSE','FOR_EACH_VEC_SAFE_ELT','FOR_EACH_VEC_SAFE_ELT_REVERSE','_FOR_EACH_X','_FOR_EACH_X_1','FOREACH_FUNCTION_ARGS','FOREACH_FUNCTION_ARGS_PTR']
>> +ForEachMacros: 
>> ['FOR_ALL_BB_FN','FOR_ALL_EH_REGION','FOR_ALL_EH_REGION_AT','FOR_ALL_EH_REGION_FN','FOR_ALL_INHERITED_FIELDS','FOR_ALL_PREDICATES','FOR_BB_BETWEEN','FOR_BB_INSNS','FOR_BB_INSNS_REVERSE','FOR_BB_INSNS_REVERSE_SAFE','FOR_BB_INSNS_SAFE','FOR_BODY','FOR_COND','FOR_EACH_AGGR_INIT_EXPR_ARG','FOR_EACH_ALIAS','FOR_EACH_ALLOCNO','FOR_EACH_ALLOCNO_OBJECT','FOR_EACH_ARTIFICIAL_DEF','FOR_EACH_ARTIFICIAL_USE','FOR_EACH_BB_FN','FOR_EACH_BB_REVERSE_FN','FOR_EACH_BIT_IN_MINMAX_SET','FOR_EACH_CALL_EXPR_ARG','FOR_EACH_CLONE','FOR_EACH_CONST_CALL_EXPR_ARG','FOR_EACH_CONSTRUCTOR_ELT','FOR_EACH_CONSTRUCTOR_VALUE','FOR_EACH_COPY','FOR_EACH_DEF','FOR_EACH_DEFINED_FUNCTION','FOR_EACH_DEFINED_SYMBOL','FOR_EACH_DEFINED_VARIABLE','FOR_EACH_DEP','FOR_EACH_EDGE','FOR_EACH_EXPR','FOR_EACH_EXPR_1','FOR_EACH_FUNCTION','FOREACH_FUNCTION_ARGS','FOREACH_FUNCTION_ARGS_PTR','FOR_EACH_FUNCTION_WITH_GIMPLE_BODY','FOR_EACH_HASH_TABLE_ELEMENT','FOR_EACH_IMM_USE_FAST','FOR_EACH_IMM_USE_ON_STMT','FO!
>  R_EACH_IMM
> _USE_STMT','FOR_EACH_INSN','FOR_EACH_INSN_1','FOR_EACH_INSN_DEF','FOR_EACH_INSN_EQ_USE','FOR_EACH_INSN_INFO_DEF','FOR_EACH_INSN_INFO_EQ_USE','FOR_EACH_INSN_INFO_MW','FOR_EACH_INSN_INFO_USE','FOR_EACH_INSN_USE','FOR_EACH_LOCAL_DECL','FOR_EACH_LOOP','FOR_EACH_LOOP_FN','FOR_EACH_OBJECT','FOR_EACH_OBJECT_CONFLICT','FOR_EACH_PHI_ARG','FOR_EACH_PHI_OR_STMT_DEF','FOR_EACH_PHI_OR_STMT_USE','FOR_EACH_PREF','FOR_EACH_SCALAR','FOR_EACH_SSA_DEF_OPERAND','FOR_EACH_SSA_TREE_OPERAND','FOR_EACH_SSA_USE_OPERAND','FOR_EACH_STATIC_INITIALIZER','FOR_EACH_SUBRTX','FOR_EACH_SUBRTX_PTR','FOR_EACH_SUBRTX_VAR','FOR_EACH_SUCC','FOR_EACH_SUCC_1','FOR_EACH_SYMBOL','FOR_EACH_VARIABLE','FOR_EACH_VEC_ELT','FOR_EACH_VEC_ELT_FROM','FOR_EACH_VEC_ELT_REVERSE','FOR_EACH_VEC_SAFE_ELT','FOR_EACH_VEC_SAFE_ELT_REVERSE','FOR_EXPR','FOR_INIT_STMT','FOR_SCOPE']
>>  IndentCaseLabels: false
> 
> I don't know the tool's syntax here, but it's usually much better to
> keep entries like these one per line (or at least a few only).  Then
> changes to the list are _much_ easier to review and maintain, like:
> 
> ...
>  ForEachMacros: [
>  'FOR_ALL_BB_FN',
>  'FOR_ALL_EH_REGION',
> -'FOR_ALL_EH_REGION_AT',
> -'FOR_ALL_EH_REGION',
> +'FOR_ALL_EH_WHATNOT,
> +'FOR_ALL_EH_WHATNOT_AT,
>  'FOR_ALL_INHERITED_FIELDS',
>  'FOR_ALL_PREDICATES',
>  'FOR_BB_BETWEEN',
>  'FOR_BB_INSNS',
> ...
> 
> vs a single-line hunk that's basically unintelligible.
> 
> Thanks,
> Pedro Alves
> 

Hi Pedro.

Fully agree with you, there's suggested patch.
Hope I can install the patch for trunk?

Thanks,
Martin
From eac730138b51db897f579a961aabaee805338bdd Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 20 Nov 2015 12:29:49 +0100
Subject: [PATCH] clang-format: split content of a list to multiple lines

contrib/ChangeLog:

2015-11-20  Martin Liska  

	* clang-format: Split content of a list to multiple
	lines.
---
 contrib/clang-format | 85 +++-
 1 file changed, 84 insertions(+), 1 deletion(-)

diff --git a/contrib/clang-format b/contrib/clang-format
index 76a9d87..d734001 100644
--- a/contrib/clang-format
+++ b/contrib/clang-format
@@ -43,7 +43,90 @@ BreakBeforeTernaryOperators: true
 ColumnLimit: 80
 ConstructorInitializerIndentWidth: 2
 ContinuationInde

Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when copmputing VF

2015-11-20 Thread Ilya Enkovich
2015-11-20 14:28 GMT+03:00 Richard Biener :
> On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich  wrote:
>> 2015-11-18 16:44 GMT+03:00 Richard Biener :
>>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich  
>>> wrote:
 Hi,

 When we compute vectypes we skip non-relevant phi nodes.  But we process 
 non-relevant alive statements and thus may need vectype of non-relevant 
 live phi node to compute mask vectype.  This patch enables vectype 
 computation for live phi nodes.  Botostrapped and regtested on 
 x86_64-unknown-linux-gnu.  OK for trunk?
>>>
>>> Hmm.  What breaks if you instead skip all !relevant stmts and not
>>> compute vectype for life but not relevant ones?  We won't ever
>>> "vectorize" !relevant ones, that is, we don't need their vector type.
>>
>> I tried it and got regression in SLP.  It expected non-null vectype
>> for non-releveant but live statement. Regression was in
>> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90
>
> Because somebody put a vector type check before
>
>   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> return false;
>
> @@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
>tree mask_type;
>tree mask;
>
> +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> +return false;
> +
>if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>  return false;
>
> @@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
>  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>
>gcc_assert (ncopies >= 1);
> -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> -return false;
>
>if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
>&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>
> fixes this particular fallout for me.

I'll try it.

Thanks,
Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Richard.
>>>
 Thanks,
 Ilya


Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when copmputing VF

2015-11-20 Thread Richard Biener
On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich  wrote:
> 2015-11-18 16:44 GMT+03:00 Richard Biener :
>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich  
>> wrote:
>>> Hi,
>>>
>>> When we compute vectypes we skip non-relevant phi nodes.  But we process 
>>> non-relevant alive statements and thus may need vectype of non-relevant 
>>> live phi node to compute mask vectype.  This patch enables vectype 
>>> computation for live phi nodes.  Botostrapped and regtested on 
>>> x86_64-unknown-linux-gnu.  OK for trunk?
>>
>> Hmm.  What breaks if you instead skip all !relevant stmts and not
>> compute vectype for life but not relevant ones?  We won't ever
>> "vectorize" !relevant ones, that is, we don't need their vector type.
>
> I tried it and got regression in SLP.  It expected non-null vectype
> for non-releveant but live statement. Regression was in
> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90

Because somebody put a vector type check before

  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
return false;

@@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
   tree mask_type;
   tree mask;

+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+return false;
+
   if (!VECTOR_BOOLEAN_TYPE_P (vectype))
 return false;

@@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
 ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;

   gcc_assert (ncopies >= 1);
-  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
-return false;

   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
   && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle

fixes this particular fallout for me.

Richard.

> Ilya
>
>>
>> Richard.
>>
>>> Thanks,
>>> Ilya


Re: [PATCH] Fix memory leaks in tree-ssa-uninit.c

2015-11-20 Thread Martin Liška
On 11/20/2015 03:14 AM, Bernd Schmidt wrote:
> BTW, I'm with whoever said absolutely no way to the idea of making automatic 
> changes like this as part of a commit hook.
> 
> I think the whitespace change can go in if it hasn't already, but I think the 
> other one still has enough problems that I'll say - leave it for the next 
> stage 1.
> 
>> @@ -178,8 +173,9 @@ warn_uninitialized_vars (bool 
>> warn_possibly_uninitialized)
>>
>>FOR_EACH_BB_FN (bb, cfun)
>>  {
>> -  bool always_executed = dominated_by_p (CDI_POST_DOMINATORS,
>> - single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)), bb);
>> +  bool always_executed
>> += dominated_by_p (CDI_POST_DOMINATORS,
>> +  single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)), bb);
>>for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>  {
> 
> Better to pull the single_succ into its own variable perhaps?
> 
>> @@ -1057,7 +1039,8 @@ prune_uninit_phi_opnds_in_unrealizable_paths (gphi 
>> *phi,
>>   *visited_flag_phis = BITMAP_ALLOC (NULL);
>>
>> if (bitmap_bit_p (*visited_flag_phis,
>> -SSA_NAME_VERSION (gimple_phi_result (flag_arg_def
>> +SSA_NAME_VERSION (
>> +  gimple_phi_result (flag_arg_def
>>   return false;
>>
>> bitmap_set_bit (*visited_flag_phis,
> 
> Pull the gimple_phi_result into a separate variable, or just leave it 
> unchanged for now, the modified version is too ugly to live.
> 
>> bitmap_clear_bit (*visited_flag_phis,
>> -SSA_NAME_VERSION (gimple_phi_result (flag_arg_def)));
>> +SSA_NAME_VERSION (
>> +  gimple_phi_result (flag_arg_def)));
>> continue;
>>   }
> 
> Here too.
> 
>> -  all_pruned = prune_uninit_phi_opnds_in_unrealizable_paths (phi,
>> - uninit_opnds,
>> - as_a  (flag_def),
>> - boundary_cst,
>> - cmp_code,
>> - visited_phis,
>> - &visited_flag_phis);
>> +  all_pruned = prune_uninit_phi_opnds_in_unrealizable_paths
>> +(phi, uninit_opnds, as_a (flag_def), boundary_cst, cmp_code,
>> + visited_phis, &visited_flag_phis);
> 
> I'd rather shorten the name of the function, even if it goes against the 
> spirit of the novel writing month.
> 
>> -  if (gphi *use_phi = dyn_cast  (use_stmt))
>> -use_bb = gimple_phi_arg_edge (use_phi,
>> -  PHI_ARG_INDEX_FROM_USE (use_p))->src;
>> +  if (gphi *use_phi = dyn_cast (use_stmt))
>> +use_bb
>> +  = gimple_phi_arg_edge (use_phi, PHI_ARG_INDEX_FROM_USE (use_p))->src;
>> else
>>   use_bb = gimple_bb (use_stmt);
> 
> There are some changes of this nature and I'm not sure it's an improvement. 
> Leave them out for now?
> 
>> @@ -2345,8 +2309,8 @@ warn_uninitialized_phi (gphi *phi, vec 
>> *worklist,
>>   }
>>
>> /* Now check if we have any use of the value without proper guard.  */
>> -  uninit_use_stmt = find_uninit_use (phi, uninit_opnds,
>> - worklist, added_to_worklist);
>> +  uninit_use_stmt
>> += find_uninit_use (phi, uninit_opnds, worklist, added_to_worklist);
>>
>> /* All uses are properly guarded.  */
>> if (!uninit_use_stmt)
> 
> Here too.
> 
>> @@ -2396,12 +2359,24 @@ public:
>> {}
>>
>> /* opt_pass methods: */
>> -  opt_pass * clone () { return new pass_late_warn_uninitialized (m_ctxt); }
>> -  virtual bool gate (function *) { return gate_warn_uninitialized (); }
>> +  opt_pass *clone ();
>> +  virtual bool gate (function *);
> 
> This may technically violate our coding standards, but it's consistent with a 
> lot of similar cases. Since coding standards are about enforcing consistency, 
> I'd drop this change.
> 
>>   namespace {
>>
>>   const pass_data pass_data_early_warn_uninitialized =
>>   {
>> -  GIMPLE_PASS, /* type */
>> +  GIMPLE_PASS,   /* type */
>> "*early_warn_uninitialized", /* name */
>> -  OPTGROUP_NONE, /* optinfo_flags */
>> -  TV_TREE_UNINIT, /* tv_id */
>> -  PROP_ssa, /* properties_required */
>> -  0, /* properties_provided */
>> -  0, /* properties_destroyed */
>> -  0, /* todo_flags_start */
>> -  0, /* todo_flags_finish */
>> +  OPTGROUP_NONE,   /* optinfo_flags */
>> +  TV_TREE_UNINIT,   /* tv_id */
>> +  PROP_ssa,   /* properties_required */
>> +  0,   /* properties_provided */
>> +  0,   /* properties_destroyed */
>> +  0,   /* todo_flags_start */
>> +  0,   /* todo_flags_finish */
>>   };
> 
> Likewise. Seems to be done practically nowhere.
> 
>>   class pass_early_warn_uninitialized : public gimple_opt_pass
>> @@ -2519,14 +2491,23 @@ public:
>> {}
>>
>> /* opt_pass methods: */
>> -  virtual bool gate (function *) { return gate_warn_uninitializ

Re: Add uaddv4_optab, usubv4_optab

2015-11-20 Thread Richard Henderson

On 11/20/2015 11:56 AM, Eric Botcazou wrote:

Eric has just submitted a documentation path that documented the
{add,sub,mul,umul}v4 and negv3 patterns, so this should be
applied on top of that.


OK, I'm going to apply it, thanks.


Thanks.


Note that the comment at the beginning
of expand_addsub_overflow describing the overall strategy ought to be adjusted
if new patterns for the jump on carry are added.


Ok, I'll do that.


r~



Re: Add uaddv4_optab, usubv4_optab

2015-11-20 Thread Richard Henderson

On 11/20/2015 11:43 AM, Jakub Jelinek wrote:

+(define_expand "uaddv4"
+  [(parallel [(set (reg:CCC FLAGS_REG)
+  (compare:CCC
+(plus:SWI (match_dup 1) (match_dup 2))
+(match_dup 1)))
+ (set (match_dup 0)
+  (plus:SWI (match_dup 1) (match_dup 2)))])
+   (set (pc) (if_then_else
+  (ne (reg:CCC FLAGS_REG) (const_int 0))
+  (label_ref (match_operand 3))
+  (pc)))]
+  ""
+{
+  ix86_fixup_binary_operands_no_copy (PLUS, mode, operands);
+})


Do we need this one on i?86?  I'm not against adding it to optabs, so that
other targets have a way to improve that, but doesn't combine handle this
case on i?86 already well?


Perhaps combine can do the job, but in my option it's better to have the optab 
than not.  Especially when it's so easy to add in this case.



+(define_expand "usubv4"
+  [(parallel [(set (reg:CC FLAGS_REG)
+  (compare:CC (match_dup 1) (match_dup 2)))
+ (set (match_dup 0)
+  (minus:SWI (match_dup 1) (match_dup 2)))])
+   (set (pc) (if_then_else
+  (ltu (reg:CC FLAGS_REG) (const_int 0))
+  (label_ref (match_operand 3))
+  (pc)))]


If this works, it will be nice, I thought we'll need a new CC*mode.


No, we just need to re-use the existing insn that performs the low half of a 
double-word subtraction operation.



Eric has just submitted a documentation path that documented the
{add,sub,mul,umul}v4 and negv3 patterns, so this should be
applied on top of that.


Ok, I'll look out for that.


r~


Re: [PATCH 2/5] [AARCH64] Change IMP and PART over to integers from strings.

2015-11-20 Thread Kyrill Tkachov

Hi Andrew,

On 17/11/15 22:10, Andrew Pinski wrote:

Because the imp and parts are really integer rather than strings, this patch
moves the comparisons to be integer.  Also allows saving around integers are
easier than doing string comparisons.  This allows for the next change.

The way I store BIG.little is (big<<12)|little as each part num is only 12bits
long.  So it would be nice if someone could test -mpu=native on a big.little
system to make sure it works still.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski



* config/aarch64/aarch64-cores.def: Rewrite so IMP and PART are integer 
constants.
* config/aarch64/driver-aarch64.c (struct aarch64_core_data): Change 
implementer_id to unsigned char.
Change part_no to unsigned int.
(AARCH64_BIG_LITTLE): New define.
(INVALID_IMP): New define.
(INVALID_CORE): New define.
(cpu_data): Change the last element's implementer_id and part_no to integers.
(valid_bL_string_p): Rewrite to ..
(valid_bL_core_p): this for integers instead of strings.
(parse_field): New function.
(contains_string_p): Rewrite to ...
(contains_core_p): this for integers and only for the part_no.
(host_detect_local_cpu): Rewrite handling of implementation and part num to be 
integers;
simplifying the code.
---
  gcc/config/aarch64/aarch64-cores.def | 25 +-
  gcc/config/aarch64/driver-aarch64.c  | 90 
  2 files changed, 62 insertions(+), 53 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 0b456f7..798f3e3 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -33,25 +33,26 @@
 This need not include flags implied by the architecture.
 COSTS is the name of the rtx_costs routine to use.
 IMP is the implementer ID of the CPU vendor.  On a GNU/Linux system it can
-   be found in /proc/cpuinfo.
+   be found in /proc/cpuinfo.  There is a list in the ARM ARM.
 PART is the part number of the CPU.  On a GNU/Linux system it can be found
-   in /proc/cpuinfo.  For big.LITTLE systems this should have the form at of
-   ".".  */
+   in /proc/cpuinfo.  For big.LITTLE systems this should use the macro 
AARCH64_BIG_LITTLE
+   where the big part number comes as the first arugment to the macro and 
little is the
+   second.  */
  
  /* V8 Architecture Processors.  */
  
-AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03")

-AARCH64_CORE("cortex-a57",  cortexa57, cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, 
cortexa57, "0x41", "0xd07")
-AARCH64_CORE("cortex-a72",  cortexa72, cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, 
cortexa72, "0x41", "0xd08")
-AARCH64_CORE("exynos-m1",   exynosm1,  cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | 
AARCH64_FL_CRYPTO, cortexa72, "0x53", "0x001")
-AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | 
AARCH64_FL_CRYPTO, cortexa57, "0x51", "0x800")
-AARCH64_CORE("thunderx",thunderx,  thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | 
AARCH64_FL_CRYPTO, thunderx,  "0x43", "0x0a1")
-AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, xgene1, 
"0x50", "0x000")
+AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa53, 0x41, 0xd03)
+AARCH64_CORE("cortex-a57",  cortexa57, cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa57, 0x41, 0xd07)
+AARCH64_CORE("cortex-a72",  cortexa72, cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa72, 0x41, 0xd08)
+AARCH64_CORE("exynos-m1",   exynosm1,  cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa72, 0x53, 0x001)
+AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, 0x51, 0x800)
+AARCH64_CORE("thunderx",thunderx,  thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a1)
+AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
xgene1, 0x50, 0x000)
  
  /* V8 big.LITTLE implementations.  */
  
-AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03")

-AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa72, "0x41", "0xd08.0xd03")
+AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE(0xd07, 0xd03))
+AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, 0x41, AARCH64_BIG_LITTLE(0xd08, 0xd03))
  
  
  #undef AARCH64_CORE

diff --git a/gcc/config/aarch64/driver-aarch64.c 
b/gcc/config/aarch64/driver-aarch64.c

Re: Add uaddv4_optab, usubv4_optab

2015-11-20 Thread Eric Botcazou
> Eric has just submitted a documentation path that documented the
> {add,sub,mul,umul}v4 and negv3 patterns, so this should be
> applied on top of that.

OK, I'm going to apply it, thanks.  Note that the comment at the beginning 
of expand_addsub_overflow describing the overall strategy ought to be adjusted 
if new patterns for the jump on carry are added.

-- 
Eric Botcazou


Re: [PATCH] PR tree-optimization/68413 : Only check for integer cond reduction on analysis stage

2015-11-20 Thread Richard Biener
On Fri, Nov 20, 2015 at 10:24 AM, Alan Hayward  wrote:
> When vectorising a integer induction condition reduction,
> is_nonwrapping_integer_induction ends up with different values for base
> during the analysis and build phases. In the first it is an INTEGER_CST,
> in the second the loop has been vectorised out and the base is now a
> variable.
>
> This results in the analysis and build stage detecting the
> STMT_VINFO_VEC_REDUCTION_TYPE as different types.
>
> The easiest way to fix this is to only check for integer induction
> conditions on the analysis stage.

I don't like this.  For the evolution part we have added
STMT_VINFO_LOOP_PHI_EVOLUTION_PART.  If you now need
the original initial value as well then just save it.

Or if you really want to go with the hack then please do not call
is_nonwrapping_integer_induction with vec_stmt != NULL but
initialize cond_expr_is_nonwrapping_integer_induction from
STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)

The hack also lacks a comment.

Richard.

> gcc/
> PR tree-optimization/68413
> * tree-vect-loop.c (vectorizable_reduction): Only check for
> integer cond reduction on analysis stage.
>
>
>
>
> Thanks,
> Alan.
>


  1   2   >