Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Segher Boessenkool
On Wed, Apr 22, 2015 at 10:21:43AM +0800, Terry Guo wrote:
> gcc/ChangeLog:
> 2015-04-22 Hale Wang 
> Terry Guo  
> 
>PR rtl-optimization/64818
>* combine.c (can_combine_p): Don't combine user-specified register if
>it is in an asm input.
> 
> gcc/testsuite/ChangeLog:
> 2015-04-22 Hale Wang 
> Terry Guo  
> 
>PR rtl-optimization/64818
>* gcc.target/arm/pr64818.c: New.

This is okay for trunk, if it has been bootstrapped and regression tested.

Thanks,


Segher


Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Terry Guo
On Wed, Apr 22, 2015 at 9:44 AM, Segher Boessenkool
 wrote:
> On Tue, Apr 21, 2015 at 03:13:38PM +0800, Terry Guo wrote:
>> > Did you fix the comment?  REG_USERVAR_P and HARD_REGISTER_P can be
>> > set for more than just register asm.
>>
>> Sorry for missing the patch. I believe that I addressed your comment.
>> Please review it again to make sure my understanding is correct.
>
>> +  /* Use REG_USERVAR_P and HARD_REGISTER_P to check whether DEST is a user
>> + specified register, and do not eliminate such register if it is in an
>> + asm input.  Otherwise if allow such elimination, we may break the
>> + register asm usage defined in GCC manual.  */
>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>> +  && extract_asm_operands (PATTERN (i3)))
>> +return 0;
>
> The "to check whether DEST is a user-specified register" part is not
> correct; this check can for example also match for function arguments
> (which are hard regs) that were combined into any "normal" user var.
> I don't see how we would do a better check, and disallowing combination
> in this case is harmless (or even good); but the comment is misleading.
>
>
> Segher

Thanks for reviewing. Patch is updated per your suggestion. The
ChangeLog is also updated as below:

gcc/ChangeLog:
2015-04-22 Hale Wang 
Terry Guo  

   PR rtl-optimization/64818
   * combine.c (can_combine_p): Don't combine user-specified register if
   it is in an asm input.

gcc/testsuite/ChangeLog:
2015-04-22 Hale Wang 
Terry Guo  

   PR rtl-optimization/64818
   * gcc.target/arm/pr64818.c: New.
diff --git a/gcc/combine.c b/gcc/combine.c
index 6f0007a..6cd55dd 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1910,6 +1910,15 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
   set = expand_field_assignment (set);
   src = SET_SRC (set), dest = SET_DEST (set);
 
+  /* Do not eliminate user-specified register if it is in an
+ asm input because we may break the register asm usage defined
+ in GCC manual if allow to do so.
+ Be aware that this may cover more cases than we expect but this
+ should be harmless.  */
+  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
+  && extract_asm_operands (PATTERN (i3)))
+return 0;
+
   /* Don't eliminate a store in the stack pointer.  */
   if (dest == stack_pointer_rtx
   /* Don't combine with an insn that sets a register to itself if it has
diff --git a/gcc/testsuite/gcc.target/arm/pr64818.c 
b/gcc/testsuite/gcc.target/arm/pr64818.c
new file mode 100644
index 000..bddd846
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64818.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+char temp[16];
+extern int foo1 (void);
+
+void foo (void)
+{
+  int i;
+  int len;
+
+  while (1)
+  {
+len = foo1 ();
+register int a asm ("r0") = 5;
+register char *b asm ("r1") = temp;
+register int c asm ("r2") = len;
+asm volatile ("mov %[r0], %[r0]\n  mov %[r1], %[r1]\n  mov %[r2], %[r2]\n"
+  : "+m"(*b)
+  : [r0]"r"(a), [r1]"r"(b), [r2]"r"(c));
+
+for (i = 0; i < len; i++)
+{
+  if (temp[i] == 10)
+  return;
+}
+  }
+}
+
+/* { dg-final { scan-assembler "\[\\t \]+mov\ r1,\ r1" } } */


Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state

2015-04-21 Thread Segher Boessenkool
On Tue, Apr 21, 2015 at 03:56:18PM -0500, Peter Bergner wrote:
> On Fri, 2015-03-20 at 17:41 -0500, Peter Bergner wrote:
> > On Fri, 2015-03-20 at 15:52 -0500, Segher Boessenkool wrote:
> > > Maybe it would be nicer if the builtin-expansion code handled the copy
> > > from cc, instead of stacking on RTL expanders.
> > 
> > That would allow getting rid of the expanders completely, which
> > would be nice.  I'd have to somehow add some type of RS6000_BTC_*
> > flag onto the builtin though, so I can tell during builtin expansion
> > whether I need to emit the cr copy code or not.
> 
> Ok, the patch below implements your suggestion.

It looks good, thanks.  Some minor comments...

> This patch also fixes some issues I hit with the tabortdc[i] and
> htm_m[ft]spr_ patterns when used with -m32 -mpowerpc64.

Running the testsuite, or did you actually try to _use_ -m32 -mpowerpc64?  :-)

> +(define_insn "tabortdc"
>[(set (match_operand:CC 3 "cc_reg_operand" "=x")
>   (unspec_volatile:CC [(match_operand 0 "u5bit_cint_operand" "n")
> -  (match_operand:SI 1 "gpc_reg_operand" "r")
> -  (match_operand:SI 2 "gpc_reg_operand" "r")]
> +  (match_operand:DI 1 "gpc_reg_operand" "r")
> +  (match_operand:DI 2 "gpc_reg_operand" "r")]
>   UNSPECV_HTM_TABORTDC))]
> -  "TARGET_HTM"
> +  "TARGET_POWERPC64 && TARGET_HTM"
>"tabortdc. %0,%1,%2"
>[(set_attr "type" "htm")
> (set_attr "length" "4")])

Maybe you can fold tabortdc with tabortwc now?  Use one UNSPEC name
for both, :GPR and ?

> +   case HTM_BUILTIN_TTEST: /* Alias for: tabortwci. 0,r0,0  */
> + op[nopnds++] = GEN_INT (0);
> + op[nopnds++] = gen_rtx_REG (SImode, 0);
> + op[nopnds++] = GEN_INT (0);

Is that really r0, isn't that (0|rA)?  [Too lazy to read the docs myself
right now, sorry.]

> + if (attr & RS6000_BTC_CR)
> +   {
> + if (fcode == HTM_BUILTIN_TBEGIN)
> +   {
> + /* Emit code to set TARGET to true or false depending on
> +whether the tbegin. instruction successfully or failed
> +to start a transaction.  We do this by placing the 1's
> +complement of CR's EQ bit into TARGET.  */
> + rtx scratch = gen_reg_rtx (SImode);
> + emit_insn (gen_rtx_SET (VOIDmode, scratch,
> + gen_rtx_EQ (SImode, cr,
> +  const0_rtx)));
> + emit_insn (gen_rtx_SET (VOIDmode, target,
> + gen_rtx_XOR (SImode, scratch,
> +  GEN_INT (1))));
> +   }
> + else
> +   {
> + /* Emit code to copy the 4-bit condition register field
> +CR into the least significant end of register TARGET.  */
> + rtx scratch1 = gen_reg_rtx (SImode);
> + rtx scratch2 = gen_reg_rtx (SImode);
> + rtx subreg = simplify_gen_subreg (CCmode, scratch1, SImode, 0);
> + emit_insn (gen_movcc (subreg, cr));
> + emit_insn (gen_lshrsi3 (scratch2, scratch1, GEN_INT (28)));
> + emit_insn (gen_andsi3 (target, scratch2, GEN_INT (0xf)));
> +   }
> +   }

Don't we have helper functions/expanders to do these moves?  Yuck.

> -/* { dg-final { scan-assembler-times "tabortdc\\." 1 } } */
> -/* { dg-final { scan-assembler-times "tabortdci\\." 1 } } */
> +/* { dg-final { scan-assembler-times "tabortdc\\." 1 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times "tabortdci\\." 1 { target lp64 } } } */

This skips this test on -m32 -mpowerpc64, is that on purpose?


Segher
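The two result conventions in the quoted expansion code can be sketched in plain C. This is only an illustration of the arithmetic, not the RTL the patch emits: it assumes the 4-bit CR field sits in the top nibble of a 32-bit image (which is what the shift by 28 suggests), and the function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the 4-bit field held in the top nibble of a 32-bit image into
   the least significant four bits, like the lshr-by-28 / and-0xf pair.  */
uint32_t
cr_field_to_low_bits (uint32_t cr_image)
{
  return (cr_image >> 28) & 0xfu;
}

/* tbegin. case: the result is the complement of the EQ bit, so an EQ
   bit of 0 yields 1 and an EQ bit of 1 yields 0.  */
uint32_t
tbegin_result (uint32_t eq_bit)
{
  assert (eq_bit <= 1u);
  return eq_bit ^ 1u;
}
```

For instance, an image of 0x40000000 gives a field value of 4, and the bits below the top nibble never leak into the result.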


Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.

2015-04-21 Thread Segher Boessenkool
On Tue, Apr 21, 2015 at 03:13:38PM +0800, Terry Guo wrote:
> > Did you fix the comment?  REG_USERVAR_P and HARD_REGISTER_P can be
> > set for more than just register asm.
> 
> Sorry for missing the patch. I believe that I addressed your comment.
> Please review it again to make sure my understanding is correct.

> +  /* Use REG_USERVAR_P and HARD_REGISTER_P to check whether DEST is a user
> + specified register, and do not eliminate such register if it is in an
> + asm input.  Otherwise if allow such elimination, we may break the
> + register asm usage defined in GCC manual.  */
> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
> +  && extract_asm_operands (PATTERN (i3)))
> +return 0;

The "to check whether DEST is a user-specified register" part is not
correct; this check can for example also match for function arguments
(which are hard regs) that were combined into any "normal" user var.
I don't see how we would do a better check, and disallowing combination
in this case is harmless (or even good); but the comment is misleading.


Segher
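To make the point about over-matching concrete, here is a toy model of the condition in C. It is a sketch only: the struct and its fields are hypothetical stand-ins for the REG_P, REG_USERVAR_P and HARD_REGISTER_P queries, and real GCC inspects an rtx rather than a struct of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for what the patch queries on DEST.  */
struct toy_reg
{
  bool is_reg;       /* models REG_P */
  bool is_user_var;  /* models REG_USERVAR_P */
  bool is_hard_reg;  /* models HARD_REGISTER_P */
};

/* Returns true when the combination would be refused, mirroring the
   shape of the condition added to can_combine_p.  */
bool
refuse_combine (const struct toy_reg *dest, bool i3_is_asm)
{
  assert (dest != NULL);
  return dest->is_reg && dest->is_user_var && dest->is_hard_reg && i3_is_asm;
}
```

The model cannot tell a register asm variable from any other hard register that carries the user-variable flag, which is why the comment should not claim the check identifies user-specified registers exactly.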


Re: Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread David Malcolm
On Tue, 2015-04-21 at 20:14 +0200, Manuel López-Ibáñez wrote:
> On 21/04/15 18:07, David Malcolm wrote:
> >
> > I have the patch working now for the C++ frontend.  Am attaching the
> > work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
> > regrtested on x86_64-unknown-linux-gnu (Fedora 20), with:
> >63 new "PASS" results in gcc.sum
> >189 new "PASS" results in g++.sum
> > for the new test cases (relative to a control build of r48).
> >
> 
> I still do not understand why you need so much complexity as I explained 
> here: 
> https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00830.html
> 
> The attached patch passes all your tests except Wmisleading-indentation-3.c,
> which warns only once instead of two times (it doesn't seem a big loss to
> me), and Wmisleading-indentation-7.c, which I did not bother to implement but
> is a straightforward application of the if-case to the else-case.

Aha!   Thanks.  Your approach is much simpler, and likely much faster.

> Perhaps I'm missing something that is not reflected in your tests?

No, mostly just my lack of expertise on the frontend :)

> BTW, the start-up cost of GCC is not negligible, thus grouping similar
> testcases in a single file may pay off in the long term. Many small files
> also tend to slow down VC tools. It also makes it harder to see what is
> tested and what is missing.

OK.  I'll finish up your version of the patch, and consolidate the
testcases.

Thanks
Dave



Re: [patch, fortran] PR 37131

2015-04-21 Thread Thomas Koenig
Hello Mikael and Dominique,

thanks for your helpful comments!

> To sum up, tests missing for the following:
>   array(4,:,:)
>   array(3:5,:)
>   array(3:10:2,:)
>   array(:,:)%comp
> with both lbound == 1 and lbound != 1.
> One test with lhs-rhs dependency would be good as well.

I have included those (and fixed the bugs that appeared).  This
is done in inline_matmul_1.f90 and in inline_matmul_5.f90.


> 
>> Index: fortran/array.c
>> ===
>> --- fortran/array.c  (Revision 18)
>> +++ fortran/array.c  (Arbeitskopie)
>> @@ -338,6 +338,9 @@ gfc_resolve_array_spec (gfc_array_spec *as, int ch
>>if (as == NULL)
>>  return true;
>>  
>> +  if (as->resolved)
>> +return true;
>> +
> Why this?

Because you get regressions otherwise.  Not resolving an array spec
twice should do no harm, and resolving it twice does so - I hit the
error message in check_restricted.  I'm not sure what is wrong, maybe
PR 23466 was not fully fixed, but this works.

>
>> -static gfc_expr *create_var (gfc_expr *);
>> +static gfc_expr *create_var (gfc_expr *, const char *vname=NULL);
>> +static int optimize_matmul_assign (gfc_code **, int *, void *);
> The function doesn't really "optimize", so name it inline_matmul_assign
> instead.
> Same for the comments about "optimizing MATMUL".

Done.

> 
>> @@ -524,29 +542,11 @@ constant_string_length (gfc_expr *e)
>>  
>>  }
>>  
>> -/* Returns a new expression (a variable) to be used in place of the old one,
>> -   with an assignment statement before the current statement to set
>> -   the value of the variable. Creates a new BLOCK for the statement if
>> -   that hasn't already been done and puts the statement, plus the
>> -   newly created variables, in that block.  Special cases:  If the
>> -   expression is constant or a temporary which has already
>> -   been created, just copy it.  */
>> -
>> -static gfc_expr*
>> -create_var (gfc_expr * e)
> Keep a comment here.

Still exists, further down.

>> +static gfc_namespace*
>> +insert_block ()
>>  {
>> -  char name[GFC_MAX_SYMBOL_LEN +1];
>> -  static int num = 1;
>> -  gfc_symtree *symtree;
>> -  gfc_symbol *symbol;
>> -  gfc_expr *result;
>> -  gfc_code *n;
>>gfc_namespace *ns;
>> -  int i;
>>  
>> -  if (e->expr_type == EXPR_CONSTANT || is_fe_temp (e))
>> -return gfc_copy_expr (e);
>> -
>>/* If the block hasn't already been created, do so.  */
>>if (inserted_block == NULL)
>>  {
> 
>> @@ -1939,7 +1977,1049 @@ doloop_warn (gfc_namespace *ns)
>>gfc_code_walker (&ns->code, doloop_code, do_function, NULL);
>>  }
>>  
>> +/* This selction deals with inlining calls to MATMUL.  */
> section
>>  
>> +/* Auxiliary function to build and simplify an array inquiry function.
>> +   dim is zero-based.  */
>> +
>> +static gfc_expr *
>> +get_array_inq_function (gfc_expr *e, int dim, gfc_isym_id id)
> It's better if the id is the first argument, so that the function id and
> its arguments come in their natural order.

Changed.

> [...]
> 
>> +/* Builds a logical expression.  */
>> +
>> +static gfc_expr*
>> +build_logical_expr (gfc_expr *e1, gfc_expr *e2, gfc_intrinsic_op op)
> Same here, op first.

Also changed.

> [...]
> 
>> +
>> +/* Return an operation of one two gfc_expr (one if e2 is NULL). This assumes
>> +   compatible typespecs.  */
>> +
>> +static gfc_expr *
>> +get_operand (gfc_intrinsic_op op, gfc_expr *e1, gfc_expr *e2)
> Here it's good already. :-)

:-)

> [...]
> 
>> +/* Insert code to issue a runtime error if the expressions are not equal.  
>> */
>> +
>> +static gfc_code *
>> +runtime_error_ne (gfc_expr *e1, gfc_expr *e2, const char *msg)
>> +{
>> +  gfc_expr *cond;
>> +  gfc_code *if_1, *if_2;
>> +  gfc_code *c;
>> +  // const char *name;
> Any reason...
> 
>> +  gfc_actual_arglist *a1, *a2, *a3;
>> +
>> +  gcc_assert (e1->where.lb);
>> +  /* Build the call to runtime_error.  */
>> +  c = XCNEW (gfc_code);
>> +  c->op = EXEC_CALL;
>> +  c->loc = e1->where;
>> +  // name = gfc_get_string (PREFIX ("runtime_error"));
>> +  // c->resolved_sym = gfc_get_intrinsic_sub_symbol (name);
> ... to keep these?

Removed.


>> +  while (ref)
>> +{
>> +  if (ref->type == REF_ARRAY && ref->u.ar.type != AR_ELEMENT)
>> +break;
>> +
>> +  ref = ref->next;
>> +
>> +}
>> +  ar = &ref->u.ar;
> You can probably use gfc_find_array_ref here.

Changed.  There are a few other places that could also benefit
from gfc_find_array_ref (now I know it exists :-)
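For illustration, the quoted loop can be modelled in C as a walk over a reference chain. This is a sketch with hypothetical types, not gfortran's data structures, and it does not claim to reproduce what gfc_find_array_ref does; unlike the quoted loop, it returns NULL instead of dereferencing the pointer when nothing matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the reference kinds in the quoted loop.  */
enum toy_ref_type { TOY_REF_ARRAY, TOY_REF_COMPONENT };
enum toy_ar_type { TOY_AR_FULL, TOY_AR_ELEMENT, TOY_AR_SECTION };

struct toy_ref
{
  enum toy_ref_type type;
  enum toy_ar_type ar_type;  /* meaningful only for TOY_REF_ARRAY */
  struct toy_ref *next;
};

/* Return the first array reference that is not a single element,
   or NULL when the chain has none.  */
struct toy_ref *
find_array_ref (struct toy_ref *ref)
{
  for (; ref != NULL; ref = ref->next)
    if (ref->type == TOY_REF_ARRAY && ref->ar_type != TOY_AR_ELEMENT)
      return ref;
  return NULL;
}
```

Returning NULL makes the "no array reference" case explicit, so the caller has to decide what to do with it.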

> [...]
> 
> 
>> +
>> +/* Function to return a scalarized expression. It is assumed that indices 
>> are
>> + zero based to make generation of DO loops easier.  A zero as index will
>> + access the first element along a dimension.  Single element references will
>> + be skipped.  A NULL as an expression will be replaced by a full reference.
>> + This assumes that the index loops have gfc_index_integer_kind, and that all
>> + references have been frozen.  */
>> +
>> +static gfc_expr*
>

[C PATCH] Make -Wno-shift-count-negative -Wno-shift-count-overflow work for const ints (PR c/65830)

2015-04-21 Thread Marek Polacek
A trivial patch to use OPT_* where they belong.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-21  Marek Polacek  

PR c/65830
* c-common.c (c_fully_fold_internal): Use OPT_Wshift_count_negative
and OPT_Wshift_count_overflow.

* c-c++-common/pr65830.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 7fe7fa6..64fc95f 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1370,15 +1370,17 @@ c_fully_fold_internal (tree expr, bool in_init, bool 
*maybe_const_operands,
  && c_inhibit_evaluation_warnings == 0)
{
  if (tree_int_cst_sgn (op1) < 0)
-   warning_at (loc, 0, (code == LSHIFT_EXPR
-? G_("left shift count is negative")
-: G_("right shift count is negative")));
+   warning_at (loc, OPT_Wshift_count_negative,
+   (code == LSHIFT_EXPR
+? G_("left shift count is negative")
+: G_("right shift count is negative")));
  else if (compare_tree_int (op1,
 TYPE_PRECISION (TREE_TYPE (orig_op0)))
   >= 0)
-   warning_at (loc, 0, (code == LSHIFT_EXPR
-? G_("left shift count >= width of type")
-: G_("right shift count >= width of type")));
+   warning_at (loc, OPT_Wshift_count_overflow,
+   (code == LSHIFT_EXPR
+? G_("left shift count >= width of type")
+: G_("right shift count >= width of type")));
}
   if ((code == TRUNC_DIV_EXPR
   || code == CEIL_DIV_EXPR
diff --git gcc/testsuite/c-c++-common/pr65830.c 
gcc/testsuite/c-c++-common/pr65830.c
index e69de29..e115f18 100644
--- gcc/testsuite/c-c++-common/pr65830.c
+++ gcc/testsuite/c-c++-common/pr65830.c
@@ -0,0 +1,16 @@
+/* PR c/65830 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wno-shift-count-negative -Wno-shift-count-overflow" } */
+
+int
+foo (int x)
+{
+  const int a = sizeof (int) * __CHAR_BIT__;
+  const int b = -7;
+  int c = 0;
+  c += x << a; /* { dg-bogus "left shift count >= width of type" } */
+  c += x << b; /* { dg-bogus "left shift count is negative" } */
+  c += x >> a; /* { dg-bogus "right shift count >= width of type" } */
+  c += x >> b;  /* { dg-bogus "right shift count is negative" } */
+  return c;
+}

Marek
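As a rough illustration of the two conditions the patched code distinguishes, the sketch below classifies a constant shift count against the precision of the shifted type. The enum and function names are invented for this example and are not part of GCC.

```c
#include <assert.h>

/* Hypothetical labels for the two diagnostics touched by the patch.  */
enum shift_diag
{
  SHIFT_OK,
  SHIFT_COUNT_NEGATIVE,  /* would be controlled by -Wshift-count-negative */
  SHIFT_COUNT_OVERFLOW   /* would be controlled by -Wshift-count-overflow */
};

/* Classify COUNT for a shifted operand that is PRECISION bits wide.  */
enum shift_diag
classify_shift_count (long count, unsigned precision)
{
  assert (precision > 0);
  if (count < 0)
    return SHIFT_COUNT_NEGATIVE;
  if ((unsigned long) count >= precision)
    return SHIFT_COUNT_OVERFLOW;
  return SHIFT_OK;
}
```

With a 32-bit int, the counts used in the new test (32 and -7) fall into the overflow and negative classes respectively, so each is governed by its own option rather than being unconditional.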


[PATCH 5/5] libcc1: 'set debug compile': Display absolute GCC driver filename

2015-04-21 Thread Jan Kratochvil
Hi,

with the patches so far after
(gdb) set debug compile 1
one would get:
searching for compiler matching regex 
^(x86_64|i.86)(-[^-]*)?-linux(-gnu)?-gcc$
found compiler x86_64-unknown-linux-gnu-gcc
But I believe it is more readable to see:
searching for compiler matching regex 
^(x86_64|i.86)(-[^-]*)?-linux(-gnu)?-gcc$
found compiler /usr/bin/x86_64-unknown-linux-gnu-gcc

I do not think the change will have any functional impact, although the
filename is also used to execute the command.


Jan


libcc1/ChangeLog
2015-04-21  Jan Kratochvil  

* findcomp.cc: Include system.h.
(search_dir): Return absolute filename.
---
 libcc1/findcomp.cc |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libcc1/findcomp.cc b/libcc1/findcomp.cc
index f02b1df..5d49e29 100644
--- a/libcc1/findcomp.cc
+++ b/libcc1/findcomp.cc
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "libiberty.h"
 #include "xregex.h"
 #include "findcomp.hh"
+#include "system.h"
 
 class scanner
 {
@@ -68,7 +69,7 @@ search_dir (const regex_t &regexp, const std::string &dir, 
std::string *result)
 {
   if (regexec (&regexp, filename, 0, NULL, 0) == 0)
{
- *result = filename;
+ *result = dir + DIR_SEPARATOR + filename;
  return true;
}
 }
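The effect of the one-line change can be sketched in C as joining the directory and the matched filename. This is not the libcc1 code (which uses std::string and DIR_SEPARATOR); the helper name is hypothetical and '/' is assumed as the separator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write DIR, a '/' separator and NAME into BUF.  Returns 0 on success
   or -1 when BUF (of size BUFLEN) is too small to hold the result.  */
int
join_dir_and_name (char *buf, size_t buflen, const char *dir, const char *name)
{
  assert (buf != NULL && dir != NULL && name != NULL);
  /* Directory + separator + name + terminating NUL.  */
  size_t need = strlen (dir) + 1 + strlen (name) + 1;
  if (need > buflen)
    return -1;
  snprintf (buf, buflen, "%s/%s", dir, name);
  return 0;
}
```

Joining "/usr/bin" with "x86_64-unknown-linux-gnu-gcc" this way yields the absolute name shown in the second debug output above.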



[PATCH 4/5] libcc1: Add 'set compile-gcc'

2015-04-21 Thread Jan Kratochvil
as discussed in
How to use compile & execute function in GDB
https://sourceware.org/ml/gdb/2015-04/msg00026.html

GDB currently searches for /usr/bin/ARCH-OS-gcc and chooses one, but the user
cannot override the choice.  GDB would provide a new option, 'set compile-gcc'.

This patch does not change the libcc1 API as it overloads the triplet_regexp
parameter of GCC's set_arguments according to:

+  if (access (triplet_regexp, X_OK) == 0)

GDB counterpart:
[PATCH 4/4] compile: Add 'set compile-gcc'
https://sourceware.org/ml/gdb-patches/2015-04/msg00808.html
Message-ID: <20150421213657.14147.60506.st...@host1.jankratochvil.net>


Jan


include/ChangeLog
2015-04-21  Jan Kratochvil  

* gcc-interface.h (enum gcc_base_api_version): Add comment to
GCC_FE_VERSION_1.
(struct gcc_base_vtable): Describe triplet_regexp parameter overload
for set_arguments.

libcc1/ChangeLog
2015-04-21  Jan Kratochvil  

* libcc1.cc (libcc1_set_arguments): Implement filenames for
triplet_regexp.
---
 include/gcc-interface.h |7 -
 libcc1/libcc1.cc|   62 +++
 2 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/include/gcc-interface.h b/include/gcc-interface.h
index dd9fd50..a15edf7 100644
--- a/include/gcc-interface.h
+++ b/include/gcc-interface.h
@@ -46,7 +46,9 @@ enum gcc_base_api_version
 {
   GCC_FE_VERSION_0 = 0,
 
-  /* Parameter verbose has been moved from compile to set_arguments.  */
+  /* Parameter verbose has been moved from compile to set_arguments.
+ Parameter triplet_regexp of set_arguments can be also gcc driver
+ executable.  */
   GCC_FE_VERSION_1 = 1,
 };
 
@@ -69,7 +71,8 @@ struct gcc_base_vtable
 
   /* Set the compiler's command-line options for the next compilation.
  TRIPLET_REGEXP is a regular expression that is used to match the
- configury triplet prefix to the compiler.
+ configury triplet prefix to the compiler; TRIPLET_REGEXP can be
+ also absolute filename  to the computer.
  The arguments are copied by GCC.  ARGV need not be
  NULL-terminated.  The arguments must be set separately for each
  compilation; that is, after a compile is requested, the
diff --git a/libcc1/libcc1.cc b/libcc1/libcc1.cc
index d36073d..e2718b0 100644
--- a/libcc1/libcc1.cc
+++ b/libcc1/libcc1.cc
@@ -322,38 +322,48 @@ libcc1_set_arguments (struct gcc_base_context *s,
 
   self->verbose = verbose != 0;
 
-  std::string rx = make_regexp (triplet_regexp, COMPILER_NAME);
-  // Simulate fnotice by fprintf.
-  if (self->verbose)
-fprintf (stderr, _("searching for compiler matching regex %s\n"),
-rx.c_str());
-  code = regcomp (&triplet, rx.c_str (), REG_EXTENDED | REG_NOSUB);
-  if (code != 0)
+  std::string compiler;
+  if (access (triplet_regexp, X_OK) == 0)
 {
-  size_t len = regerror (code, &triplet, NULL, 0);
-  char err[len];
+  compiler = triplet_regexp;
+  // Simulate fnotice by fprintf.
+  if (self->verbose)
+   fprintf (stderr, _("using explicit compiler filename %s\n"),
+compiler.c_str());
+}
+  else
+{
+  std::string rx = make_regexp (triplet_regexp, COMPILER_NAME);
+  if (self->verbose)
+   fprintf (stderr, _("searching for compiler matching regex %s\n"),
+rx.c_str());
+  code = regcomp (&triplet, rx.c_str (), REG_EXTENDED | REG_NOSUB);
+  if (code != 0)
+   {
+ size_t len = regerror (code, &triplet, NULL, 0);
+ char err[len];
 
-  regerror (code, &triplet, err, len);
+ regerror (code, &triplet, err, len);
 
-  return concat ("Could not compile regexp \"",
-rx.c_str (),
-"\": ",
-err,
-(char *) NULL);
-}
+ return concat ("Could not compile regexp \"",
+rx.c_str (),
+"\": ",
+err,
+(char *) NULL);
+   }
 
-  std::string compiler;
-  if (!find_compiler (triplet, &compiler))
-{
+  if (!find_compiler (triplet, &compiler))
+   {
+ regfree (&triplet);
+ return concat ("Could not find a compiler matching \"",
+rx.c_str (),
+"\"",
+(char *) NULL);
+   }
   regfree (&triplet);
-  return concat ("Could not find a compiler matching \"",
-rx.c_str (),
-"\"",
-(char *) NULL);
+  if (self->verbose)
+   fprintf (stderr, _("found compiler %s\n"), compiler.c_str());
 }
-  regfree (&triplet);
-  if (self->verbose)
-fprintf (stderr, _("found compiler %s\n"), compiler.c_str());
 
   self->args.push_back (compiler);
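The overloading of triplet_regexp described above boils down to one access() test. The following C sketch shows only that decision; the enum and function names are hypothetical and the regex search itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical labels for the two ways the argument can be read.  */
enum compiler_spec
{
  SPEC_EXPLICIT_PATH,   /* the string names an executable file */
  SPEC_TRIPLET_REGEXP   /* otherwise, treat it as a triplet regexp */
};

/* Decide how to interpret SPEC, using the same test as the patch.  */
enum compiler_spec
classify_compiler_spec (const char *spec)
{
  assert (spec != NULL);
  return access (spec, X_OK) == 0 ? SPEC_EXPLICIT_PATH : SPEC_TRIPLET_REGEXP;
}
```

One thing to keep in mind with this test: access() with X_OK also succeeds for a directory the caller may search, so a string such as "/" would be classified as an explicit path.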
 



[PATCH 2/5] libcc1: Use libcc1.so.0->libcc1.so.1

2015-04-21 Thread Jan Kratochvil
Hi,

see [patch 1/5], particularly:
(3) Currently there is no backward or forward compatibility, although it
could be implemented.  Personally I think the 'compile' feature is
still in an experimental stage, so it is OK to require the latest releases.
At least in Fedora we can keep GDB<->GCC in sync.

GDB counterpart:
[PATCH 2/4] compile: Use libcc1.so.0->libcc1.so.1
https://sourceware.org/ml/gdb-patches/2015-04/msg00806.html
Message-ID: <20150421213642.14147.93210.st...@host1.jankratochvil.net>


Jan


include/ChangeLog
2015-04-21  Jan Kratochvil  

* gcc-c-interface.h (GCC_C_FE_LIBCC): Update it to GCC_FE_VERSION_1.
* gcc-interface.h (enum gcc_base_api_version): Add GCC_FE_VERSION_1.

libcc1/ChangeLog
2015-04-21  Jan Kratochvil  

* Makefile.am (libcc1_la_LDFLAGS): Add version-info 1.
* Makefile.in: Regenerate.
* libcc1.cc (vtable, gcc_c_fe_context): Update it to GCC_FE_VERSION_1.
---
 include/gcc-c-interface.h |2 +-
 include/gcc-interface.h   |3 ++-
 libcc1/Makefile.am|3 ++-
 libcc1/Makefile.in|4 +++-
 libcc1/libcc1.cc  |4 ++--
 5 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/include/gcc-c-interface.h b/include/gcc-c-interface.h
index 1b73e32..285c9c7 100644
--- a/include/gcc-c-interface.h
+++ b/include/gcc-c-interface.h
@@ -197,7 +197,7 @@ struct gcc_c_context
 /* The name of the .so that the compiler builds.  We dlopen this
later.  */
 
-#define GCC_C_FE_LIBCC "libcc1.so." STRINGIFY (GCC_FE_VERSION_0)
+#define GCC_C_FE_LIBCC "libcc1.so." STRINGIFY (GCC_FE_VERSION_1)
 
 /* The compiler exports a single initialization function.  This macro
holds its name as a symbol.  */
diff --git a/include/gcc-interface.h b/include/gcc-interface.h
index 34010f2..dcfa6ce 100644
--- a/include/gcc-interface.h
+++ b/include/gcc-interface.h
@@ -44,7 +44,8 @@ struct gcc_base_context;
 
 enum gcc_base_api_version
 {
-  GCC_FE_VERSION_0 = 0
+  GCC_FE_VERSION_0 = 0,
+  GCC_FE_VERSION_1 = 1,
 };
 
 /* The operations defined by the GCC base API.  This is the vtable for
diff --git a/libcc1/Makefile.am b/libcc1/Makefile.am
index 7a274b3..e6a94e2 100644
--- a/libcc1/Makefile.am
+++ b/libcc1/Makefile.am
@@ -63,7 +63,8 @@ libcc1plugin_la_LINK = $(LIBTOOL) --tag=CXX 
$(AM_LIBTOOLFLAGS) \
$(CXXFLAGS) $(libcc1plugin_la_LDFLAGS) $(LTLDFLAGS) -o $@
 
 LTLDFLAGS = $(shell $(SHELL) $(top_srcdir)/../libtool-ldflags $(LDFLAGS))
-libcc1_la_LDFLAGS = -module -export-symbols $(srcdir)/libcc1.sym
+libcc1_la_LDFLAGS = -module -export-symbols $(srcdir)/libcc1.sym \
+   -version-info 1:0:0
 libcc1_la_SOURCES = findcomp.cc libcc1.cc names.cc names.hh $(shared_source)
 libcc1_la_LIBADD = $(libiberty)
 libcc1_la_DEPENDENCIES = $(libiberty_dep)
diff --git a/libcc1/Makefile.in b/libcc1/Makefile.in
index 1916134..ebec54c 100644
--- a/libcc1/Makefile.in
+++ b/libcc1/Makefile.in
@@ -279,7 +279,9 @@ libcc1plugin_la_LINK = $(LIBTOOL) --tag=CXX 
$(AM_LIBTOOLFLAGS) \
$(CXXFLAGS) $(libcc1plugin_la_LDFLAGS) $(LTLDFLAGS) -o $@
 
 LTLDFLAGS = $(shell $(SHELL) $(top_srcdir)/../libtool-ldflags $(LDFLAGS))
-libcc1_la_LDFLAGS = -module -export-symbols $(srcdir)/libcc1.sym
+libcc1_la_LDFLAGS = -module -export-symbols $(srcdir)/libcc1.sym \
+   -version-info 1:0:0
+
 libcc1_la_SOURCES = findcomp.cc libcc1.cc names.cc names.hh $(shared_source)
 libcc1_la_LIBADD = $(libiberty)
 libcc1_la_DEPENDENCIES = $(libiberty_dep)
diff --git a/libcc1/libcc1.cc b/libcc1/libcc1.cc
index 7d7d2c1..afda023 100644
--- a/libcc1/libcc1.cc
+++ b/libcc1/libcc1.cc
@@ -504,7 +504,7 @@ libcc1_destroy (struct gcc_base_context *s)
 
 static const struct gcc_base_vtable vtable =
 {
-  GCC_FE_VERSION_0,
+  GCC_FE_VERSION_1,
   libcc1_set_arguments,
   libcc1_set_source_file,
   libcc1_set_print_callback,
@@ -523,7 +523,7 @@ struct gcc_c_context *
 gcc_c_fe_context (enum gcc_base_api_version base_version,
  enum gcc_c_api_version c_version)
 {
-  if (base_version != GCC_FE_VERSION_0 || c_version != GCC_C_FE_VERSION_0)
+  if (base_version != GCC_FE_VERSION_1 || c_version != GCC_C_FE_VERSION_0)
 return NULL;
 
   return new libcc1 (&vtable, &c_vtable);
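The strict version gate in gcc_c_fe_context can be illustrated with a small C sketch. All names here are made up for the example; it only shows the pattern of refusing any caller that does not request exactly the version the library implements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counterpart of the base API version enum.  */
enum toy_base_version
{
  TOY_FE_VERSION_0 = 0,
  TOY_FE_VERSION_1 = 1
};

struct toy_context
{
  enum toy_base_version version;
};

/* Hand out a context only to callers asking for the implemented
   version; anything else gets NULL, so there is no backward or
   forward compatibility in this model.  */
const struct toy_context *
toy_fe_context (enum toy_base_version requested)
{
  static const struct toy_context ctx = { TOY_FE_VERSION_1 };
  if (requested != TOY_FE_VERSION_1)
    return NULL;
  return &ctx;
}
```

A caller built against version 0 gets NULL from this gate instead of a context whose function signatures no longer match.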



[PATCH 3/5] libcc1: set debug compile: Display GCC driver filename

2015-04-21 Thread Jan Kratochvil
Hi,

as discussed in
How to use compile & execute function in GDB
https://sourceware.org/ml/gdb/2015-04/msg00026.html

GDB currently searches for /usr/bin/ARCH-OS-gcc and chooses one, but it does not
display which one.  It cannot, because the GCC method set_arguments() does not
yet know whether 'set debug compile' is enabled.

Unfortunately this changes the libcc1 API in an incompatible way.  One possible
hack to keep the API the same would be to pass the "-v" option explicitly to
set_arguments(); set_arguments() could then check for the "-v" string
and print the GCC filename accordingly.  Then the 'verbose' parameter of
compile() would lose its meaning.  What do you think?

GDB counterpart:
[PATCH 3/4] compile: set debug compile: Display GCC driver filename
https://sourceware.org/ml/gdb-patches/2015-04/msg00807.html
Message-ID: <20150421213649.14147.79719.st...@host1.jankratochvil.net>


Jan


include/ChangeLog
2015-04-21  Jan Kratochvil  

* gcc-interface.h (enum gcc_base_api_version): Add comment to
GCC_FE_VERSION_1.
(struct gcc_base_vtable): Move parameter verbose from compile to
set_arguments.

libcc1/ChangeLog
2015-04-21  Jan Kratochvil  

* libcc1.cc: Include intl.h.
(struct libcc1): Add field verbose.
(libcc1::libcc1): Initialize it.
(libcc1_set_arguments): Add parameter verbose, implement it.
(libcc1_compile): Remove parameter verbose, use self's field instead.
---
 include/gcc-interface.h |   14 +++---
 libcc1/libcc1.cc|   22 +-
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/gcc-interface.h b/include/gcc-interface.h
index dcfa6ce..dd9fd50 100644
--- a/include/gcc-interface.h
+++ b/include/gcc-interface.h
@@ -45,6 +45,8 @@ struct gcc_base_context;
 enum gcc_base_api_version
 {
   GCC_FE_VERSION_0 = 0,
+
+  /* Parameter verbose has been moved from compile to set_arguments.  */
   GCC_FE_VERSION_1 = 1,
 };
 
@@ -71,14 +73,15 @@ struct gcc_base_vtable
  The arguments are copied by GCC.  ARGV need not be
  NULL-terminated.  The arguments must be set separately for each
  compilation; that is, after a compile is requested, the
- previously-set arguments cannot be reused.
+ previously-set arguments cannot be reused.  VERBOSE can be set
+ to cause GCC to print some information as it works.  
 
  This returns NULL on success.  On failure, returns a malloc()d
  error message.  The caller is responsible for freeing it.  */
 
   char *(*set_arguments) (struct gcc_base_context *self,
  const char *triplet_regexp,
- int argc, char **argv);
+ int argc, char **argv, int /* bool */ verbose);
 
   /* Set the file name of the program to compile.  The string is
  copied by the method implementation, but the caller must
@@ -95,13 +98,10 @@ struct gcc_base_vtable
  void *datum);
 
   /* Perform the compilation.  FILENAME is the name of the resulting
- object file.  VERBOSE can be set to cause GCC to print some
- information as it works.  Returns true on success, false on
- error.  */
+ object file.  Returns true on success, false on error.  */
 
   int /* bool */ (*compile) (struct gcc_base_context *self,
-const char *filename,
-int /* bool */ verbose);
+const char *filename);
 
   /* Destroy this object.  */
 
diff --git a/libcc1/libcc1.cc b/libcc1/libcc1.cc
index afda023..d36073d 100644
--- a/libcc1/libcc1.cc
+++ b/libcc1/libcc1.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "xregex.h"
 #include "findcomp.hh"
 #include "compiler-name.h"
+#include "intl.h"
 
 struct libcc1;
 
@@ -66,6 +67,9 @@ struct libcc1 : public gcc_c_context
 
   std::vector<std::string> args;
   std::string source_file;
+
+  /* Non-zero as an equivalent to gcc driver option "-v".  */
+  bool verbose;
 };
 
 // A local subclass of connection that holds a back-pointer to the
@@ -97,7 +101,8 @@ libcc1::libcc1 (const gcc_base_vtable *v,
 print_function (NULL),
 print_datum (NULL),
 args (),
-source_file ()
+source_file (),
+verbose (false)
 {
   base.ops = v;
   c_ops = cv;
@@ -309,13 +314,19 @@ make_regexp (const char *triplet_regexp, const char 
*compiler)
 static char *
 libcc1_set_arguments (struct gcc_base_context *s,
  const char *triplet_regexp,
- int argc, char **argv)
+ int argc, char **argv, int verbose)
 {
   libcc1 *self = (libcc1 *) s;
   regex_t triplet;
   int code;
 
+  self->verbose = verbose != 0;
+
   std::string rx = make_regexp (triplet_regexp, COMPILER_NAME);
+  // Simulate fnotice by fprintf.
+  if (self->verbose)
+fprintf (stderr, _("searching for compiler matching regex %s\n"),
+rx.c_str());
   code = regco

[PATCH 1/5] libcc1: Make libcc1.so->libcc1.so.0

2015-04-21 Thread Jan Kratochvil
Hi,

the next [patch 3/5] will change the libcc1.so API.  I am not sure if the API
change gets approved that way but for such case:
(1) We really need to change GCC_FE_VERSION_0 -> GCC_FE_VERSION_1, this
feature is there for this purpose.  That is [patch 2/5].
(2) Currently GDB does only dlopen("libcc1.so") and then depending on which
libcc1.so version it would find first it would succeed/fail.
I guess it is more convenient to do dlopen("libcc1.so.1") instead
(where ".1"=".x" corresponds to GCC_FE_VERSION_x).
That is this patch (with x=0).
GCC_C_FE_LIBCC is used only by GDB.
(3) Currently there is no backward or forward compatibility, although it
could be implemented.  Personally I think the 'compile' feature is
still in an experimental stage, so it is OK to require the latest releases.
At least in Fedora we can keep GDB<->GCC in sync.
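
As a rough illustration of point (2), the sketch below builds a versioned
library name by stringifying a version token and pasting it onto the base
name.  FE_VERSION, FE_LIBCC and fe_library_name are hypothetical names used
only for this example; they are not part of gcc-c-interface.h.  Note that this
only yields a digit when the version token is a preprocessor macro; an
enumerator would be stringified by its identifier name instead.

```c
/* Two-level stringification so the argument is macro-expanded first.  */
#define XSTRINGIFY(x) #x
#define STRINGIFY(x) XSTRINGIFY (x)

/* Hypothetical version token; it must be a macro, not an enumerator.  */
#define FE_VERSION 0

/* Adjacent string literals are concatenated by the compiler.  */
#define FE_LIBCC "libcc1.so." STRINGIFY (FE_VERSION)

/* Return the versioned name a client would hand to dlopen.  */
const char *
fe_library_name (void)
{
  return FE_LIBCC;
}
```

A client could then pass fe_library_name () to dlopen and treat a failure to
load as a version mismatch.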

GDB counterpart:
[PATCH 1/4] compile: Use libcc1.so->libcc1.so.0
https://sourceware.org/ml/gdb-patches/2015-04/msg00805.html
Message-ID: <20150421213635.14147.15653.st...@host1.jankratochvil.net>


Jan


include/ChangeLog
2015-04-21  Jan Kratochvil  

* gcc-c-interface.h (GCC_C_FE_LIBCC): Quote it.  Append
GCC_FE_VERSION_0.
---
 include/gcc-c-interface.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/gcc-c-interface.h b/include/gcc-c-interface.h
index 25ef62f..1b73e32 100644
--- a/include/gcc-c-interface.h
+++ b/include/gcc-c-interface.h
@@ -197,7 +197,7 @@ struct gcc_c_context
 /* The name of the .so that the compiler builds.  We dlopen this
later.  */
 
-#define GCC_C_FE_LIBCC libcc1.so
+#define GCC_C_FE_LIBCC "libcc1.so." STRINGIFY (GCC_FE_VERSION_0)
 
 /* The compiler exports a single initialization function.  This macro
holds its name as a symbol.  */



Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state

2015-04-21 Thread Peter Bergner
On Fri, 2015-03-20 at 17:41 -0500, Peter Bergner wrote:
> On Fri, 2015-03-20 at 15:52 -0500, Segher Boessenkool wrote:
> > Maybe it would be nicer if the builtin-expansion code handled the copy
> > from cc, instead of stacking on RTL expanders.
> 
> That would allow getting rid of the expanders completely, which
> would be nice.  I'd have to somehow add some type of RS6000_BTC_*
> flag onto the builtin though, so I can tell during builtin expansion
> whether I need to emit the cr copy code or not.

Ok, the patch below implements your suggestion.


> > Expanders have no constraints (you can leave out the field completely).
> > Doesn't gen* warn on non-empty constraints?
> 
> Correct, and David mentioned this when I first submitted the original
> HTM patch, but I replied they were added to allow better error
> messages when people used out of range integers for builtin args:

This is a moot point now that the expanders are gone.


> > > --- gcc/testsuite/gcc.target/powerpc/htm-1.c  (revision 0)
> > > +++ gcc/testsuite/gcc.target/powerpc/htm-1.c  (working copy)
> > > @@ -0,0 +1,53 @@
> > > +/* { dg-do run { target { powerpc*-*-* && htm_hw } } } */
> > > +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
> > 
> > htm_hw already disallows Darwin?  [ And {"*"} {""} is default. ]

Fixed.


This patch also fixes some issues I hit with the tabortdc[i] and
htm_m[ft]spr_<mode> patterns when used with -m32 -mpowerpc64.
This passed bootstrap and regtesting with no regressions, so
how does this look for stage1?

I'd also like to backport this to the open release branches so
everything matches (obviously waiting until after 5.1).
Is that ok once I've verified bootstrap and regtesting on
each release branch?

Peter


gcc/

PR target/64579
* config/rs6000/htm.md: Remove all define_expands.
(tabort_internal, tabortdc_internal, tabortdci_internal,
tabortwc_internal, tabortwci_internal, tbegin_internal,
tcheck_internal, tend_internal, trechkpt_internal,
treclaim_internal, tsr_internal): Rename define_insns from this...
(tabort, tabortdc, tabortdci, tabortwc, tabortwci, tbegin,
tcheck, tend, trechkpt, treclaim, tsr): ...to this.
(tabort): Use gpc_reg_operand.
(tabortdc, tabortdci): Match DImode registers.
Add TARGET_POWERPC64 constraint.
(tcheck_internal): Remove operand.
(htm_mfspr_<mode>, htm_mtspr_<mode>): Use GPR mode macro.
* config/rs6000/htmxlintrin.h (__TM_end): Use _HTM_TRANSACTIONAL as
expected value.
* config/rs6000/rs6000-builtin.def (BU_HTM_SPR0): Remove.
(BU_HTM_SPR1): Rename to BU_HTM_V1.  Remove use of RS6000_BTC_SPR.
(tabort, tabortdc, tabortdci, tabortwc, tabortwci, tbegin,
tcheck, tend, tendall, trechkpt, treclaim, tresume, tsuspend,
tsr, ttest): Pass in the RS6000_BTC_CR attribute.
(get_tfhar, set_tfhar, get_tfiar, set_tfiar, get_texasr, set_texasr,
get_texasru, set_texasru): Pass in the RS6000_BTC_SPR attribute.
(tcheck): Remove builtin argument.
(ttest): Update pattern name.
* config/rs6000/rs6000.c (rs6000_htm_spr_icode): Use TARGET_POWERPC64
not TARGET_64BIT.
(htm_expand_builtin): Fix usage of expandedp.  Disallow usage of the
tabortdc and tabortdci builtins when not in 64-bit mode.
Modify code to handle the loss of the HTM define_expands.
Emit code to copy the CR register to TARGET.
(htm_init_builtins): Modify code to handle the loss of the HTM
define_expands.
* config/rs6000/rs6000.h (RS6000_BTC_32BIT): Delete.
(RS6000_BTC_64BIT): Likewise.
(RS6000_BTC_CR): New macro.
* doc/extend.texi: Update documentation for htm builtins.

gcc/testsuite/

PR target/64579
* gcc.target/powerpc/htm-1.c: New test.
* gcc.target/powerpc/htm-builtin-1.c (__builtin_tabortdc): Only test
on 64-bit compiles.
(__builtin_tabortdci): Likewise.
(__builtin_tcheck): Remove operand.
* lib/target-supports.exp (check_htm_hw_available): New function.

Index: gcc/config/rs6000/htm.md
===
--- gcc/config/rs6000/htm.md(revision 222127)
+++ gcc/config/rs6000/htm.md(working copy)
@@ -47,108 +47,38 @@ (define_c_enum "unspecv"
   ])
 
 
-(define_expand "tabort"
-  [(set (match_dup 2)
-   (unspec_volatile:CC [(match_operand:SI 1 "int_reg_operand" "")]
-   UNSPECV_HTM_TABORT))
-   (set (match_dup 3)
-   (eq:SI (match_dup 2)
-  (const_int 0)))
-   (set (match_operand:SI 0 "int_reg_operand" "")
-   (xor:SI (match_dup 3)
-   (const_int 1)))]
-  "TARGET_HTM"
-{
-  operands[2] = gen_rtx_REG (CCmode, CR0_REGNO);
-  operands[3] = gen_reg_rtx (SImode);
-})
-
-(define_insn "*tabort_internal"
+(define_insn "tabort"
   [(set (match_operand:CC 1 "cc_reg_operand" "=x")
-   (unspec_vola

Handle oacc kernels with other directives (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries  wrote:
> I'm submitting a patch series with initial support for the oacc kernels 
> directive.

Committed to gomp-4_0-branch in r88:

commit 7109b39defb87bc839983339c9fb4cdcb3891238
Author: tschwinge 
Date:   Tue Apr 21 20:32:01 2015 +

Handle oacc kernels with other directives

Mark directives with fn spec attributes to prevent them from acting as
optimization barrier.

gcc/
* builtin-attrs.def (DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
(ATTR_FNSPEC_DOT_DOT_r_r_r_NOTHROW_LIST): Add DEF_ATTR_TREE_LIST.
* omp-builtins.def (BUILT_IN_GOACC_DATA_START)
(BUILT_IN_GOACC_ENTER_EXIT_DATA, BUILT_IN_GOACC_UPDATE): Use
DEF_GOACC_BUILTIN_FNSPEC instead of DEF_GOACC_BUILTIN.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-data-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: New test.
* c-c++-common/goacc/kernels-loop-data-update.c: New test.
* c-c++-common/goacc/kernels-loop-data.c: New test.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: New
test.
* gfortran.dg/goacc/kernels-loop-data-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-update.f95: New test.
* gfortran.dg/goacc/kernels-loop-data.f95: New test.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: New
test.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-2.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit-2.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-update.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/kernels-parallel-loop-data-enter-exit.c:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95: New
test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95: New
test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: New test.
* testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95:
New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@88 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |6 ++
 gcc/builtin-attrs.def  |3 +
 gcc/omp-builtins.def   |   21 +++---
 gcc/testsuite/ChangeLog.gomp   |   15 +
 .../c-c++-common/goacc/kernels-loop-data-2.c   |   71 
 .../goacc/kernels-loop-data-enter-exit-2.c |   69 +++
 .../goacc/kernels-loop-data-enter-exit.c   |   66 ++
 .../c-c++-common/goacc/kernels-loop-data-update.c  |   66 ++
 .../c-c++-common/goacc/kernels-loop-data.c |   65 ++
 .../goacc/kernels-parallel-loop-data-enter-exit.c  |   67 ++
 .../gfortran.dg/goacc/kernels-loop-data-2.f95  |   52 ++
 .../goacc/kernels-loop-data-enter-exit-2.f95   |   52 ++
 .../goacc/kernels-loop-data-enter-exit.f95 |   50 ++
 .../gfortran.dg/goacc/kernels-loop-data-update.f95 |   49 ++
 .../gfortran.dg/goacc/kernels-loop-data.f95|   50 ++
 .../kernels-parallel-loop-data-enter-exit.f95  |   51 ++
 libgomp/ChangeLog.gomp |   24 +++
 .../kernels-loop-data-2.c  |   56 +++
 .../kernels-loop-data-enter-exit-2.c   |   54 +++
 .../kernels-loop-data-enter-exit.c |   51 ++
 .../kernels-loop-data-update.c |   53 +++
 .../libgomp.oacc-c-c++-common/kernels-loop-data.c  |   50 ++
 .../kernels-parallel-loop-data-enter-exit.c|   52 ++
 .../libgomp.oacc-fortran/kernels-loop-data-2.f95   |   38 +++
 .../kernels-loop-data-enter-exit-2.f95 |   38 +++
 .../kernels-loop-data-enter-exit.f95   |   36 ++
 .../kernels-loop-data-update.f95   |   36 ++
 .../libgomp.oacc-fortran/kernels-loop-data.f95 |   36 ++
 .../kernels-parallel-loop-data-enter-

Handle global loop counters in c/c++ oacc kernels (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries  wrote:
> I'm submitting a patch series with initial support for the oacc kernels 
> directive.

Committed to gomp-4_0-branch in r87:

commit abaf92b2db3c0799edac63cfb846af2dbde47423
Author: tschwinge 
Date:   Tue Apr 21 20:27:40 2015 +

Handle global loop counters in c/c++ oacc kernels

gcc/
* passes.def: Add pass_fre after pass_ch_oacc_kernels.

gcc/testsuite/
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
* c-c++-common/goacc/kernels-one-counter-var.c: New test.
* g++.dg/ipa/devirt-37.C: Update for new pass_fre.
* g++.dg/ipa/devirt-40.C: Likewise.
* g++.dg/tree-ssa/pr61034.C: Likewise.
* gcc.dg/ipa/ipa-pta-13.c: Likewise.
* gcc.dg/ipa/ipa-pta-3.c: Likewise.
* gcc.dg/ipa/ipa-pta-4.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@87 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |2 +
 gcc/passes.def |1 +
 gcc/testsuite/ChangeLog.gomp   |9 
 .../goacc/kernels-counter-vars-function-scope.c|   55 
 .../c-c++-common/goacc/kernels-one-counter-var.c   |   54 +++
 gcc/testsuite/g++.dg/ipa/devirt-37.C   |   12 ++---
 gcc/testsuite/g++.dg/ipa/devirt-40.C   |6 +--
 gcc/testsuite/g++.dg/tree-ssa/pr61034.C|   10 ++--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c  |6 +--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c   |6 +--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c   |6 +--
 11 files changed, 144 insertions(+), 23 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index f14c3718..b1933ba 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,7 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass_fre after pass_ch_oacc_kernels.
+
* passes.def: Add pass_scev_cprop to pass_oacc_kernels.
* tree-ssa-loop.c (pass_scev_cprop::clone): New function.
 
diff --git gcc/passes.def gcc/passes.def
index 3e85808..04cbba0 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_fre);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index eed22e2..ed80f5b 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,6 +1,15 @@
 2015-04-21  Tom de Vries  
Thomas Schwinge  
 
+   * c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
+   * c-c++-common/goacc/kernels-one-counter-var.c: New test.
+   * g++.dg/ipa/devirt-37.C: Update for new pass_fre.
+   * g++.dg/ipa/devirt-40.C: Likewise.
+   * g++.dg/tree-ssa/pr61034.C: Likewise.
+   * gcc.dg/ipa/ipa-pta-13.c: Likewise.
+   * gcc.dg/ipa/ipa-pta-3.c: Likewise.
+   * gcc.dg/ipa/ipa-pta-4.c: Likewise.
+
* gcc.dg/pr41488.c: Update for new pass_scev_cprop.
* gcc.dg/tree-ssa/loop-17.c: Likewise.
* gcc.dg/tree-ssa/loop-39.c: Likewise.
diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
new file mode 100644
index 000..06cdb29
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -0,0 +1,55 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+int
+main (void)
+{
+  unsigned int *__restrict a;
+  unsigned int *__restrict b;
+  unsigned int *__restrict c;
+  COUNTERTYPE i;
+  COUNTERTYPE ii;
+
+  a = (unsigned int *)malloc (N * sizeof (unsigned int));
+  b = (unsigned int *)malloc (N * sizeof (unsigned int));
+  c = (unsigned int *)malloc (N * sizeof (unsigned int));
+
+  for (i = 0; i < N; i++)
+a[i] = i * 2;
+
+  for (i = 0; i < N; i++)
+b[i] = i * 4;
+
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  {
+for (ii = 0; ii < N; ii++)
+  c[ii] = a[ii] + b[ii];
+  }
+
+  for (i = 0; i < N; i++)
+if (c[i] != a[i] + b[i])
+  abort ();
+
+  free (a);
+  free (b);
+  free (c);
+
+  return 0;
+}
+
+/* Check that only one loop is analyzed, and that it can be parallelized.  */
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops_oacc_kernels" } } */
+/* { dg-final { scan-tree-dump-

Handle global loop counters in fortran oacc kernels (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries  wrote:
> I'm submitting a patch series with initial support for the oacc kernels 
> directive.

Committed to gomp-4_0-branch in r86:

commit 0c33234340aa17536c2c86e0982c42070c89226b
Author: tschwinge 
Date:   Tue Apr 21 20:22:54 2015 +

Handle global loop counters in fortran oacc kernels

Loop counters cannot be given a scope limited to the kernels region, and
function scope inhibits parallelization.  At the technical level, it seems
possible to do DCE and get rid of the dead code that is inhibiting
parallelization (in other words, the code copying the loop iterator value
out of the region), but probably some effort would be involved.

Another possibility is to add an assignment of the final value of the loop
iteration variable after the loop to cut the dependency, though this will
only work for loops where that value is known at compile time -- which is
exactly what pass_scev_cprop does.
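
To make the second option concrete, here is a small C sketch (not taken from
the patch) of a loop whose counter lives at function scope and is still used
after the loop.  N and both function names are made up for the example.
Because the trip count is a compile-time constant, the counter's value after
the loop is known, so the later use can be rewritten to that constant and no
longer depends on the loop.

```c
#define N 8

/* The counter I is used after the loop, so its final value escapes
   and creates a dependency on the loop.  */
unsigned int
add_and_return_counter (const unsigned int *a, const unsigned int *b,
			unsigned int *c)
{
  unsigned int i;

  for (i = 0; i < N; i++)
    c[i] = a[i] + b[i];

  return i;
}

/* Hand-written equivalent after replacing the use of I with its
   final value: nothing after the loop reads the counter any more.  */
unsigned int
add_and_return_constant (const unsigned int *a, const unsigned int *b,
			 unsigned int *c)
{
  unsigned int i;

  for (i = 0; i < N; i++)
    c[i] = a[i] + b[i];

  return N;
}
```

Both functions compute the same result; the second form is only valid because
the final value is known without running the loop.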

gcc/
* passes.def: Add pass_scev_cprop to pass_oacc_kernels.
* tree-ssa-loop.c (pass_scev_cprop::clone): New function.

gcc/testsuite/
* gcc.dg/pr41488.c: Update for new pass_scev_cprop.
* gcc.dg/tree-ssa/loop-17.c: Likewise.
* gcc.dg/tree-ssa/loop-39.c: Likewise.
* gcc.dg/tree-ssa/scev-7.c: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: New test.
* gfortran.dg/goacc/kernels-loop.f95: New test.

libgomp/
* testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: New test.
* testsuite/libgomp.oacc-fortran/kernels-loop.f95: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@86 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |3 ++
 gcc/passes.def |1 +
 gcc/testsuite/ChangeLog.gomp   |7 +++
 gcc/testsuite/gcc.dg/pr41488.c |6 +--
 gcc/testsuite/gcc.dg/tree-ssa/loop-17.c|6 +--
 gcc/testsuite/gcc.dg/tree-ssa/loop-39.c|6 +--
 gcc/testsuite/gcc.dg/tree-ssa/scev-7.c |6 +--
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 |   46 
 gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   |   40 +
 gcc/tree-ssa-loop.c|1 +
 libgomp/ChangeLog.gomp |3 ++
 .../libgomp.oacc-fortran/kernels-loop-2.f95|   32 ++
 .../libgomp.oacc-fortran/kernels-loop.f95  |   28 
 13 files changed, 173 insertions(+), 12 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index bf0ee52..f14c3718 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass_scev_cprop to pass_oacc_kernels.
+   * tree-ssa-loop.c (pass_scev_cprop::clone): New function.
+
* passes.def: Add pass_parallelize_loops_oacc_kernels in pass group
pass_oacc_kernels.
* tree-parloops.c (create_parallel_loop, gen_parallel_loop): Add
diff --git gcc/passes.def gcc/passes.def
index 2d2e286..3e85808 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
+ NEXT_PASS (pass_scev_cprop);
  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
  NEXT_PASS (pass_expand_omp_ssa);
  NEXT_PASS (pass_tree_loop_done);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 2c6abff..eed22e2 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,6 +1,13 @@
 2015-04-21  Tom de Vries  
Thomas Schwinge  
 
+   * gcc.dg/pr41488.c: Update for new pass_scev_cprop.
+   * gcc.dg/tree-ssa/loop-17.c: Likewise.
+   * gcc.dg/tree-ssa/loop-39.c: Likewise.
+   * gcc.dg/tree-ssa/scev-7.c: Likewise.
+   * gfortran.dg/goacc/kernels-loop-2.f95: New test.
+   * gfortran.dg/goacc/kernels-loop.f95: New test.
+
* c-c++-common/goacc/kernels-loop-2.c: New test.
* c-c++-common/goacc/kernels-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n.c: New test.
diff --git gcc/testsuite/gcc.dg/pr41488.c gcc/testsuite/gcc.dg/pr41488.c
index c4bc428..1f306b4 100644
--- gcc/testsuite/gcc.dg/pr41488.c
+++ gcc/testsuite/gcc.dg/pr41488.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-sccp-scev" } */
+/* { dg-options "-O2 -fdump-tree-sccp2-scev" } */
 
 struct struct_t
 {
@@ -14,5 +14,5 @@ void foo (struct struct_t* sp, int start, int end)
 sp->data[i+start] = 0;
 }
 
-/* { dg-final { scan-tree-dump-times "Simplify PEELED_CHREC into POLYNOMIAL_CHREC" 1 "sccp" } } */
-/* { dg-final { cleanup-

RE: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo

2015-04-21 Thread Evandro Menezes
Kyrill,

Here's what I get on an Exynos M1:


$ cat /proc/cpuinfo   <
Processor   : AArch64 Processor rev 0 (aarch64)
...
Features: fp asimd aes pmull sha1 sha2 crc32
CPU implementer : 0x53
CPU architecture: AArch64
CPU variant : 0x0
CPU part: 0x001
CPU revision: 0
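
For anyone who wants to experiment with a dump like the one above, the
following is a minimal sketch of pulling a numeric field out of /proc/cpuinfo
text that is already in memory.  It is not the code from Kyrill's patch;
cpuinfo_field is a hypothetical helper, and it assumes the "key : value"
layout shown in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Find the first line of TEXT that starts with KEY and parse the
   number after its ':' (base 0, so "0x53" is read as hex).
   Return 1 and store the number in *VALUE on success, 0 otherwise.  */
int
cpuinfo_field (const char *text, const char *key, unsigned long *value)
{
  size_t key_len = strlen (key);
  const char *line = text;

  while (line != NULL && *line != '\0')
    {
      if (strncmp (line, key, key_len) == 0)
	{
	  const char *colon = strchr (line, ':');
	  const char *eol = strchr (line, '\n');

	  /* Only accept a ':' that belongs to this line.  */
	  if (colon != NULL && (eol == NULL || colon < eol))
	    {
	      char *end;

	      *value = strtoul (colon + 1, &end, 0);
	      /* END is unchanged when no digits were consumed.  */
	      return end != colon + 1;
	    }
	}

      /* Advance to the start of the next line, if any.  */
      line = strchr (line, '\n');
      if (line != NULL)
	line++;
    }

  return 0;
}
```

Calling it with the keys "CPU implementer" and "CPU part" gives the pair that
would be matched against the core definitions.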

Please, let me know if you need any help.

Thank you,

-- 
Evandro Menezes  Austin, TX

> -----Original Message-----
> From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com]
> Sent: Monday, April 20, 2015 10:48
> To: GCC Patches
> Cc: Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Evandro Menezes;
> Andrew Pinski; James Greenhalgh
> Subject: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only
> /proc/cpuinfo
> 
> Hi all,
> 
> This is an attempt to add native CPU detection to AArch64 GNU/Linux targets.
> Similar to other ports we use SPEC rewriting to rewrite
> -m{cpu,tune,arch}=native options into the appropriate CPU/architecture and the
> architecture extension options when appropriate (i.e. +crypto/+crc etc).
> 
> For CPU/architecture detection it gets a bit involved, especially when
> running on a big.LITTLE system. My proposed approach is to look at
> /proc/cpuinfo/ and search for the implementer id and part number fields that
> uniquely identify each core (appropriate identifying information is added to
> aarch64-cores.def). If we find two types of core we have a big.LITTLE system,
> so search through the core definitions extracted from aarch64-cores.def to
> find if we support such a combination (currently only cortex-a57.cortex-a53
> and cortex-a72.cortex-a53) and make sure that the implementer id field
> matches up.
> 
> I tested this on a 4xCortex-A53 + 2xCortex-A57 big.LITTLE Ubuntu GNU/Linux
> system.
> There are two formats for /proc/cpuinfo/ that I'm aware of. The first (old)
> one has the format:
> --
> processor: 0
> processor: 1
> processor: 2
> processor: 3
> processor: 4
> processor: 5
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer: 0x41
> CPU architecture: AArch64
> CPU variant: 0x0
> CPU part: 0xd03
> --
> 
> In this format it lists the 6 cores but the CPU part it reports is only the
> one for the core from which /proc/cpuinfo was read (!), in this case one
> of the Cortex-A53 cores.
> This means we detect a different CPU depending on which core GCC was invoked
> on. Not ideal really, but there's no more information that we can extract.
> Given the /proc/cpuinfo above, this patch will rewrite -mcpu=native into
> -mcpu=cortex-a53+fp+simd+crypto+crc
> 
> The newer /proc/cpuinfo format proposed at
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=44b82b7700d05a52cd983799d3ecde1a976b3bed
> looks like this:
> 
> --
> processor   : 0
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd03
> CPU revision: 0
> 
> processor   : 1
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd03
> CPU revision: 0
> 
> processor   : 2
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd03
> CPU revision: 0
> 
> processor   : 3
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd03
> CPU revision: 0
> 
> processor   : 4
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd07
> CPU revision: 0
> 
> processor   : 5
> Features: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part: 0xd07
> CPU revision: 0
> --
> 
> The Features field is used to detect the architectural features that we map
> to GCC option extensions i.e. +fp,+crypto,+simd,+crc etc.
> 
> Similarly, -march=native would be rewritten into
> -march=armv8-a+fp+simd+crypto+crc while -mtune=native into
> -mtune=cortex-a57.cortex-a53 (the arch extension options are not valid for
> -mtune).
> 
> If it detects more than one implementer ID, or the implementer IDs do not
> match up somewhere, or there is some other weirdness in /proc/cpuinfo, or it
> fails to recognise the CPU, it will bail out and ignore the option entirely
> (similarly to other ports).
> 
> The patch works fine with both /proc/cpuinfo formats although, as mentioned
> above, it will not be able to detect the

Re: [PATCH, 7/8] Add pass_parallelize_loops_oacc_kernels to pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:42:28 +0100, Tom de Vries  wrote:
> On 15-11-14 18:23, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds:
> > - a specialized version of pass_parallelize_loops called
> >  pass_parloops_oacc_kernels to pass group pass_oacc_kernels, and
> > - relevant test-cases.
> >
> > The pass only handles loops that are in a kernels region, and skips over
> > bits of pass_parallelize_loops that are already done for oacc kernels.
> >
> > The pass reintroduces the use of omp_expand_local, I haven't managed to
> > make it work yet using the external pass pass_expand_omp_ssa.
> >
> > An obvious limitation of the patch is the fact that we copy over the clauses
> > from the kernels directive to the generated parallel directive. We'll need
> > to do something more intelligent here, f.i. setting vector_length based on
> > the parallelization factor.
> >
> > Another limitation is that the pass still needs -ftree-parallelize-loops to
> > trigger.
> >
> 
> Updated for using pass_copyprop instead of pass_ccp in pass_oacc_kernels.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?

Committed to gomp-4_0-branch in r85:

commit 74e09b9dbbe43321fb20b0174f926893bf2111bc
Author: tschwinge 
Date:   Tue Apr 21 20:06:16 2015 +

Add pass_parallelize_loops_oacc_kernels to pass_oacc_kernels

gcc/
* passes.def: Add pass_parallelize_loops_oacc_kernels in pass group
pass_oacc_kernels.
* tree-parloops.c (create_parallel_loop, gen_parallel_loop): Add
function parameters region_entry and bool oacc_kernels_p.  Handle
oacc_kernels_p.
Call create_parallel_loop with additional args.
(parallelize_loops): Add function parameter oacc_kernels_p.  Calculate
dominance info.  Skip loops that are not in a kernels region. Call
gen_parallel_loop with additional args.
(pass_parallelize_loops::execute): Call parallelize_loops with false
argument.
(pass_data_parallelize_loops_oacc_kernels): New pass_data.
(class pass_parallelize_loops_oacc_kernels): New pass.
(pass_parallelize_loops_oacc_kernels::execute)
(make_pass_parallelize_loops_oacc_kernels): New function.
* tree-pass.h (make_pass_parallelize_loops_oacc_kernels): Declare.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-2.c: New test.
* c-c++-common/goacc/kernels-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n.c: New test.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: New test.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c:
New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@85 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |   17 ++
 gcc/passes.def |1 +
 gcc/testsuite/ChangeLog.gomp   |5 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |   62 +
 .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |   53 
 gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |   48 
 gcc/testsuite/c-c++-common/goacc/kernels-loop.c|   53 
 gcc/tree-parloops.c|  282 
 gcc/tree-pass.h|2 +
 libgomp/ChangeLog.gomp |9 +
 .../libgomp.oacc-c-c++-common/kernels-loop-2.c |   47 
 .../kernels-loop-mod-not-zero.c|   41 +++
 .../libgomp.oacc-c-c++-common/kernels-loop-n.c |   47 
 .../libgomp.oacc-c-c++-common/kernels-loop.c   |   41 +++
 14 files changed, 650 insertions(+), 58 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 0be9191..bf0ee52 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,22 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass_parallelize_loops_oacc_kern

Re: [PATCH, 6/8] Add pass_copy_prop in pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:38:55 +0100, Tom de Vries  wrote:
> On 15-11-14 18:22, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds pass_loop_ccp to pass group pass_oacc_kernels.
> >
> > We need this pass to simplify the loop body, and allow pass_parloops to
> > detect that loop iterations are independent.
> >
> 
> As suggested here ( https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02993.html )
> I've replaced the pass_ccp with pass_copyprop, which performs trivial constant
> propagation in addition to copy propagation.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?

Committed to gomp-4_0-branch in r84:

commit 1c2529b64620811cbff4a50374af797ee52ef5f8
Author: tschwinge 
Date:   Tue Apr 21 19:58:54 2015 +

Add pass_copy_prop in pass_oacc_kernels

gcc/
* passes.def: Add pass_copy_prop to pass group pass_oacc_kernels.
* tree-ssa-copy.c (stmt_may_generate_copy): Handle .omp_data_i init
conservatively.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@84 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp  |4 
 gcc/passes.def  |1 +
 gcc/tree-ssa-copy.c |4 
 3 files changed, 9 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 98e33ad..0be9191 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,9 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass_copy_prop to pass group pass_oacc_kernels.
+   * tree-ssa-copy.c (stmt_may_generate_copy): Handle .omp_data_i init
+   conservatively.
+
* passes.def: Add pass_lim in pass group pass_ch_oacc_kernels.
 
* passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
diff --git gcc/passes.def gcc/passes.def
index e6c9287..e6f1c33 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -93,6 +93,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_ch_oacc_kernels);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
+ NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_expand_omp_ssa);
  NEXT_PASS (pass_tree_loop_done);
  POP_INSERT_PASSES ()
diff --git gcc/tree-ssa-copy.c gcc/tree-ssa-copy.c
index 5ae8e6c..6f35f99 100644
--- gcc/tree-ssa-copy.c
+++ gcc/tree-ssa-copy.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-dom.h"
 #include "tree-ssa-loop-niter.h"
+#include "omp-low.h"
 
 
 /* This file implements the copy propagation pass and provides a
@@ -116,6 +117,9 @@ stmt_may_generate_copy (gimple stmt)
   if (gimple_has_volatile_ops (stmt))
 return false;
 
+  if (gimple_stmt_omp_data_i_init_p (stmt))
+return false;
+
   /* Statements with loads and/or stores will never generate a useful copy.  */
   if (gimple_vuse (stmt))
 return false;


Grüße,
 Thomas




Re: [PATCH, 5/8] Add pass_lim to pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:30:52 +0100, Tom de Vries  wrote:
> On 15-11-14 18:22, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds pass_loop_im to pass group pass_oacc_kernels.
> >
> > We need this pass to simplify the loop body, and allow pass_parloops to
> > detect that loop iterations are independent.
> >
> 
> Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?

Committed to gomp-4_0-branch in r83:

commit 79112043cabc81c3a283585c9a28b6a1ab3826df
Author: tschwinge 
Date:   Tue Apr 21 19:55:42 2015 +

Add pass_lim to pass_oacc_kernels

gcc/
* passes.def: Add pass_lim in pass group pass_ch_oacc_kernels.

gcc/testsuite/
* c-c++-common/restrict-2.c: Update for new pass_lim.
* c-c++-common/restrict-4.c: Same.
* g++.dg/tree-ssa/pr33615.C: Same.
* g++.dg/tree-ssa/restrict1.C: Same.
* gcc.dg/tm/pub-safety-1.c: Same.
* gcc.dg/tm/reg-promotion.c: Same.
* gcc.dg/tree-ssa/20050314-1.c: Same.
* gcc.dg/tree-ssa/loop-32.c: Same.
* gcc.dg/tree-ssa/loop-33.c: Same.
* gcc.dg/tree-ssa/loop-34.c: Same.
* gcc.dg/tree-ssa/loop-35.c: Same.
* gcc.dg/tree-ssa/loop-7.c: Same.
* gcc.dg/tree-ssa/pr23109.c: Same.
* gcc.dg/tree-ssa/restrict-3.c: Same.
* gcc.dg/tree-ssa/restrict-5.c: Same.
* gcc.dg/tree-ssa/ssa-lim-1.c: Same.
* gcc.dg/tree-ssa/ssa-lim-10.c: Same.
* gcc.dg/tree-ssa/ssa-lim-11.c: Same.
* gcc.dg/tree-ssa/ssa-lim-12.c: Same.
* gcc.dg/tree-ssa/ssa-lim-2.c: Same.
* gcc.dg/tree-ssa/ssa-lim-3.c: Same.
* gcc.dg/tree-ssa/ssa-lim-6.c: Same.
* gcc.dg/tree-ssa/ssa-lim-7.c: Same.
* gcc.dg/tree-ssa/ssa-lim-8.c: Same.
* gcc.dg/tree-ssa/ssa-lim-9.c: Same.
* gcc.dg/tree-ssa/structopt-1.c: Same.
* gfortran.dg/pr32921.f: Same.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@83 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp  |2 ++
 gcc/passes.def  |1 +
 gcc/testsuite/ChangeLog.gomp|   31 +++
 gcc/testsuite/c-c++-common/restrict-2.c |6 +++---
 gcc/testsuite/c-c++-common/restrict-4.c |6 +++---
 gcc/testsuite/g++.dg/tree-ssa/pr33615.C |6 +++---
 gcc/testsuite/g++.dg/tree-ssa/restrict1.C   |6 +++---
 gcc/testsuite/gcc.dg/tm/pub-safety-1.c  |6 +++---
 gcc/testsuite/gcc.dg/tm/reg-promotion.c |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/20050314-1.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-32.c |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-33.c |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-34.c |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c |8 +++
 gcc/testsuite/gcc.dg/tree-ssa/loop-7.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr23109.c |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/restrict-3.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/restrict-5.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-1.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-10.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-11.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-12.c  |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-2.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-3.c   |8 +++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-6.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-7.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-9.c   |6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/structopt-1.c |6 +++---
 gcc/testsuite/gfortran.dg/pr32921.f |6 +++---
 30 files changed, 117 insertions(+), 83 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1fb060f..98e33ad 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,7 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass_lim in pass group pass_ch_oacc_kernels.
+
* passes.def: Run pass_tree_loo

Re: [PATCH, 4/8] Add pass_tree_loop_{init,done} to pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:29:28 +0100, Tom de Vries  wrote:
> On 15-11-14 18:21, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds pass_tree_loop_init and pass_tree_loop_done to
> > pass_oacc_kernels.
> >
> > Pass_parallelize_loops is run between these passes in the pass group
> > pass_tree_loop, since it requires loop information.  We do the same for
> > pass_oacc_kernels.
> >
> 
> Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?

Committed to gomp-4_0-branch in r82:

commit cb95b4a1efcdb96c58cda986d53b20c3537c1ab7
Author: tschwinge 
Date:   Tue Apr 21 19:51:33 2015 +

Add pass_tree_loop_{init,done} to pass_oacc_kernels

gcc/
* passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
group pass_oacc_kernels.
* tree-ssa-loop.c (pass_tree_loop_init::clone)
(pass_tree_loop_done::clone): New function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@82 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp  |5 +
 gcc/passes.def  |2 ++
 gcc/tree-ssa-loop.c |2 ++
 3 files changed, 9 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index d00c5e0..1fb060f 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,10 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
+   group pass_oacc_kernels.
+   * tree-ssa-loop.c (pass_tree_loop_init::clone)
+   (pass_tree_loop_done::clone): New function.
+
* omp-low.c (loop_in_oacc_kernels_region_p): New function.
* omp-low.h (loop_in_oacc_kernels_region_p): Declare.
* passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
diff --git gcc/passes.def gcc/passes.def
index 5cdbc87..83ae04e 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -91,7 +91,9 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_expand_omp_ssa);
+ NEXT_PASS (pass_tree_loop_done);
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_merge_phi);
  NEXT_PASS (pass_cd_dce);
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index a041858..2a96a39 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -272,6 +272,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *);
+  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
 
 }; // class pass_tree_loop_init
 
@@ -566,6 +567,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
+  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
 
 }; // class pass_tree_loop_done
 


Regards,
 Thomas




Re: [PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:27:34 +0100, Tom de Vries  wrote:
> On 15-11-14 18:21, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> Hi,
> >>
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds a pass_ch_oacc_kernels to the pass group pass_oacc_kernels.
> >
> > The idea is that pass_parallelize_loops only deals with loops for which the
> > header has been copied, so the easiest way to meet that requirement when
> > running pass_parallelize_loops in group pass_oacc_kernels is to run pass_ch
> > as a part of pass_oacc_kernels.
> >
> > We define a separate pass pass_ch_oacc_kernels, to leave all loops that
> > aren't part of a kernels region alone.
> >
> 
> Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?
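
The header-copying requirement described in the quoted text can be pictured at
the source level.  What follows is only a hand-written C sketch of what the
transformation amounts to, not output from pass_ch; the names sum_before and
sum_after are made up for illustration.

```c
#include <stddef.h>

/* Loop as written: the exit test sits in the loop header, so the header
   runs before every iteration, including the first one.  */
int
sum_before (const int *a, size_t n)
{
  int s = 0;
  size_t i = 0;
  while (i < n)
    {
      s += a[i];
      i++;
    }
  return s;
}

/* Same loop after the header has been copied: the first test guards the
   whole loop, and the loop itself becomes a do-while with the test at the
   bottom.  */
int
sum_after (const int *a, size_t n)
{
  int s = 0;
  size_t i = 0;
  if (i < n)
    do
      {
        s += a[i];
        i++;
      }
    while (i < n);
  return s;
}
```

Both functions compute the same result; the second shape is the one a pass
that expects copied headers would see.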

Committed to gomp-4_0-branch in r81:

commit 58c33a7965c379b55b549d50e3b79b2252bcc876
Author: tschwinge 
Date:   Tue Apr 21 19:48:16 2015 +

Add pass_ch_oacc_kernels to pass_oacc_kernels

gcc/
* omp-low.c (loop_in_oacc_kernels_region_p): New function.
* omp-low.h (loop_in_oacc_kernels_region_p): Declare.
* passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
* tree-pass.h (make_pass_ch_oacc_kernels): Declare
* tree-ssa-loop-ch.c: Include omp-low.h.
(pass_ch_execute): Declare.
(pass_ch::execute): Factor out ...
(pass_ch_execute): ... this new function.  If handling oacc kernels,
skip loops that are not in oacc kernels region.
(pass_ch_oacc_kernels::execute):
(pass_data_ch_oacc_kernels): New pass_data.
(class pass_ch_oacc_kernels): New pass.
(pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@81 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |   15 
 gcc/omp-low.c  |   91 
 gcc/omp-low.h  |2 ++
 gcc/passes.def |1 +
 gcc/tree-pass.h|1 +
 gcc/tree-ssa-loop-ch.c |   59 +--
 6 files changed, 167 insertions(+), 2 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 8a53ad8..d00c5e0 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,20 @@
 2015-04-21  Tom de Vries  
 
+   * omp-low.c (loop_in_oacc_kernels_region_p): New function.
+   * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
+   * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
+   * tree-pass.h (make_pass_ch_oacc_kernels): Declare
+   * tree-ssa-loop-ch.c: Include omp-low.h.
+   (pass_ch_execute): Declare.
+   (pass_ch::execute): Factor out ...
+   (pass_ch_execute): ... this new function.  If handling oacc kernels,
+   skip loops that are not in oacc kernels region.
+   (pass_ch_oacc_kernels::execute):
+   (pass_data_ch_oacc_kernels): New pass_data.
+   (class pass_ch_oacc_kernels): New pass.
+   (pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
+   function.
+
* passes.def: Add pass group pass_oacc_kernels.
* tree-pass.h (make_pass_oacc_kernels): Declare.
* tree-ssa-loop.c (gate_oacc_kernels): New static function.
diff --git gcc/omp-low.c gcc/omp-low.c
index 16d9a5e..1b03ae6 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -13920,4 +13920,95 @@ gimple_stmt_omp_data_i_init_p (gimple stmt)
   SSA_OP_DEF);
 }
 
+/* Return true if LOOP is inside a kernels region.  */
+
+bool
+loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
+  basic_block *region_exit)
+{
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap region_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap_clear (region_bitmap);
+
+  if (region_entry != NULL)
+*region_entry = NULL;
+  if (region_exit != NULL)
+*region_exit = NULL;
+
+  basic_block bb;
+  gimple last;
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  if (bitmap_bit_p (region_bitmap, bb->index))
+   continue;
+
+  last = last_stmt (bb);
+  if (!last)
+   continue;
+
+

Re: [PATCH, 2/8] Add pass_oacc_kernels

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:25:35 +0100, Tom de Vries  wrote:
> On 15-11-14 18:20, Tom de Vries wrote:
> > On 15-11-14 13:14, Tom de Vries wrote:
> >> I'm submitting a patch series with initial support for the oacc kernels
> >> directive.
> >>
> >> The patch series uses pass_parallelize_loops to implement parallelization
> >> of loops in the oacc kernels region.
> >>
> >> The patch series consists of these 8 patches:
> >> ...
> >>  1  Expand oacc kernels after pass_build_ealias
> >>  2  Add pass_oacc_kernels
> >>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>  5  Add pass_loop_im to pass_oacc_kernels
> >>  6  Add pass_ccp to pass_oacc_kernels
> >>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>  8  Do simple omp lowering for no address taken var
> >> ...
> >
> > This patch adds a pass group pass_oacc_kernels.
> >
> > The rationale is that we want a pass group to run oacc kernels region
> > related (optimization) passes in.
> >
> 
> Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?

Committed to gomp-4_0-branch in r80:

commit 0ac5f6ae679a0cd70b197f0962d7d365e7dfbd21
Author: tschwinge 
Date:   Tue Apr 21 19:45:23 2015 +

Add pass_oacc_kernels

gcc/
* passes.def: Add pass group pass_oacc_kernels.
* tree-pass.h (make_pass_oacc_kernels): Declare.
* tree-ssa-loop.c (gate_oacc_kernels): New static function.
(pass_data_oacc_kernels): New pass_data.
(class pass_oacc_kernels): New pass.
(make_pass_oacc_kernels): New function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@80 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp  |7 +++
 gcc/passes.def  |7 ++-
 gcc/tree-pass.h |1 +
 gcc/tree-ssa-loop.c |   45 +
 4 files changed, 59 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1f86160..8a53ad8 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,12 @@
 2015-04-21  Tom de Vries  
 
+   * passes.def: Add pass group pass_oacc_kernels.
+   * tree-pass.h (make_pass_oacc_kernels): Declare.
+   * tree-ssa-loop.c (gate_oacc_kernels): New static function.
+   (pass_data_oacc_kernels): New pass_data.
+   (class pass_oacc_kernels): New pass.
+   (make_pass_oacc_kernels): New function.
+
* omp-low.c: Include gimple-pretty-print.h.
(release_first_vuse_in_edge_dest): New function.
(expand_omp_target): When not in ssa, don't split off oacc kernels
diff --git gcc/passes.def gcc/passes.def
index db0dd18..854c5b8 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -86,7 +86,12 @@ along with GCC; see the file COPYING3.  If not see
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
  NEXT_PASS (pass_fre);
- NEXT_PASS (pass_expand_omp_ssa);
+ /* Pass group that runs when there are oacc kernels in the
+function.  */
+ NEXT_PASS (pass_oacc_kernels);
+ PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+ NEXT_PASS (pass_expand_omp_ssa);
+ POP_INSERT_PASSES ()
  NEXT_PASS (pass_merge_phi);
  NEXT_PASS (pass_cd_dce);
  NEXT_PASS (pass_early_ipa_sra);
diff --git gcc/tree-pass.h gcc/tree-pass.h
index b59ae7a..35778f2 100644
--- gcc/tree-pass.h
+++ gcc/tree-pass.h
@@ -450,6 +450,7 @@ extern gimple_opt_pass *make_pass_strength_reduction 
(gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ubsan (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index ccb8f97..a041858 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -163,6 +163,51 @@ make_pass_tree_loop (gcc::context *ctxt)
   return new pass_tree_loop (ctxt);
 }
 
+/* Gate for oacc kernels pass group.  */
+
+static bool
+gate_oacc_kernels (function *fn)
+{
+  return (fn->curr_properties & PROP_gimple_eomp) == 0;
+}
+
+/* The oacc kernels superpass.  */
+
+namespace {
+
+const pass_data pass_data_oacc_kernels =
+{
+  GIMPLE_PASS, /* type */
+  "oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_LOOP, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_oacc_kernels : public gimple_opt_pass
+{
+public:
+  pass_oacc_kernels (gcc::context *ctxt)
+: gimple_opt_pass (pass_data_oacc_ker

Expand oacc kernels after pass_fre (was: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias)

2015-04-21 Thread Thomas Schwinge
Hi!

On Tue, 25 Nov 2014 12:22:02 +0100, Tom de Vries  wrote:
> On 24-11-14 11:56, Tom de Vries wrote:
> > On 15-11-14 18:19, Tom de Vries wrote:
> >> On 15-11-14 13:14, Tom de Vries wrote:
> >>> I'm submitting a patch series with initial support for the oacc kernels
> >>> directive.
> >>>
> >>> The patch series uses pass_parallelize_loops to implement parallelization
> >>> of loops in the oacc kernels region.
> >>>
> >>> The patch series consists of these 8 patches:
> >>> ...
> >>>  1  Expand oacc kernels after pass_build_ealias
> >>>  2  Add pass_oacc_kernels
> >>>  3  Add pass_ch_oacc_kernels to pass_oacc_kernels
> >>>  4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
> >>>  5  Add pass_loop_im to pass_oacc_kernels
> >>>  6  Add pass_ccp to pass_oacc_kernels
> >>>  7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
> >>>  8  Do simple omp lowering for no address taken var
> >>> ...
> >>
> >> This patch moves omp expansion of the oacc kernels directive to after
> >> pass_build_ealias.
> >>
> >> The rationale is that in order to use pass_parallelize_loops for analysis
> >> and transformation of an oacc kernels region, we postpone omp expansion of
> >> that region until the earliest point in the pass list where enough
> >> information is available to run pass_parallelize_loops, in other words,
> >> after pass_build_ealias.
> >>
> >> The patch postpones expansion in expand_omp, and ensures expansion by
> >> adding pass_expand_omp_ssa:
> >> - after pass_build_ealias, and
> >> - after pass_all_early_optimizations for the case we're not optimizing.
> >>
> >> In order to make sure the oacc kernels region arrives at
> >> pass_expand_omp_ssa the way it left expand_omp, the patch makes pass_ccp
> >> and pass_forwprop aware of lowered omp code, to handle it conservatively.
> >>
> >> The patch contains changes in expand_omp_target to deal with ssa-code,
> >> similar to what is already present in expand_omp_taskreg.
> >>
> >> Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to
> >> not be static for oacc kernels. It does this to get some references to
> >> .omp_data_sizes and .omp_data_kinds in the ssa code.  Without these
> >> references, the definitions will be removed. The reference of the
> >> variables in GIMPLE_OACC_KERNELS is not enough to have them not removed.
> >> [ In vries/oacc-kernels, I used a BUILT_IN_USE kludge for this purpose ].
> >>
> >> Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the
> >> original function of which the definition has been removed (as in moved to
> >> the split off function). TODO_remove_unused_locals takes care of some of
> >> them, but not the anonymous ones. So the patch iterates over all SSA_NAMEs
> >> to find these dangling SSA_NAMEs and releases them.
> >>
> >
> > Reposting with small update: I've replaced the use of the rather generic
> > gimple_stmt_omp_lowering_p with the more specific
> > gimple_stmt_omp_data_i_init_p.
> >
> > Bootstrapped and reg-tested in the same way as before.
> >
> 
> I've moved pass_expand_omp_ssa one down in the pass list, past pass_fre.
> 
> This allows fre to unify references to the same omp variable before entering 
> pass_oacc_kernels, which helps pass_lim in pass_oacc_kernels.
> 
> For instance, this reduction fragment:
> ...
># VUSE <.MEM_8>
># PT = { D.2282 }
>_67 = .omp_data_i_59->sumD.2270;
># VUSE <.MEM_8>
>_68 = *_67;
> 
>_70 = _66 + _68;
> 
># VUSE <.MEM_8>
># PT = { D.2282 }
>_69 = .omp_data_i_59->sumD.2270;
># .MEM_71 = VDEF <.MEM_8>
>*_69 = _70;
> ...
> 
> is transformed by fre into:
> ...
># VUSE <.MEM_8>
># PT = { D.2282 }
>_67 = .omp_data_i_59->sumD.2270;
># VUSE <.MEM_8>
>_68 = *_67;
> 
>_70 = _66 + _68;
> 
># .MEM_71 = VDEF <.MEM_8>
>*_67 = _70;
> ...
> 
> In order for pass_fre to respect the kernels region boundaries, I've added a 
> change in tree-ssa-sccvn.c:visit_use to handle the .omp_data_i init 
> conservatively.
> 
> Bootstrapped and reg-tested as before.
> 
> OK for trunk?
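
The FRE transformation quoted above can be pictured at the source level.  This
is a hand-written C sketch only, assuming a made-up struct omp_data with a
single field in place of the real .omp_data_i record; reduce_before and
reduce_after are illustrative names, not functions from the patch.

```c
/* Hypothetical stand-in for the .omp_data_i record; only the field used
   in the quoted fragment is modelled.  */
struct omp_data
{
  int *sum;
};

/* Before FRE: the pointer field is loaded twice, once for the read and
   once for the store (compare _67 and _69 in the dump).  */
void
reduce_before (struct omp_data *data, int val)
{
  int *p1 = data->sum;
  int tmp = val + *p1;
  int *p2 = data->sum;
  *p2 = tmp;
}

/* After FRE: the second load is recognised as redundant and the store
   reuses the first pointer (only _67 remains).  */
void
reduce_after (struct omp_data *data, int val)
{
  int *p = data->sum;
  int tmp = val + *p;
  *p = tmp;
}
```

With a single pointer feeding both the load and the store, a later pass such
as pass_lim has one address to reason about instead of two.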

Committed to gomp-4_0-branch in r79:

commit 93557ac5e30c26ee1a3d1255e31265b287171a0d
Author: tschwinge 
Date:   Tue Apr 21 19:37:19 2015 +

Expand oacc kernels after pass_fre

gcc/
* omp-low.c: Include gimple-pretty-print.h.
(release_first_vuse_in_edge_dest): New function.
(expand_omp_target): When not in ssa, don't split off oacc kernels
region, clear PROP_gimple_eomp in cfun->curr_properties to force later
expansion, and add GOACC_kernels_internal call.
When in ssa, split off oacc kernels and convert GOACC_kernels_internal
into GOACC_kernels call.  Handle ssa-code.
(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
properties_provided field.
(pass_expand_omp::execute): Set PROP_g

Add BUILT_IN_GOACC_KERNELS_INTERNAL (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries  wrote:
> I'm submitting a patch series with initial support for the oacc kernels 
> directive.
> 
> The patch series uses pass_parallelize_loops to implement parallelization of 
> loops in the oacc kernels region.

Committed to gomp-4_0-branch in r78:

commit fd3add90d38d5f1b38c9cb557404542b6383b2b0
Author: tschwinge 
Date:   Tue Apr 21 19:24:57 2015 +

Add BUILT_IN_GOACC_KERNELS_INTERNAL

..., a variant of the GOACC_kernels builtin.  This variant does not call the
function passed as function pointer, and therefore is less of an
optimization barrier than the original variant.

The purpose of this variant is to allow the introduction of the
GOACC_kernels call before splitting off the region body into a function
(something that is currently done simultaneously).

gcc/
* builtin-attrs.def (DOT_DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
(ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST): Add
DEF_ATTR_TREE_LIST.
* omp-builtins.def (BUILT_IN_GOACC_KERNELS_INTERNAL): Add
DEF_GOACC_BUILTIN_FNSPEC.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@78 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp|6 ++
 gcc/builtin-attrs.def |4 
 gcc/omp-builtins.def  |5 +
 3 files changed, 15 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index b091dd5..7885189 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,11 @@
 2015-04-21  Tom de Vries  
 
+   * builtin-attrs.def (DOT_DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
+   (ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST): Add
+   DEF_ATTR_TREE_LIST.
+   * omp-builtins.def (BUILT_IN_GOACC_KERNELS_INTERNAL): Add
+   DEF_GOACC_BUILTIN_FNSPEC.
+
* builtins.def (DEF_GOACC_BUILTIN_FNSPEC): Define.
 
 2015-03-21  Tom de Vries  
diff --git gcc/builtin-attrs.def gcc/builtin-attrs.def
index 1338644..8eca053 100644
--- gcc/builtin-attrs.def
+++ gcc/builtin-attrs.def
@@ -64,6 +64,7 @@ DEF_ATTR_FOR_INT (6)
   DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL, \
  ATTR_##ENUM, ATTR_NULL)
 DEF_ATTR_FOR_STRING (STR1, "1")
+DEF_ATTR_FOR_STRING (DOT_DOT_DOT_r_r_r, "...rrr")
 #undef DEF_ATTR_FOR_STRING
 
 /* Construct a tree for a list of two integers.  */
@@ -127,6 +128,9 @@ DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LIST, ATTR_PURE,  
\
ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LEAF_LIST, ATTR_PURE,\
ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST, \
+   ATTR_FNSPEC, ATTR_LIST_DOT_DOT_DOT_r_r_r, \
+   ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, ATTR_NORETURN, \
ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
diff --git gcc/omp-builtins.def gcc/omp-builtins.def
index 03955c4..cd273f2 100644
--- gcc/omp-builtins.def
+++ gcc/omp-builtins.def
@@ -39,6 +39,11 @@ DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DATA_END, "GOACC_data_end",
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_ENTER_EXIT_DATA, "GOACC_enter_exit_data",
   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_INT_INT_VAR,
   ATTR_NOTHROW_LIST)
+DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC_KERNELS_INTERNAL,
+ "GOACC_kernels_internal",
+ 
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR,
+ ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
+ ATTR_NOTHROW_LIST, "...rrr")
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_KERNELS, "GOACC_kernels",
   
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR,
   ATTR_NOTHROW_LIST)


Regards,
 Thomas




Re: [PATCH v3][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-21 Thread Maciej W. Rozycki
On Tue, 21 Apr 2015, Petar Jovanovic wrote:

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/call-from-init.c
> @@ -0,0 +1,10 @@
> +/* Check that __do_global_ctors_aux can be reached from .init section that
> +   is in a different (256MB) region. */
> +/* { dg-do run } */
> +/* { dg-options "-Wl,--section-start=.init=0x0FF0" } */
> +/* { dg-options "-Wl,--section-start=.text=0x1000" } */

 Hmm, the addresses should work for any virtual-memory targets, however if 
this is going to be a run-time test, then not for bare-iron ones, as they 
won't normally support mapped addresses.  And we may not be able to come 
up with better ones, because a typical bare-iron target will often not 
have enough memory to span a 256MB boundary.  I think this will best be 
reduced to a link-only test on bare iron, hoping for a link failure.
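
 For reference, the 256MB boundary follows from how the MIPS J/JAL 
instructions form their target: the low 28 bits come from the instruction 
and the top four bits from the address of the delay slot.  A minimal sketch 
of that check, assuming 32-bit addresses; same_jump_region is a made-up 
helper for illustration, not something taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a J/JAL at address PC can reach TARGET directly.
   Assumes 32-bit addresses: the jump keeps the top four bits of the
   delay-slot address (PC + 4) and replaces the low 28 bits.  */
bool
same_jump_region (uint32_t pc, uint32_t target)
{
  const uint32_t region_mask = UINT32_C (0xF0000000);
  return ((pc + 4) & region_mask) == (target & region_mask);
}
```

 A call site and a callee placed on opposite sides of such a boundary fail 
this check, which is the situation the test case tries to provoke.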

 Speaking of which, have you been able to make a linker test case out of 
this example for a bug report against binutils?  Not necessarily a proper 
LD test suite addition, I wouldn't be asking for *that*!  Just a small 
case will do, e.g. a pair of .s files generated out of this source and the 
generated crtbegin.s file, preferably with unrelated clutter removed, 
together with a recipe for how to assemble and link them to show the missing 
error message.  That will certainly help cover this issue once and for 
all!

 Thanks,

  Maciej


Re: [PATCH, fortran] Add gfc_define_builtin_with_spec

2015-04-21 Thread Thomas Schwinge
Hi!

On Fri, 9 Jan 2015 16:37:00 +0100, Tom de Vries  wrote:
> For the oacc kernels patch series I need a fortran builtin with fn spec 
> attribute (as mentioned here: 
> https://gcc.gnu.org/ml/gcc/2014-12/msg1.html ).
> 
> Attached patch adds a function gfc_define_builtin_with_spec that allows me to 
> define such a builtin.
> 
> At this point there's no user yet in trunk, so I've declared it unused.
> 
> Bootstrapped and reg-tested on x86_64.
> 
> OK for stage 3 trunk?

Committed to gomp-4_0-branch in r75:

commit ee0a44a49648f1addce78a6765bcbf6a14f237c2
Author: tschwinge 
Date:   Tue Apr 21 18:24:48 2015 +

Add gfc_define_builtin_with_spec

gcc/fortran/
* f95-lang.c (gfc_define_builtin_with_spec): New function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@75 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog.gomp |4 
 gcc/fortran/f95-lang.c |   21 +
 2 files changed, 25 insertions(+)

diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp
index 02a1aeb..8c23900 100644
--- gcc/fortran/ChangeLog.gomp
+++ gcc/fortran/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2015-04-21  Tom de Vries  
+
+   * f95-lang.c (gfc_define_builtin_with_spec): New function.
+
 2015-01-13  Thomas Schwinge  
 
* trans-openmp.c (gfc_omp_finish_clause, gfc_trans_omp_clauses):
diff --git gcc/fortran/f95-lang.c gcc/fortran/f95-lang.c
index de9c813..1a14860 100644
--- gcc/fortran/f95-lang.c
+++ gcc/fortran/f95-lang.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "wide-int.h"
 #include "inchash.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "flags.h"
 #include "langhooks.h"
 #include "langhooks-def.h"
@@ -600,6 +601,26 @@ gfc_define_builtin (const char *name, tree type, enum 
built_in_function code,
   set_builtin_decl (code, decl, true);
 }
 
+/* Like gfc_define_builtin, but with fn spec attribute FNSPEC.  */
+
+static void ATTRIBUTE_UNUSED
+gfc_define_builtin_with_spec (const char *name, tree fntype,
+ enum built_in_function code,
+ const char *library_name, int attr,
+ const char *fnspec)
+{
+  if (fnspec)
+{
+  /* Code copied from build_library_function_decl_1.  */
+  tree attr_args = build_tree_list (NULL_TREE,
+   build_string (strlen (fnspec), fnspec));
+  tree attrs = tree_cons (get_identifier ("fn spec"),
+ attr_args, TYPE_ATTRIBUTES (fntype));
+  fntype = build_type_attribute_variant (fntype, attrs);
+}
+
+  gfc_define_builtin (name, fntype, code, library_name, attr);
+}
 
 #define DO_DEFINE_MATH_BUILTIN(code, name, argtype, tbase) \
 gfc_define_builtin ("__builtin_" name "l", tbase##longdouble[argtype], \


Regards,
 Thomas




Re: Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Manuel López-Ibáñez

On 21/04/15 18:07, David Malcolm wrote:


I have the patch working now for the C++ frontend.  Am attaching the
work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
regrtested on x86_64-unknown-linux-gnu (Fedora 20), with:
   63 new "PASS" results in gcc.sum
   189 new "PASS" results in g++.sum
for the new test cases (relative to a control build of r48).



I still do not understand why you need so much complexity as I explained here: 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00830.html


The attached patch passes all your tests except Wmisleading-indentation-3.c,
which warns only once instead of two times (it doesn't seem a big loss to me),
and Wmisleading-indentation-7.c, which I did not bother to implement but which
is a straightforward application of the if-case to the else-case.


Perhaps I'm missing something that is not reflected in your tests?
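
To make the condition tested by the attached patch concrete, here is a
standalone sketch of it.  The struct loc type and looks_guarded are invented
for this illustration; the real code works on location_t through
LOCATION_FILE, LOCATION_LINE and LOCATION_COLUMN and compares file names by
pointer, which the sketch replaces with strcmp.

```c
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for an expanded source location.  */
struct loc
{
  const char *file;
  int line;
  int column;
};

/* True if the statement at NEXT looks as if it belonged to the guarded
   body starting at BODY: same file, and either on the same line as the
   body or on a later line at the same column.  For example, in

       if (flag)
         foo ();
         bar ();

   bar starts at the same column as foo, so it would be flagged.  */
bool
looks_guarded (struct loc body, struct loc next)
{
  if (strcmp (next.file, body.file) != 0)
    return false;
  if (next.line == body.line)
    return true;
  return next.line > body.line && next.column == body.column;
}
```

A statement that follows the body at a smaller column is not flagged, which
is why correctly outdented code stays quiet.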

BTW, the start-up cost of GCC is not negligible, thus grouping similar
testcases in a single file may pay off in the long term. Many small files also
tend to slow down VC tools. It also makes it harder to see what is tested and
what is missing.


Cheers,

Manuel.
Index: c/c-parser.c
===
--- c/c-parser.c	(revision 222087)
+++ c/c-parser.c	(working copy)
@@ -5174,20 +5174,34 @@ c_parser_c99_block_statement (c_parser *
   location_t loc = c_parser_peek_token (parser)->location;
   c_parser_statement (parser);
   return c_end_compound_stmt (loc, block, flag_isoc99);
 }
 
+static void
+warn_for_misleading_indentation (location_t guard_loc, location_t body_loc, location_t next_stmt_loc,
+			 const char *s)
+{
+  if (LOCATION_FILE (next_stmt_loc) == LOCATION_FILE (body_loc)
+  && (LOCATION_LINE (next_stmt_loc) == LOCATION_LINE (body_loc)
+	  || (LOCATION_LINE (next_stmt_loc) > LOCATION_LINE (body_loc)
+	  && LOCATION_COLUMN (next_stmt_loc) == LOCATION_COLUMN (body_loc
+if (warning_at (next_stmt_loc, OPT_Wmisleading_indentation,
+		"statement is indented as if it were guarded by..."))
+  inform (guard_loc,
+	  "...this %qs clause, but it is not", s);
+}
+
 /* Parse the body of an if statement.  This is just parsing a
statement but (a) it is a block in C99, (b) we track whether the
body is an if statement for the sake of -Wparentheses warnings, (c)
we handle an empty body specially for the sake of -Wempty-body
warnings, and (d) we call parser_compound_statement directly
because c_parser_statement_after_labels resets
parser->in_if_block.  */
 
 static tree
-c_parser_if_body (c_parser *parser, bool *if_p)
+c_parser_if_body (c_parser *parser, bool *if_p, location_t if_loc)
 {
   tree block = c_begin_compound_stmt (flag_isoc99);
   location_t body_loc = c_parser_peek_token (parser)->location;
   c_parser_all_labels (parser);
   *if_p = c_parser_next_token_is_keyword (parser, RID_IF);
@@ -5201,11 +5215,16 @@ c_parser_if_body (c_parser *parser, bool
 		"suggest braces around empty body in an %<if%> statement");
 }
   else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
 add_stmt (c_parser_compound_statement (parser));
   else
-c_parser_statement_after_labels (parser);
+{
+  c_parser_statement_after_labels (parser);
+  if (!c_parser_next_token_is_keyword (parser, RID_ELSE))
+	warn_for_misleading_indentation (if_loc, body_loc, c_parser_peek_token (parser)->location, "if");
+}
+
   return c_end_compound_stmt (body_loc, block, flag_isoc99);
 }
 
 /* Parse the else body of an if statement.  This is just parsing a
statement but (a) it is a block in C99, (b) we handle an empty body
@@ -5248,10 +5267,11 @@ c_parser_if_statement (c_parser *parser)
   tree first_body, second_body;
   bool in_if_block;
   tree if_stmt;
 
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_IF));
+  location_t if_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
   block = c_begin_compound_stmt (flag_isoc99);
   loc = c_parser_peek_token (parser)->location;
   cond = c_parser_paren_condition (parser);
   if (flag_cilkplus && contains_cilk_spawn_stmt (cond))
@@ -5259,11 +5279,11 @@ c_parser_if_statement (c_parser *parser)
   error_at (loc, "if statement cannot contain %<Cilk_spawn%>");
   cond = error_mark_node;
 }
   in_if_block = parser->in_if_block;
   parser->in_if_block = true;
-  first_body = c_parser_if_body (parser, &first_if);
+  first_body = c_parser_if_body (parser, &first_if, if_loc);
   parser->in_if_block = in_if_block;
   if (c_parser_next_token_is_keyword (parser, RID_ELSE))
 {
   c_parser_consume_token (parser);
   second_body = c_parser_else_body (parser);
@@ -5344,10 +5364,11 @@ static void
 c_parser_while_statement (c_parser *parser, bool ivdep)
 {
   tree block, cond, body, save_break, save_cont;
   location_t loc;
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_WHILE));
+  location_t while_loc = c_parser_peek_token (parser)->location;
   c_parser_co

RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-21 Thread Petar Jovanovic
-Original Message-
From: Moore, Catherine [mailto:catherine_mo...@mentor.com] 
Sent: Friday, April 17, 2015 8:36 PM
To: Petar Jovanovic
Cc: Maciej W. Rozycki; Matthew Fortune; gcc-patches@gcc.gnu.org
Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

> 
> Hi Petar,
> Running the executable is fine, but didn't you say that your test case takes 
> hours to execute?
> If so, that's not acceptable.  Is it possible to construct a test case that 
> will run more quickly?
> Thanks,
> Catherine

Hi Catherine,

I was just raising concerns about the original example (that did run
long). Anyway, I have just uploaded a different example that triggers
the issue.

Regards,
Petar



[patch, libgfortran] PR65234 Output descriptor (*(1E15.7)) not accepted

2015-04-21 Thread Jerry DeLisle

I have had this simple patch in my trunk for quite some time and it has tested 
OK.

I plan to commit with a test case based on the one in the PR today.

Regards,

Jerry

2015-04-21 Jerry DeLisle  

PR libgfortran/65234
* io/format.c (parse_format_list): Set the seen_dd flag in all
cases where a data descriptor has been seen.
Index: format.c
===
--- format.c	(revision 222194)
+++ format.c	(working copy)
@@ -624,6 +624,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
   get_fnode (fmt, &head, &tail, FMT_LPAREN);
   tail->repeat = -2;  /* Signifies unlimited format.  */
   tail->u.child = parse_format_list (dtp, &seen_data_desc);
+  *seen_dd = seen_data_desc;
   if (fmt->error != NULL)
 	goto finished;
   if (!seen_data_desc)
@@ -851,6 +852,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
   switch (t)
 {
 case FMT_L:
+  *seen_dd = true;
   t = format_lex (fmt);
   if (t != FMT_POSINT)
 	{
@@ -873,6 +875,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
   break;
 
 case FMT_A:
+  *seen_dd = true;
   t = format_lex (fmt);
   if (t == FMT_ZERO)
 	{
@@ -897,6 +900,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
 case FMT_G:
 case FMT_EN:
 case FMT_ES:
+  *seen_dd = true;
   get_fnode (fmt, &head, &tail, t);
   tail->repeat = repeat;
 
@@ -903,6 +907,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
   u = format_lex (fmt);
   if (t == FMT_G && u == FMT_ZERO)
 	{
+	  *seen_dd = true;
 	  if (notification_std (GFC_STD_F2008) == NOTIFICATION_ERROR
 	  || dtp->u.p.mode == READING)
 	{
@@ -928,6 +933,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
 	}
   if (t == FMT_F && dtp->u.p.mode == WRITING)
 	{
+	  *seen_dd = true;
 	  if (u != FMT_POSINT && u != FMT_ZERO)
 	{
 	  fmt->error = nonneg_required;
@@ -969,9 +975,11 @@ parse_format_list (st_parameter_dt *dtp, bool *see
   tail->u.real.e = -1;
 
   if (t2 == FMT_D || t2 == FMT_F)
-	break;
+	{
+	  *seen_dd = true;
+	  break;
+	}
 
-
   /* Look for optional exponent */
   t = format_lex (fmt);
   if (t != FMT_E)
@@ -1011,6 +1019,7 @@ parse_format_list (st_parameter_dt *dtp, bool *see
 case FMT_B:
 case FMT_O:
 case FMT_Z:
+  *seen_dd = true;
   get_fnode (fmt, &head, &tail, t);
   tail->repeat = repeat;
 


Re: [RFC] Dynamically aligning the stack

2015-04-21 Thread H.J. Lu
On Tue, Apr 21, 2015 at 9:52 AM, Steve Ellcey  wrote:
> On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:
>
>> We have done just that in GCC 4.4 to implement dynamic stack
>> alignment on x86 :-).  Some of x86 backend changes for dynamic
>> stack alignment are x86 psABI specific.  Others are historical,
>> like -mstackrealign, which was the old attempt for dynamic stack
>> alignment.
>
> I am a bit confused about the history of stack alignment on x86.  So I
> guess -mpreferred-stack-boundary=X came first and is now
> obsolete/deprecated. But I thought -mstackrealign=X was the current
> method of aligning the stack, but based on this comment and the patches
> you pointed me at I guess this is also obsolete (or at least deprecated)
> and that -mincoming-stack-boundary=X is the current option that should
> be used.  But I am not sure how this option works.

-mpreferred-stack-boundary=X and -mincoming-stack-boundary=X
set stack alignment.  -mstackrealign=X:

'-mstackrealign'
 Realign the stack at entry.  On the Intel x86, the '-mstackrealign'
 option generates an alternate prologue and epilogue that realigns
 the run-time stack if necessary.  This supports mixing legacy codes
 that keep 4-byte stack alignment with modern codes that keep
 16-byte stack alignment for SSE compatibility.  See also the
 attribute 'force_align_arg_pointer', applicable to individual
 functions.

assumes 4-byte incoming stack alignment in 32-bit.   It isn't needed
in most cases since GCC has been generating 16-byte outgoing
stack alignment for ages.

> Obviously it tells GCC what assumption to make about stack alignment at
> the start of a function but how do you tell GCC what alignment you want
> for the function?  Or does GCC figure that out for itself based on the
> instructions and data types it sees in the function?
>

Please do

# git grep "stack_alignment_needed = "

to see how middle-end and backend track stack alignment requirement.
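
On the question of whether GCC works out the needed alignment from the data types: a minimal sketch of how one could observe that from user code. It assumes a C11 compiler that honors over-aligned automatic variables (my understanding is that GCC on x86 does so by realigning the frame when the incoming alignment is smaller). The function name is hypothetical and nothing here is taken from the patches mentioned above.

```c
#include <stdint.h>

/* Returns 1 if a local that requests 32-byte alignment really ends up
   on a 32-byte boundary.  The request comes only from the declaration;
   no command-line option is involved.  */
int local_is_32_byte_aligned(void)
{
    _Alignas(32) unsigned char buf[64];
    buf[0] = 0;
    return ((uintptr_t)buf % 32) == 0;
}
```

Comparing the generated prologue with and without the _Alignas should show the extra stack realignment.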

-- 
H.J.


[PATCH v3][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-21 Thread Petar Jovanovic
New patch, v3. PTAL.

Regards,
Petar

gcc/ChangeLog:

2015-04-21  Petar Jovanovic  

* config/mips/mips.h (CRT_CALL_STATIC_FUNCTION): Fix the macro to use
la/jalr instead of jal.

gcc/testsuite/ChangeLog:

2015-04-21  Petar Jovanovic  

* gcc.target/mips/call-from-init.c: New test.
* gcc.target/mips/mips.exp: Add section_start to mips_option_groups.

commit 566564bd6d80fd6b5ebd6b8eccf09e3716246930
Author: Petar Jovanovic 
Date:   Thu Jan 22 02:15:22 2015 +0100

[mips] fix CRT_CALL_STATIC_FUNCTION macro

jal can not reach a target in different region, so replace it with
la/jalr variant.

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index ec69ed5..4bd83f5 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -3034,11 +3034,11 @@ while (0)
nop\n\
 1: .cpload $31\n\
.set reorder\n\
-   jal " USER_LABEL_PREFIX #FUNC "\n\
+   la $25, " USER_LABEL_PREFIX #FUNC "\n\
+   jalr $25\n\
.set pop\n\
" TEXT_SECTION_ASM_OP);
-#elif ((defined _ABIN32 && _MIPS_SIM == _ABIN32) \
-   || (defined _ABI64 && _MIPS_SIM == _ABI64))
+#elif (defined _ABIN32 && _MIPS_SIM == _ABIN32)
 #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC) \
asm (SECTION_OP "\n\
.set push\n\
@@ -3048,7 +3048,22 @@ while (0)
nop\n\
 1: .set reorder\n\
.cpsetup $31, $2, 1b\n\
-   jal " USER_LABEL_PREFIX #FUNC "\n\
+   la $25, " USER_LABEL_PREFIX #FUNC "\n\
+   jalr $25\n\
+   .set pop\n\
+   " TEXT_SECTION_ASM_OP);
+#elif (defined _ABI64 && _MIPS_SIM == _ABI64)
+#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC) \
+   asm (SECTION_OP "\n\
+   .set push\n\
+   .set nomips16\n\
+   .set noreorder\n\
+   bal 1f\n\
+   nop\n\
+1: .set reorder\n\
+   .cpsetup $31, $2, 1b\n\
+   dla $25, " USER_LABEL_PREFIX #FUNC "\n\
+   jalr $25\n\
.set pop\n\
" TEXT_SECTION_ASM_OP);
 #endif
diff --git a/gcc/testsuite/gcc.target/mips/call-from-init.c b/gcc/testsuite/gcc.target/mips/call-from-init.c
new file mode 100644
index 000..ee00a17
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/call-from-init.c
@@ -0,0 +1,10 @@
+/* Check that __do_global_ctors_aux can be reached from .init section that
+   is in a different (256MB) region. */
+/* { dg-do run } */
+/* { dg-options "-Wl,--section-start=.init=0x0FF0" } */
+/* { dg-options "-Wl,--section-start=.text=0x1000" } */
+
+int
+main (void) {
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index a0980a9..1dd4173 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -254,6 +254,7 @@ set mips_option_groups {
 madd "HAS_MADD"
 maddps "HAS_MADDPS"
 lsa "(|!)HAS_LSA"
+section_start "-Wl,--section-start=.*"
 }
 
 for { set option 0 } { $option < 32 } { incr option } {
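
Background on why jal fails here: jal encodes only a 26-bit word index, and the upper four address bits are taken from the instruction in the delay slot (pc + 4), so the target must lie in the same 256MB region. A minimal sketch of that reachability rule for 32-bit addresses follows. The function name and the addresses in the usage note are hypothetical and are not the ones from the test case.

```c
#include <stdint.h>

/* Returns nonzero if a jal located at PC can encode TARGET, i.e. the
   delay-slot address (PC + 4) and TARGET share their upper four bits.  */
int jal_can_reach(uint32_t pc, uint32_t target)
{
    return ((pc + 4u) & 0xF0000000u) == (target & 0xF0000000u);
}
```

With a call site at 0x0FFF0000, a target at 0x0FFF1000 is reachable while one at 0x10000000 is not. The la/jalr sequence has no such limit because the full address is loaded into $25 first.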



Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-21 Thread Kyrill Tkachov


On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

  From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.


So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should be something like the following?

static int
memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
{
#ifdef ARGS_GROW_DOWNWARD
  rtx tmp = plus_constant (Pmode, x, -size);
#else
  rtx tmp = plus_constant (Pmode, x, size);
#endif
  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);

  if (!CONST_INT_P (sub))
return -1;

#ifdef ARGS_GROW_DOWNWARD
  return INTVAL (-sub);
#else
  return INTVAL (sub);
#endif
}

now, say for x == sp + 4,  y == sp + 8, size == 16:
This would be a problematic case for arm, so this code on arm
(where ARGS_GROW_DOWNWARD is *not* defined) would return
12, which is the number of bytes that overlap.

On a target where ARGS_GROW_DOWNWARD is defined this would return
-20, meaning that no overlap occurs (because we read in the descending
direction from x, IIUC).
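
The arithmetic in that example can be checked with plain integers. A minimal sketch, assuming x and y are modelled as byte offsets from sp; both function names are hypothetical. Note that -20 is the difference before the final negation in the ARGS_GROW_DOWNWARD branch above; if the negation is kept, the function would hand back 20 instead.

```c
/* Upward case: (x + size) - y, as in the posted memory_load_overlap.  */
long overlap_args_grow_up(long x_off, long y_off, long size)
{
    return (x_off + size) - y_off;
}

/* Downward case: (x - size) - y, before any negation of the result.  */
long raw_diff_args_grow_down(long x_off, long y_off, long size)
{
    return (x_off - size) - y_off;
}
```

For x == sp + 4, y == sp + 8, size == 16 the first gives 12 and the second gives -20.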


Thanks,
Kyrill




Jeff





Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Andrew Hughes
- Original Message -
> On Tue, Apr 21, 2015 at 01:04:04PM -0400, Andrew Hughes wrote:
> > - Original Message -
> > > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > > > bump the libgcj soname on the trunk, as done for every release cycle,
> > > 
> > > Is that really needed though these days?
> > > Weren't there basically zero changes to libjava (both libjava and
> > > libjava/classpath) in the last 2 or more years?
> > > The few ones were mostly updating Copyright notices, minor configure
> > > changes, but I really haven't seen anything ABI changing for quite a
> > > while.
> > > 
> > 
> > On the Classpath side, there's a bunch of stuff to merge in that would
> > change the ABI. It's a matter of finding a good point at which to do it
> > and time to do so. I keep missing the right point in the gcc lifecycle.
> 
> Now might be a good time (any time next 6.5 months or so), and if that is
> done, surely I have no issue with bumping the soname.
> 

Ok, that should be possible.

>   Jakub
> 

-- 
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222

PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07



Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 01:04:04PM -0400, Andrew Hughes wrote:
> - Original Message -
> > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > > bump the libgcj soname on the trunk, as done for every release cycle,
> > 
> > Is that really needed though these days?
> > Weren't there basically zero changes to libjava (both libjava and
> > libjava/classpath) in the last 2 or more years?
> > The few ones were mostly updating Copyright notices, minor configure
> > changes, but I really haven't seen anything ABI changing for quite a while.
> > 
> 
> On the Classpath side, there's a bunch of stuff to merge in that would
> change the ABI. It's a matter of finding a good point at which to do it
> and time to do so. I keep missing the right point in the gcc lifecycle.

Now might be a good time (any time next 6.5 months or so), and if that is
done, surely I have no issue with bumping the soname.

Jakub


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Mike Stump
On Apr 21, 2015, at 9:07 AM, David Malcolm  wrote:
> I think I want to make a distinction between
> 
> (A) classic C "gotchas", like the one in my mail and the:
> 
>  if (cond);
>stmt;
> 
> one you mentioned above
> 
> vs
> 
> (B) wrong/inconsistent indentation.
> 
> I think (A) is high-value, since it detects subtly wrong code, likely to
> have misled the reader, whereas I don't find (B) as interesting.

Ok.  I don’t have any problem with that.  Going for the high value only makes 
the problem space smaller, more likely to implement and do a good job and 
avoids false positives and all sorts of what ifs that the other class would 
expose you to.

I like your work and your plan.

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Andrew Hughes
- Original Message -
> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > bump the libgcj soname on the trunk, as done for every release cycle,
> 
> Is that really needed though these days?
> Weren't there basically zero changes to libjava (both libjava and
> libjava/classpath) in the last 2 or more years?
> The few ones were mostly updating Copyright notices, minor configure
> changes, but I really haven't seen anything ABI changing for quite a while.
> 

On the Classpath side, there's a bunch of stuff to merge in that would
change the ABI. It's a matter of finding a good point at which to do it
and time to do so. I keep missing the right point in the gcc lifecycle.

>   Jakub
> 

-- 
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222

PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07



Re: [RFC] Dynamically aligning the stack

2015-04-21 Thread Steve Ellcey
On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:

> We have done just that in GCC 4.4 to implement dynamic stack
> alignment on x86 :-).  Some of x86 backend changes for dynamic
> stack alignment are x86 psABI specific.  Others are historical,
> like -mstackrealign, which was the old attempt for dynamic stack
> alignment.

I am a bit confused about the history of stack alignment on x86.  So I
guess -mpreferred-stack-boundary=X came first and is now
obsolete/deprecated. But I thought -mstackrealign=X was the current
method of aligning the stack, but based on this comment and the patches
you pointed me at I guess this is also obsolete (or at least deprecated)
and that -mincoming-stack-boundary=X is the current option that should
be used.  But I am not sure how this option works.

Obviously it tells GCC what assumption to make about stack alignment at
the start of a function but how do you tell GCC what alignment you want
for the function?  Or does GCC figure that out for itself based on the
instructions and data types it sees in the function?

Steve Ellcey
sell...@imgtec.com




Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Manuel López-Ibáñez

On 21/04/15 18:07, David Malcolm wrote:

On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:

Does it also handle:

if (cond);
   stmt;

?  Would be good to add that to the test suite, as that is another hard to spot 
common error that should be caught.


Not yet, but I agree that it would be a good thing to issue a warning
for.


GCC already warns for the above:

test.c:3:9: warning: suggest braces around empty body in an ‘if’ statement 
[-Wempty-body]

   if (a);
 ^

Cheers,

Manuel.
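
For anyone who has not been bitten by this one, a minimal sketch of what the stray semicolon does at run time. The function name is hypothetical. Depending on compiler version and flags this should draw -Wempty-body and possibly other warnings, which is the point.

```c
/* Returns how many times the apparently guarded increment ran.  */
int count_increments(int cond)
{
    int n = 0;
    if (cond);   /* stray semicolon: the if has an empty body */
        n++;     /* always runs, whatever the indentation suggests */
    return n;
}
```

The result is 1 for any value of cond.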


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 12:07:00PM -0400, David Malcolm wrote:
> On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:
> > On Apr 16, 2015, at 8:01 AM, David Malcolm  wrote:
> > > Attached is a work-in-progress patch for a new
> > >  -Wmisleading-indentation
> > > warning I've been experimenting with, for GCC 6.
> > 
> > Seems like a nice idea in general.
> > 
> > Does it also handle:
> > 
> > > if (cond);
> >   stmt;
> > 
> > ?  Would be good to add that to the test suite, as that is another hard to 
> > spot common error that should be caught.
> 
> Not yet, but I agree that it would be a good thing to issue a warning
> for.
> 
> > I do think that it is reasonable to warn for things like:
> > 
> >   stmt;
> > stmt;
> > 
> > one of those two lines is likely misindented, though, maybe you want to 
> > start with the high payback things first.
> 
> > > An issue here is how to determine (i), or if it's OK to default to 8
> > 
> > Yes, 8 is the proper value to default it to.
> > 
> > > and have a command-line option (param?) to override it? (though what 
> > > about,
> > > say, each header file?)
> > 
> > I’ll abstain from this.  The purist in me says no option for other
> > than 8, life goes on.  20 years ago, someone was confused over hard v
> > soft tabbing and what exactly the editor key TAB does.  That confusion
> > is over, the 8 people have won.  Catering to other than 8 gives the
> > impression that the people that lost still have a chance at
> > winning.  :-)
> > 
> > > Thoughts on this, and on the patch?
> > 
> > Would be nice to have a stricter version that warns about all wildly 
> > inconsistently or wrongly indented lines.
> > 
> > {
> >   stmt;
> > stmt;  // must be same as above
> > }
> > 
> > {
> > stmt; // must be indented at least 1
> > }
> > 
> > if (cond)
> > stmt;  // must be indented at least 1
> 
> I think I want to make a distinction between
> 
> (A) classic C "gotchas", like the one in my mail and the:
> 
>   if (cond);
> stmt;
> 
> one you mentioned above
> 
> vs
> 
> (B) wrong/inconsistent indentation.
> 
> I think (A) is high-value, since it detects subtly wrong code, likely to
> have misled the reader, whereas I don't find (B) as interesting.   I
> think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the
> (B) cases tends to give me a "this code is ugly; beware, danger Will
> Robinson!" reaction, whereas (A) is less ugly and thus more dangerous.

So, while I was working on ifdef stuff in gcc I found the following
pattern

#ifdef FOO
if (FOO)
#endif
  bar ();

  which you may want to handle somehow.  In that sort of case one side
  of the ifdef will necessarily have the (B) type of misindentation.

  Trev
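
A compilable version of that pattern, as a minimal sketch. FOO is deliberately left undefined, and the names bar and run_once are only for illustration.

```c
static int calls;

static void bar(void)
{
    calls++;
}

/* With FOO undefined the "if" line disappears in preprocessing, so the
   call is unconditional even though it is still indented like a guarded
   statement.  With FOO defined the same text is a proper if body.  */
int run_once(void)
{
    calls = 0;
#ifdef FOO
    if (FOO)
#endif
        bar();
    return calls;
}
```

Whichever indentation is picked for the call, it is off for one of the two configurations.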

> 
> (if that makes sense; this may just be my own visceral reaction to the
> erroneous code).
> 
> Or to put it another way, I hope to make (A) good enough to go into
> -Wall, whereas I think (B) would meet more resistance. 
> Also, I think autogenerated code is more likely to run into (B) than
> (A).
> 
> I have the patch working now for the C++ frontend.  Am attaching the
> work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
> regrtested on x86_64-unknown-linux-gnu (Fedora 20), with:
>   63 new "PASS" results in gcc.sum
>   189 new "PASS" results in g++.sum
> for the new test cases (relative to a control build of r48).
> 
> I also moved the visual-parser.c/h to c-family, to make use of the
> -ftabstop option Tom mentioned in another mail.
> 
> I also made it identify the kind of clause, so error messages say things
> like:
> 
> ./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if
> it were guarded by... [-Wmisleading-indentation]
> ./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is
> not
> 
> which makes it easier to read, especially when dealing with nesting.
> 
> This hasn't yet had any performance/leak fixes so it isn't ready as is.
> I plan to look at making it warn about the:
> 
>   if (cond);
> stmt;
> 
> gotcha next, before trying to optimize it.
> 
> (and no ChangeLog yet)
> 
> Dave

> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 80c91f0..8154469 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1143,7 +1143,8 @@ C_COMMON_OBJS = c-family/c-common.o 
> c-family/c-cppbuiltin.o c-family/c-dump.o \
>c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
>c-family/c-semantics.o c-family/c-ada-spec.o \
>c-family/c-cilkplus.o \
> -  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o
> +  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o \
> +  c-family/visual-parser.o
>  
>  # Language-independent object files.
>  # We put the insn-*.o files first so that a parallel make will build
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 983f4a8..88f1f94 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -554,6 +554,10 @@ Wmemset-transposed-args
>  C ObjC C++ ObjC++ Var(warn_memset_transp

Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-21 Thread David Malcolm
On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:
> On Apr 16, 2015, at 8:01 AM, David Malcolm  wrote:
> > Attached is a work-in-progress patch for a new
> >  -Wmisleading-indentation
> > warning I've been experimenting with, for GCC 6.
> 
> Seems like a nice idea in general.
> 
> Does it also handle:
> 
> if (cond);
>   stmt;
> 
> ?  Would be good to add that to the test suite, as that is another hard to 
> spot common error that should be caught.

Not yet, but I agree that it would be a good thing to issue a warning
for.

> I do think that it is reasonable to warn for things like:
> 
>   stmt;
> stmt;
> 
> one of those two lines is likely misindented, though, maybe you want to start 
> with the high payback things first.

> > An issue here is how to determine (i), or if it's OK to default to 8
> 
> Yes, 8 is the proper value to default it to.
> 
> > and have a command-line option (param?) to override it? (though what about,
> > say, each header file?)
> 
> I’ll abstain from this.  The purist in me says no option for other
> than 8, life goes on.  20 years ago, someone was confused over hard v
> soft tabbing and what exactly the editor key TAB does.  That confusion
> is over, the 8 people have won.  Catering to other than 8 gives the
> impression that the people that lost still have a chance at
> winning.  :-)
> 
> > Thoughts on this, and on the patch?
> 
> Would be nice to have a stricter version that warns about all wildly 
> inconsistently or wrongly indented lines.
> 
> {
>   stmt;
> stmt;  // must be same as above
> }
> 
> {
> stmt; // must be indented at least 1
> }
> 
> if (cond)
> stmt;  // must be indented at least 1

I think I want to make a distinction between

(A) classic C "gotchas", like the one in my mail and the:

  if (cond);
stmt;

one you mentioned above

vs

(B) wrong/inconsistent indentation.

I think (A) is high-value, since it detects subtly wrong code, likely to
have misled the reader, whereas I don't find (B) as interesting.   I
think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the
(B) cases tends to give me a "this code is ugly; beware, danger Will
Robinson!" reaction, whereas (A) is less ugly and thus more dangerous.

(if that makes sense; this may just be my own visceral reaction to the
erroneous code).

Or to put it another way, I hope to make (A) good enough to go into
-Wall, whereas I think (B) would meet more resistance. 
Also, I think autogenerated code is more likely to run into (B) than
(A).

I have the patch working now for the C++ frontend.  Am attaching the
work-in-progress (sans ChangeLog).  This one (v2) bootstrapped and
regrtested on x86_64-unknown-linux-gnu (Fedora 20), with:
  63 new "PASS" results in gcc.sum
  189 new "PASS" results in g++.sum
for the new test cases (relative to a control build of r48).

I also moved the visual-parser.c/h to c-family, to make use of the
-ftabstop option Tom mentioned in another mail.

I also made it identify the kind of clause, so error messages say things
like:

./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if
it were guarded by... [-Wmisleading-indentation]
./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is
not

which makes it easier to read, especially when dealing with nesting.

This hasn't yet had any performance/leak fixes so it isn't ready as is.
I plan to look at making it warn about the:

  if (cond);
stmt;

gotcha next, before trying to optimize it.

(and no ChangeLog yet)

Dave
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..8154469 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1143,7 +1143,8 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \
   c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
   c-family/c-semantics.o c-family/c-ada-spec.o \
   c-family/c-cilkplus.o \
-  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o
+  c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o \
+  c-family/visual-parser.o
 
 # Language-independent object files.
 # We put the insn-*.o files first so that a parallel make will build
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 983f4a8..88f1f94 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -554,6 +554,10 @@ Wmemset-transposed-args
 C ObjC C++ ObjC++ Var(warn_memset_transposed_args) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not
 
+Wmisleading-indentation
+C C++ Common Var(warn_misleading_indentation) Warning
+Warn when the indentation of the code does not reflect the block structure
+
 Wmissing-braces
 C ObjC C++ ObjC++ Var(warn_missing_braces) Warning LangEnabledBy(C ObjC,Wall)
 Warn about possibly missing braces around initializers
diff --git a/gcc/c-family/visual-parser.c b/gcc/c-family/visual-parser.c
new file mode 100644
index 000..b1fcb8b
--- /dev/null

[PATCH][AARCH64]Use mov for add with large immediate.

2015-04-21 Thread Renlin Li

Hi all,

This is a simple patch to generate a move instruction to temporarily 
hold the large immediate for an add instruction.


GCC regression tests have been run using the aarch64-none-elf toolchain. No 
new issues.


Okay for trunk?

Regards,
Renlin Li

gcc/ChangeLog:

2015-04-21  Renlin Li  

* config/aarch64/aarch64.md (add<mode>3): Use mov when allowed.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1f4169e..9ea1939 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1414,18 +1414,28 @@
   "
   if (! aarch64_plus_operand (operands[2], VOIDmode))
 {
-  rtx subtarget = ((optimize && can_create_pseudo_p ())
-		   ? gen_reg_rtx (mode) : operands[0]);
   HOST_WIDE_INT imm = INTVAL (operands[2]);
-
-  if (imm < 0)
-	imm = -(-imm & ~0xfff);
+  if (aarch64_move_imm (imm, mode)
+	  && can_create_pseudo_p ())
+  {
+	rtx tmp = gen_reg_rtx (mode);
+	emit_move_insn (tmp, operands[2]);
+	operands[2] = tmp;
+  }
   else
-imm &= ~0xfff;
+  {
+	rtx subtarget = ((optimize && can_create_pseudo_p ())
+			 ? gen_reg_rtx (mode) : operands[0]);
+
+	if (imm < 0)
+	  imm = -(-imm & ~0xfff);
+	else
+	  imm &= ~0xfff;
 
-  emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
-  operands[1] = subtarget;
-  operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+	emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
+	operands[1] = subtarget;
+	operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+  }
 }
   "
 )
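
For reference, the fallback branch kept by the patch splits the immediate into a part with the low 12 bits cleared and a remainder. A minimal sketch of just that arithmetic, using int64_t in place of HOST_WIDE_INT and a hypothetical function name. It does not check whether either part is encodable, and it assumes imm is not the most negative value, since negating that would overflow.

```c
#include <stdint.h>

/* Split IMM the way the expander does: HIGH has its low 12 bits clear
   (computed on the magnitude for negative values), LOW is the rest.  */
void split_add_imm(int64_t imm, int64_t *high, int64_t *low)
{
    if (imm < 0)
        *high = -(-imm & ~(int64_t)0xfff);
    else
        *high = imm & ~(int64_t)0xfff;
    *low = imm - *high;
}
```

For imm == 0x12345 this gives 0x12000 and 0x345, and for -0x12345 it gives -0x12000 and -0x345, so the two adds together reproduce the original immediate.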


Re: [PATCH][doc] Improve pipeline description docs a bit

2015-04-21 Thread Sandra Loosemore

On 04/20/2015 04:31 AM, Kyrill Tkachov wrote:

Hi all,

This patch attempts to improve the pipeline description documentation.
It fixes some grammar errors and typos and clarifies some concepts.

The sections on the syntactic constructs are formatted to have a
short description, an example, a description of the syntax elements and
some elaboration.

Is this ok for trunk?

Thanks,
Kyrill

2014-04-20  Kyrylo Tkachov  

* doc/md.texi (Specifying processor pipeline description):
Improve wording.
Clarify some constructs.


H.  I guess overall this is an improvement, but I still see quite a 
few things that need tweaking (and I wasn't even looking very hard).



+latency time}.  Instructions may not complete execution until all inputs
+to the instruction have been evaluated and are available for use.
+Taking data dependence delays into account is simple.


I don't think the above sentence adds anything and could be deleted.


+The data dependence (true, output, and anti-dependence) delay between two
+instructions is modelled as being constant.  In most cases this approach is
+adequate.  The second kind of interlock delays is a reservation delay.
+The reservation delay means that two or more executing instructions will require


s/will require/require/


+
+The define_automaton construct declares the names of automata.
+It takes the following form:

 @smallexample
 (define_automaton @var{automata-names})
 @end smallexample

 @var{automata-names} is a string giving names of the automata.  The
-names are separated by commas.  All the automata should have unique names.
-The automaton name is used in the constructions @code{define_cpu_unit} and
-@code{define_query_cpu_unit}.
+names are separated by commas.  All the automata must have unique names.
+The automaton name is used to bind @code{define_cpu_unit} and
+@code{define_query_cpu_unit} constructs to specific automata.
+
+This construct declares the names of automata.


You already said that a few sentences above; delete this one.


+The define_query_cpu_unit construct can be used to define units


Add @code{} markup here.


-@var{default_latency} is a number giving latency time of the
+@var{default_latency} is a number giving the latency of the
 instruction.  There is an important difference between the old
 description and the automaton based pipeline description.  The latency
-time is used for all dependencies when we use the old description.  In
-the automaton based pipeline description, the given latency time is only
-used for true dependencies.  The cost of anti-dependencies is always
-zero and the cost of output dependencies is the difference between
-latency times of the producing and consuming insns (if the difference
-is negative, the cost is considered to be zero).  You can always
-change the default costs for any description by using the target hook
+is used for all types of dependencies when we used the old description.  In
+the automaton based pipeline description, the  latency is only taken into
+account when analysing true dependencies (i.e. not output or
+anti-dependencies).  The cost of anti-dependencies is always zero and the
+cost of output dependencies is the difference between the latencies
+of the producing and consuming insns (if the difference is negative, the
+cost is considered to be zero).  You can always change the default cost
+between any pair of insns by using the target hook
 @code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}).


Here I am confused.  What is the "old description"?  If this is a 
leftover of some obsolete way of doing things, the references to it 
should be deleted.
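
To make the cost rules in the quoted paragraph concrete, here is a minimal sketch in C. It is not GCC code: the enum, the function name and the parameters are all hypothetical, and it assumes the "difference" for output dependencies means producer latency minus consumer latency, as the quoted text suggests. The real default can still be overridden through TARGET_SCHED_ADJUST_COST.

```c
#include <assert.h>

/* Hypothetical names for the three dependence kinds discussed above.  */
enum dep_kind { DEP_TRUE, DEP_OUTPUT, DEP_ANTI };

/* Sketch of the default cost rules described for the automaton based
   pipeline description.  */
static int
default_dep_cost (enum dep_kind kind, int producer_latency,
                  int consumer_latency)
{
  int diff;

  assert (producer_latency >= 0 && consumer_latency >= 0);

  switch (kind)
    {
    case DEP_TRUE:
      /* Only true dependencies use the given latency directly.  */
      return producer_latency;

    case DEP_OUTPUT:
      /* Difference of the latencies, clamped at zero.  */
      diff = producer_latency - consumer_latency;
      return diff > 0 ? diff : 0;

    case DEP_ANTI:
    default:
      /* Anti-dependencies always cost zero.  */
      return 0;
    }
}
```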



+construct.  You must avoid having more than one
+@code{define_insn_reservation} matching any one RTL insn, as the behaviour is


s/behaviour/behavior/


+The following construct is used to describe a bypass i.e. an exception
+in the execution latency between a pair of instructions:


@dfn{bypass} ??


 @var{guard} is an optional string giving the name of a C function which
-defines an additional guard for the bypass.  The function will get the
+defines an additional guard for the bypass.  The function will take the
 two insns as parameters.  If the function returns zero the bypass will
 be ignored for this case.  The additional guard is necessary to


s/will take/takes/
s/will be ignored/is ignored/


+If there is more one bypass with the same output and input insns, the
+chosen bypass is the first bypass with a guard function in its definition that
+returns nonzero.  If there is no such bypass, then a bypass without a guard
+function is chosen.  These constructs can be used to describe, for example,
+forwarding paths in a processor pipeline.


I don't understand what the last sentence has to do with the rest of 
this paragraph.  If this is part of the general discussion of what 
define_bypass does, it should be moved up to the paragraph where the 
concept of a bypass is introduced.
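
The selection rule in the quoted paragraph can be sketched as follows. This is an illustration only, under the assumption that candidate bypasses for one producer/consumer pair are kept in definition order; struct bypass_desc, choose_bypass and the two example guards are hypothetical names, not taken from genautomata or the scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guard signature: nonzero means the bypass applies.  */
typedef int (*bypass_guard_fn) (const void *producer, const void *consumer);

struct bypass_desc
{
  int latency;            /* Latency to use when this bypass is chosen.  */
  bypass_guard_fn guard;  /* NULL when the bypass has no guard.  */
};

/* Return the first bypass whose guard returns nonzero; failing that,
   the first bypass without a guard; failing that, NULL.  */
static const struct bypass_desc *
choose_bypass (const struct bypass_desc *bypasses, size_t n,
               const void *producer, const void *consumer)
{
  const struct bypass_desc *unguarded = NULL;
  size_t i;

  assert (bypasses != NULL || n == 0);

  for (i = 0; i < n; i++)
    {
      if (bypasses[i].guard != NULL)
        {
          if (bypasses[i].guard (producer, consumer) != 0)
            return &bypasses[i];
        }
      else if (unguarded == NULL)
        unguarded = &bypasses[i];
    }

  return unguarded;
}

/* Example guards, for demonstration only.  */
static int
guard_never (const void *producer, const void *consumer)
{
  (void) producer;
  (void) consumer;
  return 0;
}

static int
guard_always (const void *producer, const void *consumer)
{
  (void) producer;
  (void) consumer;
  return 1;
}
```

Note that a guarded bypass that accepts the pair wins even when an unguarded bypass appears earlier in the list, which is how the sketch reads the quoted wording.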



-@var{unit-names} is a string giving names o

[WIP] OpenMP 4 NVPTX support

2015-04-21 Thread Jakub Jelinek
Hi!

Attached is a minimal patch to get at least a trivial OpenMP 4.0 testcase
offloading to NVPTX (the first patch).  The second patch is WIP, just the first
few changes needed to make libgomp build for NVPTX (several weeks of work
at least).

The following seems to work and the output suggests that it was offloaded to
a non-SHM arch:

int
main ()
{
  int v = 0;
  int *w = 0;
  int x = 0;
#pragma omp target
  {
    v = 6;
    w = &v;
    x = 1; // omp_is_initial_device ();
  }
  __builtin_printf ("%d %p %p %d\n", v, &v, w, x);
  return 0;
}

but an already slightly more complicated testcase:

extern void *malloc (__SIZE_TYPE__);
extern void free (void *);

int
main ()
{
  int v = 0;
  int *w = 0;
  int x = 0;
#pragma omp target
  {
    v = 6;
    w = &v;
    char *p = malloc (64);
    x = 1; // omp_is_initial_device ();
    free (p);
  }
  __builtin_printf ("%d %p %p %d\n", v, &v, w, x);
  return 0;
}

suggests that, while it is nice that when building the nvptx accel compiler
we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future hopefully
libgomp.a), nothing attempts to link those in :(.

Is the plan to link those in at mkoffload time (haven't seen any attempt
of mkoffload to invoke the nvptx-none-ld linker though), or link those in
somehow at link_ptx time in the plugin?
In either case, it isn't clear to me how things will work (if at all) in the
case where multiple shared libraries (or an executable and at least one shared
library) have their own offloading bits, and you try to e.g. call an
offloaded function defined in the shared library from an offloaded kernel in
the executable.  If any library needs some global singleton and it is linked
multiple times, I have no idea what the PTX JIT will do.

Once that is resolved, another thing will be to figure out how to
efficiently implement the TLS libgomp needs for its ICVs and other state
- right now it uses either __thread or pthread_getspecific, neither of
which is usable, of course.  I've been thinking about an array of those
structures in .shared memory indexed by %tid.x, but I guess that runs into
the issue that the array would need to be declared fixed size and there is a
very small size limitation on .shared memory size.
So perhaps a file scope .shared pointer to global memory, where whoever
launches an OpenMP 4.0 kernel (either the libgomp-plugin-nvptx.so.1 doing
GOMP_run, or later on dynamic parallelism from GOMP_target in the nvptx
libgomp.a) allocates the memory and some wrapper sets the .shared variable
to that allocated memory, then calls the kernel?

Jakub
--- libgomp/plugin/plugin-nvptx.c.jj 2015-04-21 08:38:00.0 +0200
+++ libgomp/plugin/plugin-nvptx.c   2015-04-21 16:55:25.247470080 +0200
@@ -978,8 +978,8 @@ event_add (enum ptx_event_type type, CUe
 
 void
 nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
- size_t *sizes, unsigned short *kinds, int num_gangs, int num_workers,
- int vector_length, int async, void *targ_mem_desc)
+   size_t *sizes, unsigned short *kinds, int num_gangs,
+   int num_workers, int vector_length, int async, void *targ_mem_desc)
 {
   struct targ_fn_descriptor *targ_fn = (struct targ_fn_descriptor *) fn;
   CUfunction function;
@@ -1137,7 +1137,6 @@ nvptx_host2dev (void *d, const void *h,
   CUresult r;
   CUdeviceptr pb;
   size_t ps;
-  struct nvptx_thread *nvthd = nvptx_thread ();
 
   if (!s)
 return 0;
@@ -1162,7 +1161,8 @@ nvptx_host2dev (void *d, const void *h,
 GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  struct nvptx_thread *nvthd = nvptx_thread ();
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
 {
   CUevent *e;
 
@@ -1202,7 +1202,6 @@ nvptx_dev2host (void *h, const void *d,
   CUresult r;
   CUdeviceptr pb;
   size_t ps;
-  struct nvptx_thread *nvthd = nvptx_thread ();
 
   if (!s)
 return 0;
@@ -1227,7 +1226,8 @@ nvptx_dev2host (void *h, const void *d,
 GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  struct nvptx_thread *nvthd = nvptx_thread ();
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
 {
   CUevent *e;
 
@@ -1559,7 +1559,8 @@ GOMP_OFFLOAD_get_name (void)
 unsigned int
 GOMP_OFFLOAD_get_caps (void)
 {
-  return GOMP_OFFLOAD_CAP_OPENACC_200;
+  return GOMP_OFFLOAD_CAP_OPENACC_200
+| GOMP_OFFLOAD_CAP_OPENMP_400;
 }
 
 int
@@ -1759,7 +1760,7 @@ GOMP_OFFLOAD_openacc_parallel (void (*fn
   void *targ_mem_desc)
 {
   nvptx_exec (fn, mapnum, hostaddrs, devaddrs, sizes, kinds, num_gangs,
-   num_workers, vector_length, async, targ_mem_desc);
+ num_workers, vector_length, async, targ_mem_desc);
 }
 
 void
@@ -1889,3 +1890,27 @@ GOMP_OFFLOAD_openacc_set_cuda_stream (in
 {
   return nvptx_set_cuda_stream (async, stream);
 }
+
+void

Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:57:19AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >Hi,
> >
> >This is a first round of patches to reduce the amount of code within #if /
> >#ifdef.  This makes it incrementally easier to not break configs other than
> >the one being built, and moves things slightly closer to using target hooks
> >for everything.
> >
> >Each commit was bootstrapped and regtested on x86_64-linux-gnu without
> >regression, and the whole patch set was run through config-list.mk without
> >issue.  OK?
> So I think after looking at this patchset, any changes of a similar nature
> you want to make should be considered pre-approved.  Just post them for
> archival purposes, but no need for you to wait for review as long as they
> have the same purpose and overall structure as was seen in these patches.

thanks!  It's also always nice to have someone double check your logic
:-)

Trev

> 
> jeff
> 


Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 04:14:01PM +0200, Richard Biener wrote:
> On Tue, Apr 21, 2015 at 3:24 PM,   wrote:
> > From: Trevor Saunders 
> >
> > gcc/ChangeLog:
> >
> > 2015-04-21  Trevor Saunders  
> >
> > * conditions.h: Define macros even if HAVE_cc0 is undefined.
> > * emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
> > * final.c: Likewise.
> > * jump.c: Likewise.
> > * recog.c: Likewise.
> > * recog.h: Declare functions even when HAVE_cc0 is undefined.
> > * sched-deps.c (sched_analyze_2): Always compile case for cc0.
> > ---
> >  gcc/conditions.h | 6 --
> >  gcc/emit-rtl.c   | 2 --
> >  gcc/final.c  | 2 --
> >  gcc/jump.c   | 3 ---
> >  gcc/recog.c  | 2 --
> >  gcc/recog.h  | 2 --
> >  gcc/sched-deps.c | 5 +++--
> >  7 files changed, 3 insertions(+), 19 deletions(-)
> >
> > diff --git a/gcc/conditions.h b/gcc/conditions.h
> > index 2308bfc..7cd1e1c 100644
> > --- a/gcc/conditions.h
> > +++ b/gcc/conditions.h
> > @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
> >  #ifndef GCC_CONDITIONS_H
> >  #define GCC_CONDITIONS_H
> >
> > -/* None of the things in the files exist if we don't use CC0.  */
> > -
> > -#ifdef HAVE_cc0
> > -
> >  /* The variable cc_status says how to interpret the condition code.
> > It is set by output routines for an instruction that sets the cc's
> > and examined by output routines for jump instructions.
> > @@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
> >   (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
> >CC_STATUS_MDEP_INIT)
> >
> > -#endif
> > -
> >  #endif /* GCC_CONDITIONS_H */
> > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> > index 483eacb..c1974bb 100644
> > --- a/gcc/emit-rtl.c
> > +++ b/gcc/emit-rtl.c
> > @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
> >return insn;
> >  }
> >
> > -#ifdef HAVE_cc0
> >  /* Return the next insn that uses CC0 after INSN, which is assumed to
> > set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
> > applied to the result of this function should yield INSN).
> > @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
> >
> >return insn;
> >  }
> > -#endif
> >
> >  #ifdef AUTO_INC_DEC
> >  /* Find a RTX_AUTOINC class rtx which matches DATA.  */
> > diff --git a/gcc/final.c b/gcc/final.c
> > index 1fa93d9..41f6bd9 100644
> > --- a/gcc/final.c
> > +++ b/gcc/final.c
> > @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
> >
> >  static int insn_counter = 0;
> >
> > -#ifdef HAVE_cc0
> >  /* This variable contains machine-dependent flags (defined in tm.h)
> > set and examined by output routines
> > that describe how to interpret the condition codes properly.  */
> > @@ -202,7 +201,6 @@ CC_STATUS cc_status;
> > from before the insn.  */
> >
> >  CC_STATUS cc_prev_status;
> > -#endif
> >
> >  /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
> >
> > diff --git a/gcc/jump.c b/gcc/jump.c
> > index 34b3b7b..bc91550 100644
> > --- a/gcc/jump.c
> > +++ b/gcc/jump.c
> > @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
> >   && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL 
> > (insn)));
> >  }
> >
> > -#ifdef HAVE_cc0
> > -
> >  /* Return nonzero if X is an RTX that only sets the condition codes
> > and has no side effects.  */
> >
> > @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
> >  }
> >return 0;
> >  }
> > -#endif
> >
> >  /* Find all CODE_LABELs referred to in X, and increment their use
> > counts.  If INSN is a JUMP_INSN and there is at least one
> > diff --git a/gcc/recog.c b/gcc/recog.c
> > index a9d3b1f..c3ad86f 100644
> > --- a/gcc/recog.c
> > +++ b/gcc/recog.c
> > @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
> >return ((num_changes_pending () > 0) && (apply_change_group () > 0));
> >  }
> >
> > -#ifdef HAVE_cc0
> >  /* Return 1 if the insn using CC0 set by INSN does not contain
> > any ordered tests applied to the condition codes.
> > EQ and NE tests do not count.  */
> > @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
> >return (INSN_P (next)
> >   && ! inequality_comparisons_p (PATTERN (next)));
> >  }
> > -#endif
> >
> >  /* Return 1 if OP is a valid general operand for machine mode MODE.
> > This is either a register reference, a memory reference,
> > diff --git a/gcc/recog.h b/gcc/recog.h
> > index 45ea671..8a38b26 100644
> > --- a/gcc/recog.h
> > +++ b/gcc/recog.h
> > @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
> >  extern void validate_replace_src_group (rtx, rtx, rtx);
> >  extern bool validate_simplify_insn (rtx insn);
> >  extern int num_changes_pending (void);
> > -#ifdef HAVE_cc0
> >  extern int next_insn_tests_no_inequality (rtx);
> > -#endif
> >  extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
> >
> >  extern int offsettable_memref_p (rtx);
> > diff --git 

Re: [PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:51:14AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
> > that is trivially dead on non cc0 targets.
> > (simplify_set): Likewise.
> > (mark_used_regs_combine): Likewise.
> > * cse.c (new_basic_block): Likewise.
> > (fold_rtx): Likewise.
> > (cse_insn): Likewise.
> > (cse_extended_basic_block): Likewise.
> > (set_live_p): Likewise.
> > * rtlanal.c (canonicalize_condition): Likewise.
> > * simplify-rtx.c (simplify_binary_operation_1): Likewise.
> OK.  I find myself wondering if the conditionals should look like
> if (HAVE_cc0
> && (whatever))
> 
> But I doubt it makes any measurable difference.  It's something we can
> always add in the future if we feel the need to avoid the runtime checks for
> things that aren't ever going to happen on most modern targets.

 yeah, it seems reasonably likely the branch predictor can deal with
 this for us (I tried to ensure things handled this way didn't do much
 other than a compare).  If not, well, that's what profiling is for :-)

 Trev

> 
> jeff
> 
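
For readers skimming the thread, the style of conditional Jeff describes can be sketched like this. It is a toy example under explicit assumptions: HAVE_cc0 is taken to be always defined as 0 or 1 (here it defaults to 0 for the sketch), and insn_sets_cc0 and needs_cc0_handling are hypothetical functions, not GCC code.

```c
#include <assert.h>

/* Assumption for this sketch: the macro is always defined, and is 0 on
   targets that do not use cc0.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Hypothetical stand-in for a cc0-specific predicate.  */
static int
insn_sets_cc0 (int insn_code)
{
  return insn_code == 42;
}

static int
needs_cc0_handling (int insn_code)
{
  /* Unlike an #ifdef block, this code is parsed and type checked on
     every target.  When HAVE_cc0 is the constant 0 the condition is
     trivially false, so the compiler can drop the branch.  */
  if (HAVE_cc0 && insn_sets_cc0 (insn_code))
    return 1;

  return 0;
}
```

The design trade-off is the one discussed above: the plain runtime check is simpler to write, while the explicit HAVE_cc0 test lets the check be folded away on targets where it can never be true.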


Re: [PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread Trevor Saunders
On Tue, Apr 21, 2015 at 07:53:05AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * genconfig.c (main): Always define HAVE_cc0.
> > * caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
> > HAVE_cc0.
> > * cfgcleanup.c (flow_find_cross_jump): Likewise.
> > (flow_find_head_matching_sequence): Likewise.
> > (try_head_merge_bb): Likewise.
> > * cfgrtl.c (rtl_merge_blocks): Likewise.
> > (try_redirect_by_replacing_jump): Likewise.
> > (rtl_tidy_fallthru_edge): Likewise.
> > * combine.c (do_SUBST_MODE): Likewise.
> > (insn_a_feeds_b): Likewise.
> > (combine_instructions): Likewise.
> > (can_combine_p): Likewise.
> > (try_combine): Likewise.
> > (find_split_point): Likewise.
> > (subst): Likewise.
> > (simplify_set): Likewise.
> > (distribute_notes): Likewise.
> > * cprop.c (cprop_jump): Likewise.
> > * cse.c (cse_extended_basic_block): Likewise.
> > * df-problems.c (can_move_insns_across): Likewise.
> > * final.c (final): Likewise.
> > (final_scan_insn): Likewise.
> > * function.c (emit_use_return_register_into_block): Likewise.
> > * gcse.c (insert_insn_end_basic_block): Likewise.
> > * haifa-sched.c (sched_init): Likewise.
> > * ira.c (find_moveable_pseudos): Likewise.
> > * loop-invariant.c (find_invariant_insn): Likewise.
> > * lra-constraints.c (curr_insn_transform): Likewise.
> > * optabs.c (prepare_cmp_insn): Likewise.
> > * postreload.c (reload_combine_recognize_const_pattern):
> > Likewise.
> > * reload.c (find_reloads): Likewise.
> > (find_reloads_address_1): Likewise.
> > * reorg.c (delete_scheduled_jump): Likewise.
> > (steal_delay_list_from_target): Likewise.
> > (steal_delay_list_from_fallthrough): Likewise.
> > (try_merge_delay_insns): Likewise.
> > (redundant_insn): Likewise.
> > (fill_simple_delay_slots): Likewise.
> > (fill_slots_from_thread): Likewise.
> > (delete_computation): Likewise.
> > (relax_delay_slots): Likewise.
> > * sched-deps.c (sched_analyze_2): Likewise.
> > * sched-rgn.c (add_branch_dependences): Likewise.
> Doesn't go as far as I'd like, but it's still an improvement.

Yeah, this one really just enables other nice things.  I really dislike
big patches since there's invariably something wrong somewhere and if
you don't really know the code in question it can be next to impossible
to figure out where the problem is.

Trev

> 
> OK.
> 
> jeff
> 


RE: [PATCH 6/13] mips musl support

2015-04-21 Thread Matthew Fortune
Rich Felker  writes:
> On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
> > Szabolcs Nagy  writes:
> > > Set up dynamic linker name for mips.
> > >
> > > gcc/Changelog:
> > >
> > > 2015-04-16  Gregor Richards  
> > >
> > >   * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.
> >
> > I understand that mips musl is o32 only currently; is that correct?
> 
> This is correct. Other ABIs if/when we support them will have different
> names.
> 
> > There does however appear to be both soft and hard float variants
> > listed in the musl docs. Do you plan on using the same dynamic linker
> > name for both float variants? No problem if so but someone must have
> > decided to have unique names for big and little endian so I thought
> > it worth checking.
> 
> No, it's supposed to be different (-sf suffix for soft float; see
> arch/mips/reloc.h in musl source). If this didn't make it into the
> patches it's an omission, probably because we didn't officially support
> the sf ABI at all for a long time.
> 
> > Also, are you aware of the two nan encoding formats that MIPS has and
> > the support present in glibc's dynamic linker to deal with it?
> 
> I am aware but somewhat skeptical of treating it as yet another
> dimension to ABI and the resulting ABI combinatorics. The vast majority
> of programs couldn't care less which is which and whether a NAN is
> quiet or signaling. Officially we just use the classic mips ABI (with
> qnan/snan swapped vs other archs) but there's no harm in somebody doing
> the opposite if they really know what they're doing.

Couldn't agree more here but I know some people have been concerned about
it so the strict rules were put in place. I will attempt to remember and
copy the musl list when putting out a plan for formally relaxing the nan
encoding rules. The proposal is probably less than 2 weeks away from being
ready to review, it does of course make certain assumptions originating
from glibc as reference but is an independent ABI proposal.
 
> > I wonder if it would be wise to refuse to target musl unless the ABI
> > is known to be supported so that we can avoid compatibility issues
> > when different ABI variants are added in musl.
> 
> Possibly, though this might make bootstrapping new ABIs harder.

Indeed. The other alternative would be to set the dynamic linker name
to something slightly silly for unsupported ABIs like /lib/fixme.so
which would make it possible to bootstrap via the addition of a symlink
but it is clearly not the approved name.

thanks,
Matthew


Re: [PATCH 3/13] aarch64 musl support

2015-04-21 Thread Szabolcs Nagy


On 21/04/15 15:16, pins...@gmail.com wrote:
> 
> I don't think you need to check if defaulting to little or big-endian here, 
> as the specs always have one or the other passed through. 
> 

i was not aware of this

maybe the ifdef is not necessary for other archs either
i will check

> Also if musl does not support ilp32, you might want to error out. Or even 
> define the dynamic linker name even before support goes into musl. 
> 

ok, i guess adding %{mabi=ilp32:_ilp32} won't hurt us



[patch, avr] extend part-clobbered check to AVR_TINY architecture

2015-04-21 Thread Sivanupandi, Pitchumani
Hi,

When I tried backporting AVR_TINY architecture support to 4.9, the build failed in 
libgcc for AVR_TINY.
The failure was due to the same ICE as:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53065

The fix provided for that bug checks whether the mode crosses the callee-saved 
register boundary.
The patch below updates that check, as AVR_TINY has a different set of 
callee-saved registers (r18 and r19).

This patch is against trunk.

NOTE: The ICE is reproducible only with 4.9 + tiny patch and --with-dwarf2 
enabled.

Is this ok for trunk?

diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 68d5ddc..2f441e5 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -11333,9 +11333,10 @@ avr_hard_regno_call_part_clobbered (unsigned regno, machine_mode mode)
 return 0;

   /* Return true if any of the following boundaries is crossed:
- 17/18, 27/28 and 29/30.  */
+ 17/18 or 19/20 (if AVR_TINY), 27/28 and 29/30.  */

-  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+  return ((regno <= LAST_CALLEE_SAVED_REG &&
+   regno + GET_MODE_SIZE (mode) > (LAST_CALLEE_SAVED_REG + 1))
   || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
   || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
 }


Regards,
Pitchumani
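
The boundary test in the patch above can be tried out in isolation with a small C sketch. This is not the avr.c code: the function names are hypothetical, the last callee-saved register is passed as a parameter instead of coming from LAST_CALLEE_SAVED_REG, the values 28 and 30 for REG_Y and REG_Z are taken from the "27/28 and 29/30" comment, and the earlier checks in the real function are left out.

```c
#include <assert.h>

/* Register numbers assumed from the comment in the patch.  */
#define SKETCH_REG_Y 28
#define SKETCH_REG_Z 30

/* Nonzero if a value of SIZE bytes starting at REGNO begins below
   BOUNDARY and extends past it.  */
static int
crosses_boundary (unsigned regno, unsigned size, unsigned boundary)
{
  return regno < boundary && regno + size > boundary;
}

/* Hypothetical stand-in for the final return statement of
   avr_hard_regno_call_part_clobbered.  */
static int
part_clobbered_sketch (unsigned regno, unsigned size,
                       unsigned last_callee_saved_reg)
{
  assert (size > 0);

  return (crosses_boundary (regno, size, last_callee_saved_reg + 1)
          || crosses_boundary (regno, size, SKETCH_REG_Y)
          || crosses_boundary (regno, size, SKETCH_REG_Z));
}
```

Passing 17 as the last callee-saved register reproduces the old 17/18 boundary, and passing 19 gives the 19/20 boundary the patch describes for AVR_TINY.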



Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64

2015-04-21 Thread Martin Sebor

On 04/21/2015 06:39 AM, Peter Bergner wrote:

On Tue, 2015-04-21 at 08:22 +0200, Jakub Jelinek wrote:

-#if defined(__powerpc__) || defined(__powerpc64__)
-  // PCs are always 4 byte aligned.
-  return pc - 4;
-#elif defined(__sparc__) || defined(__mips__)
-  return pc - 8;


The SPARC/MIPS case is of course needed, because on these architectures
the call is followed by a delay slot.  But I wonder why you need anything
special on any other architecture, why pc - 1 isn't good enough for those.
The point isn't to find a PC of the call instruction, on some targets that
is very hard and you need to disassemble, but to just find some byte in the
call instruction.


I wrote the "pc - 4" code for powerpc* and I guess I was just
being pedantic on returning the first address of the instruction.
If using "pc - 1" works, then I'm fine with that.


It works fine with the patch and produces sensible output
because the decremented address is only used to look up
the debug info and restored before it's output. Otherwise
(with the unpatched code) we'd end up with odd PC addresses
in the stack trace.

Martin
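
A minimal sketch of the idea being discussed, in C and with hypothetical names: the return address is adjusted only to find an address inside the call instruction for the debug info lookup, while the value that gets printed stays unchanged. The delay-slot flag is an assumption standing in for the SPARC/MIPS case; this is not the libsanitizer implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a return address to an address suitable for
   looking up the call site.  On targets where the call is followed by
   a delay slot the quoted code steps back 8 bytes; elsewhere any byte
   of the call instruction is enough, so 1 byte is used.  */
static uintptr_t
lookup_address (uintptr_t return_pc, int call_has_delay_slot)
{
  assert (return_pc >= 8);

  return call_has_delay_slot ? return_pc - 8 : return_pc - 1;
}
```

A caller would pass lookup_address (pc, ...) to the symbolizer but still print pc itself, which is why the odd-looking decremented value never shows up in the stack trace.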



Peter





Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-04-21 Thread Jiong Wang

Jiong Wang writes:

> 2015-04-14 18:24 GMT+01:00 Jeff Law :
>> On 04/14/2015 10:48 AM, Steven Bosscher wrote:

 So I think this stage2/3 binary difference is acceptable?
>>>
>>>
>>> No, they should be identical. If there's a difference, then there's a
>>> bug - which, it seems, you've already found, too.
>>
>> Right.  And so the natural question is how to fix.
>>
>> At first glance it would seem like having this new code ignore dependencies
>> rising from debug insns would work.
>>
>> Which then begs the question, what happens to the debug insn -- it's
>> certainly not going to be correct anymore if the transformation is made.
>
> Exactly.
>
> The debug_insn 2776 in my example is to record the base address of a
> local array. The new code behaves correctly here by not shuffling the
> operands of insn 2556 and 2557, as there is an additional reference to
> reg:1473 from the debug insn, although the code would still execute correctly
> if we did the transformation.
>
> my understanding to fix this:
>
>   * delete the out-of-date mismatch debug_insn? as there is no guarantee
> to generate accurate debug info under -O2.
>
> IMO, this debug_insn may affect "DW_AT_location" field for variable
> description of "classes" in .debug_info section, but it's omitted in
> the final output already.
>
> <3><38a4d>: Abbrev Number: 137 (DW_TAG_variable)
> <38a4f>   DW_AT_name : (indirect string, offset: 0x18db): classes
> <38a53>   DW_AT_decl_file   : 1
> <38a54>   DW_AT_decl_line   : 548
> <38a56>   DW_AT_type: <0x38cb4>
>
>   * update the debug_insn? if the following change is OK with dwarf standard
>
>from
>
>  insn0: reg0 = fp + reg1
>  debug_insn: var_loc = reg0 + const_off
>  insn1: reg2 = reg0 + const_off
>
>to
>
>  insn0: reg0 = fp + const_off
>  debug_insn: var_loc = reg0 + reg1
>  insn1: reg2 = reg0 + reg1
>
> Thanks,
>

Attached is the new patch, which updates the debug_insn as
described in the second solution above.
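
The operand reshuffle quoted above relies on nothing more than reordering the additions. A small C sketch, with hypothetical function names and unsigned arithmetic standing in for address computation, shows that both orders produce the same final address while moving the constant offset next to the frame pointer:

```c
#include <stdint.h>

/* Before: insn0 adds the varying register first.  */
static uintptr_t
address_before (uintptr_t fp, uintptr_t reg1, uintptr_t const_off)
{
  uintptr_t reg0 = fp + reg1;       /* insn0: reg0 = fp + reg1 */
  return reg0 + const_off;          /* insn1: reg2 = reg0 + const_off */
}

/* After: insn0 depends only on fp and a constant, so it no longer
   changes with reg1 and becomes a candidate for hoisting.  */
static uintptr_t
address_after (uintptr_t fp, uintptr_t reg1, uintptr_t const_off)
{
  uintptr_t reg0 = fp + const_off;  /* insn0: reg0 = fp + const_off */
  return reg0 + reg1;               /* insn1: reg2 = reg0 + reg1 */
}
```

The debug_insn has to follow the same reshuffle, because after the change reg0 no longer holds fp + reg1.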

Now the stage2/3 binary differences on AArch64 have gone away. Bootstrap OK.

On AArch64, this patch gives 600+ new rtl loop invariants found across
spec2k6 float, and a +4.5% perf improvement on 436.cactusADM because four new
invariants are found in the critical function "regex_compile".

Similar improvements may be achievable on other RISC backends like
powerpc/mips, I guess.

One thing to mention: for AArch64, one minor glitch in
aarch64_legitimize_address needs to be fixed to let this patch take
effect. I will send out that patch later as it's a separate issue.
Powerpc/Mips don't have this glitch in the LEGITIMIZE_ADDRESS hook, so
they should be OK, and I verified that the base address of the local array
in the testcase given by Seb on pr62173 is hoisted on ppc64 now. I think
pr62173 is fixed on those 64bit archs by this patch.

Thoughts?

Thanks.

2015-04-21  Jiong Wang  

gcc/
  * loop-invariant.c (find_defs): Enable DF_DU_CHAIN build.
  (vfp_const_iv): New hash table.
  (need_expensive_addr_check_p): New boolean.
  (init_inv_motion_data): Initialize new variables.
  (free_inv_motion_data): Release hash table.
  (create_new_invariant): Set cheap_address to false for iv in
  vfp_const_iv table.
  (find_invariant_insn): Skip dependencies check for iv in vfp_const_iv
  table.
  (use_for_single_du): New function.
  (reshuffle_insn_with_vfp): Likewise.
  (find_invariants_bb): Call reshuffle_insn_with_vfp.

gcc/testsuite/
   * gcc.dg/pr62173.c: New testcase.

-- 
Regards,
Jiong

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..f70dfb0 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -203,6 +203,8 @@ typedef struct invariant *invariant_p;
 /* The invariants.  */
 
 static vec invariants;
+static hash_table  > *vfp_const_iv;
+static bool need_expensive_addr_check_p;
 
 /* Check the size of the invariant table and realloc if necessary.  */
 
@@ -695,7 +697,7 @@ find_defs (struct loop *loop)
 
   df_remove_problem (df_chain);
   df_process_deferred_rescans ();
-  df_chain_add_problem (DF_UD_CHAIN);
+  df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
   df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
   df_analyze_loop (loop);
   check_invariant_table_size ();
@@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
 	 See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html .  */
   inv->cheap_address = address_cost (SET_SRC (set), word_mode,
 	 ADDR_SPACE_GENERIC, speed) < 3;
+
+  if (need_expensive_addr_check_p && vfp_const_iv->find (insn))
+	inv->cheap_address = false;
 }
   else
 {
@@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed)
 return;
 
   depends_on = BITMAP_ALLOC (NULL);
-  if (!check_dependencies (insn, depends_on))
+  if (!vfp_const_iv->find (insn)
+  && !check_dependencies (insn, depends_on))
 {
   BITMAP_FREE (depends_on);
   return;
@@ -1007,6 +1013,180 @@ find_invariants_insn (rtx_insn *in

Re: [PATCH 6/13] mips musl support

2015-04-21 Thread Rich Felker
On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
> Szabolcs Nagy  writes:
> > Set up dynamic linker name for mips.
> > 
> > gcc/Changelog:
> > 
> > 2015-04-16  Gregor Richards  
> > 
> > * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.
> 
> I understand that mips musl is o32 only currently; is that correct?

This is correct. Other ABIs if/when we support them will have
different names.

> There does however appear to be both soft and hard float variants
> listed in the musl docs. Do you plan on using the same dynamic linker
> name for both float variants? No problem if so but someone must have
> decided to have unique names for big and little endian so I thought
> it worth checking.

No, it's supposed to be different (-sf suffix for soft float; see
arch/mips/reloc.h in musl source). If this didn't make it into the
patches it's an omission, probably because we didn't officially
support the sf ABI at all for a long time.

> Also, are you aware of the two nan encoding formats that MIPS has
> and the support present in glibc's dynamic linker to deal with it?

I am aware but somewhat skeptical of treating it as yet another
dimension to ABI and the resulting ABI combinatorics. The vast
majority of programs couldn't care less which is which and whether a
NAN is quiet or signaling. Officially we just use the classic mips ABI
(with qnan/snan swapped vs other archs) but there's no harm in
somebody doing the opposite if they really know what they're doing.

> I wonder if it would be wise to refuse to target musl unless the
> ABI is known to be supported so that we can avoid compatibility
> issues when different ABI variants are added in musl.

Possibly, though this might make bootstrapping new ABIs harder.

Rich


Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64

2015-04-21 Thread Martin Sebor

--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,3 +1,15 @@
+2015-04-19  Martin Sebor  
+
+   PR sanitizer/65479
+   * libsanitizer/sanitizer_common/sanitizer_stacktrace.h
+   (StackTrace::signaled, StackTrace::min_insn_bytes): New data members.
+   (StackTrace::StackTrace): Initialize signaled.
+   * libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
+   (StackTrace::GetPreviousInstructionPc): Rewrite.
+   * libsanitizer/sanitizer_common/sanitizer_stacktrace_libcdep.cc
+   (StackTrace::Print): Use min_insn_bytes to adjust PC value.
+   (BufferedStackTrace::Unwind): Set signaled.


libsanitizer/ should not show up in the ChangeLog entry.
But as somebody said earlier, the libsanitizer changes really should go
to LLVM compiler-rt repo first and then be just backported, either
cherry-picked (probably the case for the 5 branch backport later on) or go in
full merge from compiler-rt.


Okay, let me submit the sanitizer changes there. Since the
tests will continue to fail without it, the libbacktrace
change can go in later if that's preferable.




--- a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
@@ -15,19 +15,33 @@

  namespace __sanitizer {

-uptr StackTrace::GetPreviousInstructionPc(uptr pc) {
-#if defined(__arm__)
-  // Cancel Thumb bit.
-  pc = pc & (~1);
-#endif


Your code loses this, which is undesirable.


The original function fails to return the pc value on ARM
so I just took it out. I didn't look into what the intent
was but all the tests pass with the patch on aarch64 (after
applying the Fedora gcc 5 patch you mentioned yesterday).




-#if defined(__powerpc__) || defined(__powerpc64__)
-  // PCs are always 4 byte aligned.
-  return pc - 4;
-#elif defined(__sparc__) || defined(__mips__)
-  return pc - 8;


The SPARC/MIPS case is of course needed, because on these architectures
the call is followed by a delay slot.  But I wonder why you need anything
special on any other architecture, why pc - 1 isn't good enough for those.
The point isn't to find a PC of the call instruction, on some targets that
is very hard and you need to disassemble, but to just find some byte in the
call instruction.
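
A minimal sketch of the adjustment suggested here, assuming only two
cases: delay-slot targets keep the pc - 8 from the original code, and
everything else uses pc - 1. The function name previous_insn_pc and the
has_delay_slot parameter are hypothetical; this is not the libsanitizer
implementation.

```c
#include <stdint.h>

/* Hypothetical model of the adjustment discussed above.  PC is the
   return address recorded in the stack trace.  */
uintptr_t
previous_insn_pc (uintptr_t pc, int has_delay_slot)
{
  /* The original code uses pc - 8 on SPARC and MIPS, where the call
     is followed by a delay slot.  */
  if (has_delay_slot)
    return pc - 8;

  /* Elsewhere the goal is only to land on some byte inside the call
     instruction, so stepping back one byte is enough.  */
  return pc - 1;
}
```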


I forgot about the delay slot. Thanks for the reminder.




+const unsigned StackTrace::min_insn_bytes =
+#if defined __ia64__
+// Intel Itanium has 5 byte instructions.
+5


E.g. this is wrong, ia64 doesn't have 5 byte instructions, but has VLIW
bundles, where in the 16 byte bundle there are up to 3 41-bit instructions
plus template.  But, ia64 isn't supported by libsanitizer and I doubt there
are enough users that would be interested in writing support for a dead
architecture.
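
The bundle arithmetic can be checked with a short sketch. The 5-bit
template width is not stated above and is filled in here from the
Itanium bundle format; the identifier names are made up for the
example.

```c
/* Itanium bundle layout, expressed as constants.  Names are
   hypothetical; only the numbers matter.  */
enum
{
  IA64_BUNDLE_BYTES = 16,  /* size of one bundle in bytes */
  IA64_SLOTS = 3,          /* instruction slots per bundle */
  IA64_SLOT_BITS = 41,     /* bits per instruction slot */
  IA64_TEMPLATE_BITS = 5   /* template field (assumed width) */
};

/* Bits occupied by the three slots plus the template.  */
unsigned
ia64_bundle_bits (void)
{
  return IA64_SLOTS * IA64_SLOT_BITS + IA64_TEMPLATE_BITS;
}
```

Three 41-bit slots plus the template fill the 16-byte bundle exactly,
so no slot is a whole number of bytes and a per-instruction byte count
has no natural value on this target.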


I suppose with the sanitizer output referencing the unmodified
PC values on the stack the computation can be simplified to
just subtract (and later add) 1 on all targets. Let me change
that.

Martin


Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:29:52PM +0200, Matthias Klose wrote:
> On 04/21/2015 04:19 PM, Jakub Jelinek wrote:
> > On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
> >> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> >>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
>  bump the libgcj soname on the trunk, as done for every release cycle,
> >>>
> >>> Is that really needed though these days?
> >>> Weren't there basically zero changes to libjava (both libjava and
> >>> libjava/classpath) in the last 2 or more years?
> >>> The few ones were mostly updating Copyright notices, minor configure
> >>> changes, but I really haven't seen anything ABI changing for quite a 
> >>> while.
> >>
> >> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
> >>
> >> which is defined as
> >>
> >> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
> >> dbexecdir='$(toolexeclibdir)/'$gcjsubdir
> > 
> > But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
> > (likely) provide the same ABI in the library, there is no reason not to use
> > the same directory for those.
> 
> but currently there are different directories used (gcjversion already changed
> on the trunk) and compiled into the library.  Do you mean that gcjsubdir 
> should
> be just defined as gcj?

What depends on BASE-VER sure, that is bumped automatically and should track
the gcc version.  But the soname, which is an unrelated number, there is no
point to bump it.  If you have a packaging issue, just solve it on the
packaging side, but really there is no point to yearly bump a soname of
something that doesn't change at all (and is really dead project for many
years).

Jakub


[PATCH][libstc++v3]Add new dg-require-thread-fence directive.

2015-04-21 Thread Renlin Li

Hi all,

This patch defines a new dg-require-thread-fence directive and updates
three test cases to use it.


The new directive checks whether the target supports a thread fence,
either through the target back end or through an external library call.
A thread fence is required to expand atomic loads and stores.


In some cases GCC emits a call to an external __sync_synchronize that is
not implemented, which leads to link errors such as: undefined reference
to `__sync_synchronize`. Test cases gated by this directive are skipped
when no thread fence is available. The three test cases updated here are
an example: they fail on the arm-none-eabi target, where
__sync_synchronize() isn't implemented and the target cpu has no
memory_barrier.


__sync_synchronize () is used to check whether a thread fence is
available. In GCC, sync_synchronize is expanded as
expand_mem_thread_fence (MEMMODEL_SEQ_CST).
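
As a rough sketch of what the probe exercises, the two GCC builtins
below both request a full, sequentially consistent barrier. The wrapper
names and the publish helper are hypothetical and exist only for the
example; whether either builtin ends up as an inline instruction or as
a call to an external __sync_synchronize depends on the target.

```c
/* Legacy __sync builtin: full memory barrier.  */
void
full_fence_legacy (void)
{
  __sync_synchronize ();
}

/* __atomic builtin requesting the same ordering.  */
void
full_fence_atomic (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

/* Hypothetical use: store VALUE, then raise FLAG.  The fence keeps
   the first store ordered before the second.  */
void
publish (int *value, int *flag, int v)
{
  *value = v;
  full_fence_legacy ();
  __atomic_store_n (flag, 1, __ATOMIC_RELAXED);
}
```

On a target with neither a barrier instruction nor a library
implementation, linking code like this would be expected to fail with
the undefined reference quoted above, which is the condition the new
directive detects.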


Okay to commit?


libstdc++-v3/ChangeLog:

2015-04-21  Renlin Li  

* testsuite/lib/dg-options.exp (dg-require-thread-fence): New.
* testsuite/lib/libstdc++.exp (check_v3_target_thread_fence): New.
* testsuite/29_atomics/atomic_flag/clear/1.cc: Use it.
* testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc: Likewise.
* testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc: Likewise.
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
index 0a4219c..a6e2299 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2009-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
index 2ff740b..0655be4 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2008-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
index 6ac20c0..a867da2 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2008-2015 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp
index 38c8206..56ca896 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -115,6 +115,15 @@ proc dg-require-cmath { args } {
 return
 }
 
+proc dg-require-thread-fence { args } {
+if { ![ check_v3_target_thread_fence ] } {
+	upvar dg-do-what dg-do-what
+	set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
+	return
+}
+return
+}
+
 proc dg-require-atomic-builtins { args } {
 if { ![ check_v3_target_atomic_builtins ] } {
 	upvar dg-do-what dg-do-what
diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index b2f7d00..9e395e2 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1221,6 +1221,62 @@ proc check_v3_target_cmath { } {
 return $et_c99_math
 }
 
+proc check_v3_target_thread_fence { } {
+global cxxflags
+global DEFAULT_CXXFLAGS
+global et_thread_fence
+
+global tool
+
+if { ![info exists et_thread_fence_target_name] } {
+	set et_thread_fence_target_name ""
+}
+
+# If the target has changed since we set the cached value, clear it.
+set current_target [current_target_name]
+if { $current_target != $et_thread_fence_target_name } {
+	verbose "check_v3_target_thread_fence: `$et_thread_fence_target_name'" 2
+	set et_thread_fence_target_name $current_target
+	if [info exists et_thread_fence] {
+	verbose "check_v3_target_thread_fence: removing cached result" 2
+	unset et_thread_fence
+	}
+}
+
+if [info exists et_thread_fence] {
+	verbose "check_v3_target_thread_fence: using cached result" 2
+} else {
+	set et_thread_fence 0
+
+	# Set up and preprocess a C++11 test program that depends
+	# on the thread fence to be available.
+	set src thread_fence[pid].cc
+
+	set f [open $src "w"]
+	puts $f "int main() {"
+	puts $f "__sync_synchronize ();"
+	puts $f "return 0;"
+	puts $f "}"
+	close $f
+
+	set cxxflags_saved $cxxflags
+	set cxxflags "$cxxflags $DEFAULT_CXXFLAGS -Werror -std=gnu++11"
+
+	set lines [v3_target_compile $src /dev/null executable ""]
+	set cxxflags $cxxflag

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
On 04/21/2015 04:19 PM, Jakub Jelinek wrote:
> On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
>> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
>>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
 bump the libgcj soname on the trunk, as done for every release cycle,
>>>
>>> Is that really needed though these days?
>>> Weren't there basically zero changes to libjava (both libjava and
>>> libjava/classpath) in the last 2 or more years?
>>> The few ones were mostly updating Copyright notices, minor configure
>>> changes, but I really haven't seen anything ABI changing for quite a while.
>>
>> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
>>
>> which is defined as
>>
>> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
>> dbexecdir='$(toolexeclibdir)/'$gcjsubdir
> 
> But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
> (likely) provide the same ABI in the library, there is no reason not to use
> the same directory for those.

but currently there are different directories used (gcjversion already changed
on the trunk) and compiled into the library.  Do you mean that gcjsubdir should
be just defined as gcj?

Matthias



Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Richard Biener
On Tue, Apr 21, 2015 at 3:24 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-04-21  Trevor Saunders  
>
> * conditions.h: Define macros even if HAVE_cc0 is undefined.
> * emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
> * final.c: Likewise.
> * jump.c: Likewise.
> * recog.c: Likewise.
> * recog.h: Declare functions even when HAVE_cc0 is undefined.
> * sched-deps.c (sched_analyze_2): Always compile case for cc0.
> ---
>  gcc/conditions.h | 6 --
>  gcc/emit-rtl.c   | 2 --
>  gcc/final.c  | 2 --
>  gcc/jump.c   | 3 ---
>  gcc/recog.c  | 2 --
>  gcc/recog.h  | 2 --
>  gcc/sched-deps.c | 5 +++--
>  7 files changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/conditions.h b/gcc/conditions.h
> index 2308bfc..7cd1e1c 100644
> --- a/gcc/conditions.h
> +++ b/gcc/conditions.h
> @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef GCC_CONDITIONS_H
>  #define GCC_CONDITIONS_H
>
> -/* None of the things in the files exist if we don't use CC0.  */
> -
> -#ifdef HAVE_cc0
> -
>  /* The variable cc_status says how to interpret the condition code.
> It is set by output routines for an instruction that sets the cc's
> and examined by output routines for jump instructions.
> @@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
>   (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
>CC_STATUS_MDEP_INIT)
>
> -#endif
> -
>  #endif /* GCC_CONDITIONS_H */
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 483eacb..c1974bb 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
>return insn;
>  }
>
> -#ifdef HAVE_cc0
>  /* Return the next insn that uses CC0 after INSN, which is assumed to
> set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
> applied to the result of this function should yield INSN).
> @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
>
>return insn;
>  }
> -#endif
>
>  #ifdef AUTO_INC_DEC
>  /* Find a RTX_AUTOINC class rtx which matches DATA.  */
> diff --git a/gcc/final.c b/gcc/final.c
> index 1fa93d9..41f6bd9 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
>
>  static int insn_counter = 0;
>
> -#ifdef HAVE_cc0
>  /* This variable contains machine-dependent flags (defined in tm.h)
> set and examined by output routines
> that describe how to interpret the condition codes properly.  */
> @@ -202,7 +201,6 @@ CC_STATUS cc_status;
> from before the insn.  */
>
>  CC_STATUS cc_prev_status;
> -#endif
>
>  /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
>
> diff --git a/gcc/jump.c b/gcc/jump.c
> index 34b3b7b..bc91550 100644
> --- a/gcc/jump.c
> +++ b/gcc/jump.c
> @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
>   && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn)));
>  }
>
> -#ifdef HAVE_cc0
> -
>  /* Return nonzero if X is an RTX that only sets the condition codes
> and has no side effects.  */
>
> @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
>  }
>return 0;
>  }
> -#endif
>
>  /* Find all CODE_LABELs referred to in X, and increment their use
> counts.  If INSN is a JUMP_INSN and there is at least one
> diff --git a/gcc/recog.c b/gcc/recog.c
> index a9d3b1f..c3ad86f 100644
> --- a/gcc/recog.c
> +++ b/gcc/recog.c
> @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
>return ((num_changes_pending () > 0) && (apply_change_group () > 0));
>  }
>
> -#ifdef HAVE_cc0
>  /* Return 1 if the insn using CC0 set by INSN does not contain
> any ordered tests applied to the condition codes.
> EQ and NE tests do not count.  */
> @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
>return (INSN_P (next)
>   && ! inequality_comparisons_p (PATTERN (next)));
>  }
> -#endif
>
>  /* Return 1 if OP is a valid general operand for machine mode MODE.
> This is either a register reference, a memory reference,
> diff --git a/gcc/recog.h b/gcc/recog.h
> index 45ea671..8a38b26 100644
> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
>  extern void validate_replace_src_group (rtx, rtx, rtx);
>  extern bool validate_simplify_insn (rtx insn);
>  extern int num_changes_pending (void);
> -#ifdef HAVE_cc0
>  extern int next_insn_tests_no_inequality (rtx);
> -#endif
>  extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
>
>  extern int offsettable_memref_p (rtx);
> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> index 5434831..31de6be 100644
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, 
> rtx_insn *insn)
>
>return;
>
> -#ifdef HAVE_cc0
>  case CC0:
> +#ifdef HAVE_cc0

#ifndef ?

> +  gcc_unreachable ();
> +#endif
>/* Us
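
If the intent is that the CC0 case is always compiled but must never
execute on a target without cc0, the guard would read as in the sketch
below. The enum, the function, and the use of abort () in place of
gcc_unreachable () are stand-ins for illustration, not GCC code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for rtx codes.  */
enum code { CODE_REG, CODE_CC0 };

int
analyze (enum code c)
{
  switch (c)
    {
    case CODE_CC0:
#ifndef HAVE_cc0
      /* No cc0 on this target: the case is compiled, but reaching it
         is a bug.  */
      abort ();
#endif
      return 1;

    default:
      return 0;
    }
}
```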

Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote:
> On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> >> bump the libgcj soname on the trunk, as done for every release cycle,
> > 
> > Is that really needed though these days?
> > Weren't there basically zero changes to libjava (both libjava and
> > libjava/classpath) in the last 2 or more years?
> > The few ones were mostly updating Copyright notices, minor configure
> > changes, but I really haven't seen anything ABI changing for quite a while.
> 
> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR
> 
> which is defined as
> 
> gcjsubdir=gcj-$gcjversion-$libgcj_soversion
> dbexecdir='$(toolexeclibdir)/'$gcjsubdir

But why is that an argument for bumping it?  If both GCC 5 and GCC 6 will
(likely) provide the same ABI in the library, there is no reason not to use
the same directory for those.

Jakub


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-21 Thread Manuel López-Ibáñez

On 21/04/15 13:16, Marek Polacek wrote:

(-Wlogical-op still isn't enabled by either -Wall or -Wextra.)


The reason is https://gcc.gnu.org/PR61534

which means we don't want to warn for:

extern int xxx;
#define XXX xxx
int test (void)
{
  if (!XXX && xxx)
    return 4;
  else
    return 0;
}

(gcc/testsuite/gcc.dg/pr40172-3.c, although it should be moved to c-c++-common)

As noted in the PR: The problem is that !XXX becomes XXX == 0, but it has the 
location of "!", which is not virtual. If we look at the argument of the 
expression, then XXX is actually a var_decl, whose location corresponds to the 
declaration and not the use, and it is not virtual either. This is PR43486.
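
A self-contained variant of the testcase, assuming xxx is defined
locally so that it links on its own, shows why the condition looks
suspicious to the warning: after preprocessing it is `!xxx && xxx`,
which is false for every value of xxx.

```c
int xxx;            /* defined here so the example links by itself */
#define XXX xxx

int
test (void)
{
  /* Expands to `!xxx && xxx`, which can never be true.  */
  if (!XXX && xxx)
    return 4;
  else
    return 0;
}
```

The source as written uses two different spellings, one of them a
macro that could expand to something else in another configuration,
which is the reason given in the PR for not warning.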




Bootstrapped/regtested on x86_64-linux, ok for trunk?


Does it pass bootstrap if you enable it? That is, is GCC itself -Wlogical-op 
clean?

Cheers,

Manuel.


Re: [PATCH 3/13] aarch64 musl support

2015-04-21 Thread pinskia




> On Apr 20, 2015, at 11:52 AM, Szabolcs Nagy  wrote:
> 
> Set up dynamic linker name for aarch64.
> 
> gcc/Changelog:
> 
> 2015-04-16  Gregor Richards  
>Szabolcs Nagy  
> 
>* config/aarch64/aarch64-linux.h (MUSL_DYNAMIC_LINKER): Define.


I don't think you need to check if defaulting to little or big-endian here, as 
the specs always have one or the other passing through. 

Also if musl does not support ilp32, you might want to error out. Or even 
define the dynamic linker name even before support goes into musl. 

Thanks,
Andrew

> <03-aarch64.patch>


Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
On 04/21/2015 04:11 PM, Jakub Jelinek wrote:
> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
>> bump the libgcj soname on the trunk, as done for every release cycle,
> 
> Is that really needed though these days?
> Weren't there basically zero changes to libjava (both libjava and
> libjava/classpath) in the last 2 or more years?
> The few ones were mostly updating Copyright notices, minor configure
> changes, but I really haven't seen anything ABI changing for quite a while.

yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR

which is defined as

gcjsubdir=gcj-$gcjversion-$libgcj_soversion
dbexecdir='$(toolexeclibdir)/'$gcjsubdir
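
A small sketch of how that directory name is composed, mirroring the
gcjsubdir assignment above. The function name gcj_subdir and the
version strings used with it are placeholders, not values taken from
the tree.

```c
#include <stdio.h>
#include <stddef.h>

/* Mirrors gcjsubdir=gcj-$gcjversion-$libgcj_soversion.  */
int
gcj_subdir (char *buf, size_t len, const char *gcjversion,
            int soversion)
{
  return snprintf (buf, len, "gcj-%s-%d", gcjversion, soversion);
}
```

Because gcjversion is part of the name, two releases with different
gcjversion values get different directories even when the soversion
component is the same.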



Re: [patch] [java] bump libgcj soname

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> bump the libgcj soname on the trunk, as done for every release cycle,

Is that really needed though these days?
Weren't there basically zero changes to libjava (both libjava and
libjava/classpath) in the last 2 or more years?
The few ones were mostly updating Copyright notices, minor configure
changes, but I really haven't seen anything ABI changing for quite a while.

Jakub


Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine

2015-04-21 Thread Kyrill Tkachov


On 21/04/15 15:06, Jeff Law wrote:

On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:


Though I do wonder if, in practice, we can identify those cases that do
simplify more directly a priori and just punt everything else rather than
this rather convoluted approach.

You mean like calling simplify_binary_operation that returns NULL
if no simplification is possible?

Not entirely sure, just a general sense that we're doing far more work
here than is justified by the potential gains.  The cases we care about
are very limited (negated or duplicated arguments) and I'd be surprised
if they're still showing up in combine.c these days.  I didn't look at
the history of that code, but I suspect it is *very very* old.


I had a look when I was writing that patch and it was
from 2005 (r96681).



I'm not asking you to tackle this problem, it was more meant as an
observation.  But if you want to dig deeper, go for it.  If it were me,
the first thing I'd do is try to construct a testcase that would get me
into that code -- I'd bet it's hard, particularly with the tree and rtl
reassociations we do these days.


Yeah, the comment does mention that it's supposed to
trigger rarely. I'm looking at it from the perspective
of cleaning up rtx cost usages though.

Thanks,
Kyrill




Jeff





Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-21 Thread Jeff Law

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:


 From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.
No simulator, no machines in the testfarm, the box I had access to via 
parisc-linux.org seems dead and my ancient PA overheats well before a 
bootstrap could complete.  I often regret knowing about the backwards 
way many things were done on the PA because it makes me think about 
cases that only matter on dead architectures.



Jeff



[patch] [java] bump libgcj soname

2015-04-21 Thread Matthias Klose
bump the libgcj soname on the trunk, as done for every release cycle, and update
the cygwin/mingw32 files.

ok for the trunk?

  Matthias


gcc/

2015-04-21  Matthias Klose  

	* config/i386/cygwin.h (LIBGCJ_SONAME): Set libgcj version to -17.
	* config/i386/mingw32.h (LIBGCJ_SONAME): Set libgcj version to -17.

libjava/

2015-04-21  Matthias Klose  

	* libtool-version: Bump soversion.

Index: gcc/config/i386/cygwin.h
===
--- gcc/config/i386/cygwin.h	(revision 68)
+++ gcc/config/i386/cygwin.h	(working copy)
@@ -154,5 +154,5 @@
 #define LIBGCC_SONAME "cyggcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-16.dll"
+#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-17.dll"
 
Index: gcc/config/i386/mingw32.h
===
--- gcc/config/i386/mingw32.h	(revision 68)
+++ gcc/config/i386/mingw32.h	(working copy)
@@ -254,4 +254,4 @@
 #define LIBGCC_SONAME "libgcc_s" LIBGCC_EH_EXTN "-1.dll"
 
 /* We should find a way to not have to update this manually.  */
-#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-16.dll"
+#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-17.dll"
Index: libjava/libtool-version
===
--- libjava/libtool-version	(revision 68)
+++ libjava/libtool-version	(working copy)
@@ -5,4 +5,4 @@
 # Note: When changing the version here, please do also update LIBGCJ_SONAME
 # in gcc/config/i386/cygwin.h and gcc/config/i386/mingw32.h.
 # CURRENT:REVISION:AGE
-16:0:0
+17:0:0


Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine

2015-04-21 Thread Jeff Law

On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:


Though I do wonder if, in practice, we can identify those cases that do
simplify more directly a priori and just punt everything else rather than
this rather convoluted approach.


You mean like calling simplify_binary_operation that returns NULL
if no simplification is possible?
Not entirely sure, just a general sense that we're doing far more work 
here than is justified by the potential gains.  The cases we care about 
are very limited (negated or duplicated arguments) and I'd be surprised 
if they're still showing up in combine.c these days.  I didn't look at 
the history of that code, but I suspect it is *very very* old.


I'm not asking you to tackle this problem, it was more meant as an 
observation.  But if you want to dig deeper, go for it.  If it were me, 
the first thing I'd do is try to construct a testcase that would get me 
into that code -- I'd bet it's hard, particularly with the tree and rtl 
reassociations we do these days.



Jeff


Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jeff Law

On 04/21/2015 08:00 AM, Jakub Jelinek wrote:

On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote:

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h: New definition of EH_RETURN_DATA_REGNO.
* except.c: Remove definition of EH_RETURN_DATA_REGNO.
* builtins.c (expand_builtin): Remove check if
EH_RETURN_DATA_REGNO is defined.
* df-scan.c (df_bb_refs_collect): Likewise.
(df_get_exit_block_use_set): Likewise.
* haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
* ira-lives.c (process_bb_node_lives): Likewise.
* lra-lives.c (process_bb_lives): Likewise.

This one wasn't as obvious as the others, but is clearly OK once the full
loops being guarded by EH_RETURN_DATA_REGNO are examined.


Except that the bb_has_eh_pred predicate might burn CPU time for basic
blocks with many predecessors.  Though, the question is if there are any
important targets that don't define EH_RETURN_DATA_REGNO already.
Probably not since they'll blow up elsewhere (I was recently helping 
someone with a private port that didn't define EH_RETURN_DATA_REGNO) :-)

jeff


Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jakub Jelinek
On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders 
> >
> >gcc/ChangeLog:
> >
> >2015-04-21  Trevor Saunders  
> >
> > * defaults.h: New definition of EH_RETURN_DATA_REGNO.
> > * except.c: Remove definition of EH_RETURN_DATA_REGNO.
> > * builtins.c (expand_builtin): Remove check if
> > EH_RETURN_DATA_REGNO is defined.
> > * df-scan.c (df_bb_refs_collect): Likewise.
> > (df_get_exit_block_use_set): Likewise.
> > * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
> > * ira-lives.c (process_bb_node_lives): Likewise.
> > * lra-lives.c (process_bb_lives): Likewise.
> This one wasn't as obvious as the others, but is clearly OK once the full
> loops being guarded by EH_RETURN_DATA_REGNO are examined.

Except that the bb_has_eh_pred predicate might burn CPU time for basic
blocks with many predecessors.  Though, the question is if there are any
important targets that don't define EH_RETURN_DATA_REGNO already.
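
To make the cost concern concrete, here is a simplified model of such a
predicate: a linear scan over the predecessor edges looking for an EH
edge. The struct and macro names are hypothetical stand-ins for GCC's
basic_block, edge and EDGE_EH, so this is a sketch of the shape of the
check rather than the real definition.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's edge flags and CFG types.  */
#define EDGE_EH_MODEL 0x1

struct edge_model
{
  unsigned flags;
};

struct bb_model
{
  const struct edge_model *preds;  /* predecessor edges */
  size_t n_preds;                  /* number of predecessors */
};

/* Return nonzero if any predecessor edge is an EH edge.  The scan is
   linear in n_preds, which is where the cost comes from for blocks
   with many predecessors.  */
int
bb_has_eh_pred_model (const struct bb_model *bb)
{
  for (size_t i = 0; i < bb->n_preds; i++)
    if (bb->preds[i].flags & EDGE_EH_MODEL)
      return 1;
  return 0;
}
```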

Jakub


[PATCH][AArch64] Add branch-cost to cpu tuning information.

2015-04-21 Thread Matthew Wahab

The AArch64 backend sets BRANCH_COST to be the constant value 2 for all cpus,
meaning that the compiler thinks that branches cost the same across all cpus.

This patch reworks the handling of branch costs to allow per-cpu values to be
set. The actual branch-cost values are unchanged, as the correct values
will need to be decided for each core.
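
As a standalone sketch of the selection logic in aarch64_branch_cost
from the patch below, the following uses two deliberately different,
made-up costs so that the chosen field is visible. The names
example_cost and example_branch_cost are hypothetical; the patch itself
keeps both values at 2.

```c
struct branch_cost_model
{
  int predictable;    /* predictable branch or optimizing for size */
  int unpredictable;  /* unpredictable branch, optimizing for speed */
};

/* Hypothetical per-core values, chosen only so the two cases differ.  */
static const struct branch_cost_model example_cost = { 1, 3 };

int
example_branch_cost (int speed_p, int predictable_p)
{
  /* Same condition as in the patch: the unpredictable cost applies
     only when optimizing for speed and the branch is unpredictable.  */
  if (!speed_p || predictable_p)
    return example_cost.predictable;
  return example_cost.unpredictable;
}
```

Only the (speed_p, !predictable_p) combination selects the
unpredictable cost; the other three combinations use the predictable
one.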

Tested aarch64-none-linux-gnu with gcc-check.

Ok for trunk?
Matthew

2015-05-21  Matthew Wahab  

* gcc/config/aarch64-protos.h (struct cpu_branch_cost): New.
(tune_params): Add field branch_costs.
(aarch64_branch_cost): Declare.
* gcc/config/aarch64.c (generic_branch_cost): New.
(generic_tunings): Set field cpu_branch_cost to generic_branch_cost.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(thunderx_tunings): Likewise.
(xgene1_tunings): Likewise.
(aarch64_branch_cost): Define.
* gcc/config/aarch64/aarch64.h (BRANCH_COST): Redefine.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 8676c5c..77b01fa 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -162,12 +162,20 @@ struct cpu_vector_cost
   const int cond_not_taken_branch_cost;  /* Cost of not taken branch.  */
 };
 
+/* Branch costs.  */
+struct cpu_branch_cost
+{
+  const int predictable;/* Predictable branch or optimizing for size.  */
+  const int unpredictable;  /* Unpredictable branch or optimizing for speed.  */
+};
+
 struct tune_params
 {
   const struct cpu_cost_table *const insn_extra_cost;
   const struct cpu_addrcost_table *const addr_cost;
   const struct cpu_regmove_cost *const regmove_cost;
   const struct cpu_vector_cost *const vec_costs;
+  const struct cpu_branch_cost *const branch_costs;
   const int memmov_cost;
   const int issue_rate;
   const unsigned int fuseable_ops;
@@ -259,6 +267,8 @@ void aarch64_print_operand (FILE *, rtx, char);
 void aarch64_print_operand_address (FILE *, rtx);
 void aarch64_emit_call_insn (rtx);
 
+int aarch64_branch_cost (bool, bool);
+
 /* Initialize builtins for SIMD intrinsics.  */
 void init_aarch64_simd_builtins (void);
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 77a641e..a020316 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -339,12 +339,20 @@ static const struct cpu_vector_cost xgene1_vector_cost =
 #define AARCH64_FUSE_ADRP_LDR	(1 << 3)
 #define AARCH64_FUSE_CMP_BRANCH	(1 << 4)
 
+/* Generic costs for branch instructions.  */
+static const struct cpu_branch_cost generic_branch_cost =
+{
+  2,  /* Predictable.  */
+  2   /* Unpredictable.  */
+};
+
 static const struct tune_params generic_tunings =
 {
   &cortexa57_extra_costs,
   &generic_addrcost_table,
   &generic_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -362,6 +370,7 @@ static const struct tune_params cortexa53_tunings =
   &generic_addrcost_table,
   &cortexa53_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   2, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -380,6 +389,7 @@ static const struct tune_params cortexa57_tunings =
   &cortexa57_addrcost_table,
   &cortexa57_regmove_cost,
   &cortexa57_vector_cost,
+  &generic_branch_cost,
   4, /* memmov_cost  */
   3, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
@@ -398,6 +408,7 @@ static const struct tune_params thunderx_tunings =
   &generic_addrcost_table,
   &thunderx_regmove_cost,
   &generic_vector_cost,
+  &generic_branch_cost,
   6, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_CMP_BRANCH, /* fuseable_ops  */
@@ -415,6 +426,7 @@ static const struct tune_params xgene1_tunings =
   &xgene1_addrcost_table,
   &xgene1_regmove_cost,
   &xgene1_vector_cost,
+  &generic_branch_cost,
   6, /* memmov_cost  */
   4, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fuseable_ops  */
@@ -5361,6 +5373,19 @@ aarch64_address_cost (rtx x,
   return cost;
 }
 
+int
+aarch64_branch_cost (bool speed_p, bool predictable_p)
+{
+  /* When optimizing for speed, use the cost of unpredictable branches.  */
+  const struct cpu_branch_cost *branch_costs =
+aarch64_tune_params->branch_costs;
+
+  if (!speed_p || predictable_p)
+return branch_costs->predictable;
+  else
+return branch_costs->unpredictable;
+}
+
 /* Return true if the RTX X in mode MODE is a zero or sign extract
usable in an ADD or SUB (extended register) instruction.  */
 static bool
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index bf59e40..93a32f5 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -823,7 +823,8 @@ do {	 \
 #define TRAMPOLINE_SECTION text_section
 
 /* To start with.  */
-#define BRANCH_COST(SPEED_P, PREDICTABLE_P) 2
+#d

RE: [PATCH 6/13] mips musl support

2015-04-21 Thread Matthew Fortune
Szabolcs Nagy  writes:
> Set up dynamic linker name for mips.
> 
> gcc/Changelog:
> 
> 2015-04-16  Gregor Richards  
> 
>   * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.

I understand that mips musl is currently o32 only; is that correct?
There does however appear to be both soft and hard float variants
listed in the musl docs. Do you plan on using the same dynamic linker
name for both float variants? No problem if so but someone must have
decided to have unique names for big and little endian so I thought
it worth checking.

Also, are you aware of the two NaN encoding formats that MIPS has
and the support present in glibc's dynamic linker to deal with it?

I wonder if it would be wise to refuse to target musl unless the
ABI is known to be supported so that we can avoid compatibility
issues when different ABI variants are added in musl.

Thanks,
Matthew


Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to not break configs other than the
one being built, and moves things slightly closer to using target hooks for
everything.

each commit bootstrapped and regtested on x86_64-linux-gnu without regression,
and whole patch set run through config-list.mk without issue, ok?

So I think after looking at this patchset, any changes of a similar
nature you want to make should be considered pre-approved.  Just post 
them for archival purposes, but no need for you to wait for review as 
long as they have the same purpose and overall structure as was seen in 
these patches.


jeff



Re: [PATCH 10/12] remove more ifdefs for HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* combine.c (can_combine_p): Likewise.
(try_combine): Likewise.
(distribute_notes): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* reorg.c (try_merge_delay_insns): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.

OK.

Jeff



Re: [PATCH 05/12] make some HAVE_cc0 code always compiled

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0)
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (insn_a_feeds_b): Likewise.
(find_split_point): Likewise.
(simplify_set): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* postreload.c (reload_combine_recognize_const_pattern): Likewise.
* reload.c (find_reloads): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.

OK.  This is what I expected to see a lot of :-0

jeff



Re: [PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* genconfig.c (main): Always define HAVE_cc0.
* caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* cfgrtl.c (rtl_merge_blocks): Likewise.
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (do_SUBST_MODE): Likewise.
(insn_a_feeds_b): Likewise.
(combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
(find_split_point): Likewise.
(subst): Likewise.
(simplify_set): Likewise.
(distribute_notes): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
(final_scan_insn): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* optabs.c (prepare_cmp_insn): Likewise.
* postreload.c (reload_combine_recognize_const_pattern): Likewise.
* reload.c (find_reloads): Likewise.
(find_reloads_address_1): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
(relax_delay_slots): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.

Doesn't go as far as I'd like, but it's still an improvement.

OK.

jeff



Re: [PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
that is trivially dead on non cc0 targets.
(simplify_set): Likewise.
(mark_used_regs_combine): Likewise.
* cse.c (new_basic_block): Likewise.
(fold_rtx): Likewise.
(cse_insn): Likewise.
(cse_extended_basic_block): Likewise.
(set_live_p): Likewise.
* rtlanal.c (canonicalize_condition): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.

OK.  I find myself wondering if the conditionals should look like
if (HAVE_cc0
&& (whatever))

But I doubt it makes any measurable difference.  It's something we can 
always add in the future if we feel the need to avoid the runtime checks 
for things that aren't ever going to happen on most modern targets.
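
For anyone reading along, a minimal sketch of the conditional shape being discussed, assuming HAVE_cc0 is always defined as 0 or 1 (which is what patch 04/12 arranges). The helpers references_cc0 and pick_insertion_point are made up for illustration and are not GCC functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the build always defines HAVE_cc0 as 0 or 1.  Default to a
   non-cc0 target here.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Hypothetical stand-in for a check such as reg_mentioned_p (cc0_rtx, insn).
   It counts its calls so the short-circuit is observable.  */
static int cc0_checks;

static bool
references_cc0 (int insn)
{
  cc0_checks++;
  return insn % 2 == 0;
}

/* Return the insn number before which new code may be placed.  Because
   HAVE_cc0 is the first operand of &&, the cc0 path is still parsed and
   type-checked on every target, but it is never evaluated when HAVE_cc0
   is 0.  */
static int
pick_insertion_point (int insn)
{
  assert (insn > 0);
  if (HAVE_cc0 && references_cc0 (insn))
    return insn - 1;
  return insn;
}
```

With the default of 0 the call to references_cc0 is never made; building with -DHAVE_cc0=1 exercises the other path without touching the source.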


jeff



Re: [PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* conditions.h: Define macros even if HAVE_cc0 is undefined.
* emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
* final.c: Likewise.
* jump.c: Likewise.
* recog.c: Likewise.
* recog.h: Declare functions even when HAVE_cc0 is undefined.
* sched-deps.c (sched_analyze_2): Always compile case for cc0.

OK.  Note for anyone else reading at home, some of the functions being
unconditionally compiled now already had unconditional prototypes in the 
header files. So not everything needed a .h file change.


jeff



Re: [PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to not break configs other than the
one being built, and moves things slightly closer to using target hooks for
everything.

each commit bootstrapped and regtested on x86_64-linux-gnu without regression,
and whole patch set run through config-list.mk without issue, ok?

Thanks for tackling this.  It's not particularly deep work, but I do think
it'll help reduce the long term maintenance costs and make developers'
lives easier.


Onward to the HAVE_cc0 patches :-)

Jeff

ps.  You hit a good window, my daughter was up late last night and
is sleeping in a bit, so I've got unexpected time this morning before my 
meetings.




Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h: New definition of EH_RETURN_DATA_REGNO.
* except.c: Remove definition of EH_RETURN_DATA_REGNO.
* builtins.c (expand_builtin): Remove check if
EH_RETURN_DATA_REGNO is defined.
* df-scan.c (df_bb_refs_collect): Likewise.
(df_get_exit_block_use_set): Likewise.
* haifa-sched.c (initiate_bb_reg_pressure_info): Likewise.
* ira-lives.c (process_bb_node_lives): Likewise.
* lra-lives.c (process_bb_lives): Likewise.

This one wasn't as obvious as the others, but is clearly OK once the
full loops being guarded by EH_RETURN_DATA_REGNO are examined.
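
A rough sketch of why the guarded loops stay correct, assuming the new default expands to INVALID_REGNUM for every index (the ChangeLog only says "New definition", so that value is an assumption here). The helper count_eh_data_regs is hypothetical and only mimics the shape of such a loop:

```c
#include <assert.h>

/* Assumption: mirrors GCC's sentinel for "no such register".  */
#define INVALID_REGNUM (~(unsigned int) 0)

/* Assumed shape of the default: a target that does not define the macro
   reports no EH data registers at all.  */
#ifndef EH_RETURN_DATA_REGNO
#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
#endif

/* Hypothetical helper: walk the EH data registers the way the formerly
   guarded loops do, stopping at the first INVALID_REGNUM, and return how
   many were seen.  */
static unsigned int
count_eh_data_regs (unsigned int limit)
{
  unsigned int i;

  assert (limit > 0);
  for (i = 0; i < limit; i++)
    if (EH_RETURN_DATA_REGNO (i) == INVALID_REGNUM)
      break;
  return i;
}
```

Under that assumption the loop exits on its first iteration for targets without the macro, so compiling it unconditionally changes nothing for them.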


Jeff



Re: [PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* alias.c (init_alias_target): Remove ifdef
HARD_FRAME_POINTER_IS_FRAME_POINTER.
* df-scan.c (df_insn_refs_collect): Likewise.
(df_get_regular_block_artificial_uses): Likewise.
(df_get_eh_block_artificial_uses): Likewise.
(df_get_entry_block_def_set): Likewise.
(df_get_exit_block_use_set): Likewise.
* emit-rtl.c (gen_rtx_REG): Likewise.
* ira.c (ira_setup_eliminable_regset): Likewise.
* reginfo.c (init_reg_sets_1): Likewise.
* regrename.c (rename_chains): Likewise.
* reload1.c (reload): Likewise.
(eliminate_regs_in_insn): Likewise.
* resource.c (mark_referenced_resources): Likewise.
(init_resource_info): Likewise.

OK.
jeff



Re: [PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* df-scan.c (df_get_entry_block_def_set): Remove #ifdef
PIC_OFFSET_TABLE_REGNUM.

OK.
jeff



[PATCH, i386]: Some spring cleaning in i386.h

2015-04-21 Thread Uros Bizjak
Hello!

This patch redefines various hard register numbers with ones from
i386.md. Also, the patch reshuffles some defines to group them
together in a better way.

No functional changes.

2015-04-21  Uros Bizjak  

* config/i386/i386.md (ARGP_REG, FRAME_REG, BND2_REG, BND3_REG,
FIRST_PSEUDO_REG): New.
* config/i386/i386.h (STACK_POINTER_REGNUM): Define to SP_REG.
(ARG_POINTER_REGNUM): Define to ARGP_REG.
(FRAME_POINTER_REGNUM): Define to FRAME_REG.
(HARD_FRAME_POINTER_REGNUM): Define to BP_REG.
(FIRST_PSEUDO_REGISTER): Define to FIRST_PSEUDO_REG.
(FIRST_INT_REG): New.
(LAST_INT_REG): New.
(FIRST_*_REG): Define using *_REG.
(LAST_*_REG): Ditto.
(QI_REGNO_P): Define using FIRST_QI_REG and LAST_QI_REG.
(LEGACY_INT_REGNO_P): Define using FIRST_INT_REG and LAST_INT_REG.
(FIRST_FLOAT_REG): Define to FIRST_STACK_REG.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32},
committed to mainline SVN.

Uros.
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 57)
+++ config/i386/i386.h  (working copy)
@@ -957,7 +957,7 @@ extern const char *host_detect_local_cpu (int argc
eliminated during reloading in favor of either the stack or frame
pointer.  */
 
-#define FIRST_PSEUDO_REGISTER 81
+#define FIRST_PSEUDO_REGISTER FIRST_PSEUDO_REG
 
 /* Number of hardware registers that go into the DWARF-2 unwind info.
If not defined, equals FIRST_PSEUDO_REGISTER.  */
@@ -1100,7 +1100,7 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V16SImode || (MODE) == V16SFmode || (MODE) == V32HImode \
|| (MODE) == V4TImode)
 
-#define VALID_AVX512VL_128_REG_MODE(MODE)  
\
+#define VALID_AVX512VL_128_REG_MODE(MODE)  \
   ((MODE) == V2DImode || (MODE) == V2DFmode || (MODE) == V16QImode \
|| (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode)
 
@@ -1121,6 +1121,10 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V2SImode || (MODE) == SImode   \
|| (MODE) == V4HImode || (MODE) == V8QImode)
 
+#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode)
+
+#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode)
+
 #define VALID_BND_REG_MODE(MODE) \
   (TARGET_64BIT ? (MODE) == BND64mode : (MODE) == BND32mode)
 
@@ -1150,10 +1154,16 @@ extern const char *host_detect_local_cpu (int argc
|| (MODE) == V16SImode || (MODE) == V32HImode || (MODE) == V8DFmode \
|| (MODE) == V16SFmode)
 
-#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode)
+#define X87_FLOAT_MODE_P(MODE) \
+  (TARGET_80387 && ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode))
 
-#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode)
+#define SSE_FLOAT_MODE_P(MODE) \
+  ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode))
 
+#define FMA4_VEC_FLOAT_MODE_P(MODE) \
+  (TARGET_FMA4 && ((MODE) == V4SFmode || (MODE) == V2DFmode \
+ || (MODE) == V8SFmode || (MODE) == V4DFmode))
+
 /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE.  */
 
 #define HARD_REGNO_MODE_OK(REGNO, MODE)\
@@ -1198,42 +1208,46 @@ extern const char *host_detect_local_cpu (int argc
register.  The ordinary mov instructions won't work */
 /* #define PC_REGNUM  */
 
+/* Base register for access to arguments of the function.  */
+#define ARG_POINTER_REGNUM ARGP_REG
+
 /* Register to use for pushing function arguments.  */
-#define STACK_POINTER_REGNUM 7
+#define STACK_POINTER_REGNUM SP_REG
 
 /* Base register for access to local variables of the function.  */
-#define HARD_FRAME_POINTER_REGNUM 6
+#define FRAME_POINTER_REGNUM FRAME_REG
+#define HARD_FRAME_POINTER_REGNUM BP_REG
 
-/* Base register for access to local variables of the function.  */
-#define FRAME_POINTER_REGNUM 20
+#define FIRST_INT_REG AX_REG
+#define LAST_INT_REG  SP_REG
 
-/* First floating point reg */
-#define FIRST_FLOAT_REG 8
+#define FIRST_QI_REG AX_REG
+#define LAST_QI_REG  BX_REG
 
 /* First & last stack-like regs */
-#define FIRST_STACK_REG FIRST_FLOAT_REG
-#define LAST_STACK_REG (FIRST_FLOAT_REG + 7)
+#define FIRST_STACK_REG ST0_REG
+#define LAST_STACK_REG  ST7_REG
 
-#define FIRST_SSE_REG (FRAME_POINTER_REGNUM + 1)
-#define LAST_SSE_REG  (FIRST_SSE_REG + 7)
+#define FIRST_SSE_REG XMM0_REG
+#define LAST_SSE_REG  XMM7_REG
 
-#define FIRST_MMX_REG  (LAST_SSE_REG + 1)   /*29*/
-#define LAST_MMX_REG   (FIRST_MMX_REG + 7)
+#define FIRST_MMX_REG  MM0_REG
+#define LAST_MMX_REG   MM7_REG
 
-#define FIRST_REX_INT_REG  (LAST_MMX_REG + 1) /*37*/
-#define LAST_REX_INT_REG   (FIRST_REX_INT_REG + 7)
+#define FIRST_REX_INT_REG  R8_REG
+#define LAST_REX_INT_REG   R15_REG
 
-#define FIRST_REX_SSE_REG  (LAST_REX_INT_REG + 1) /*45*/
-#define LAST_REX_SSE_REG   (FIRST_REX_SSE_REG + 7)
+#define FIRST_R

Re: [PATCH 07/12] provide default for MASK_RETURN_ADDR

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (MASK_RETURN_ADDR): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
MASK_RETURN_ADDR.

OK.
jeff



Re: [PATCH 06/12] provide default for RETURN_ADDR_OFFSET

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (RETURN_ADDR_OFFSET): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
RETURN_ADDR_OFFSET.
(expand_builtin_frob_return_addr): Likewise.

OK.
jeff



Re: [PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef
INSN_REFERENCES_ARE_DELAYED.
* resource.c (mark_referenced_resources): Likewise.

OK.
Jeff



Re: [PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED

2015-04-21 Thread Jeff Law

On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_SETS_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED.
* resource.c (mark_set_resources): Likewise.

OK.
Jeff



[PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_SETS_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED.
* resource.c (mark_set_resources): Likewise.
---
 gcc/defaults.h | 4 
 gcc/reorg.c| 4 
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 843d7e2..79cb599 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1201,6 +1201,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define DEFAULT_PCC_STRUCT_RETURN 1
 #endif
 
+#ifndef INSN_SETS_ARE_DELAYED
+#define INSN_SETS_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index b7228f2..ae77f0a 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1555,10 +1555,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
 slots because it is difficult to track its resource needs
 correctly.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
  if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
@@ -1657,10 +1655,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  /* If this is an INSN or JUMP_INSN with delayed effects, it
 is hard to track the resource needs properly, so give up.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
  if (INSN_SETS_ARE_DELAYED (control))
return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (control))
diff --git a/gcc/resource.c b/gcc/resource.c
index 9a013b3..5af9376 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -696,11 +696,9 @@ mark_set_resources (rtx x, struct resources *res, int in_dest,
/* An insn consisting of just a CLOBBER (or USE) is just for flow
   and doesn't actually do anything, so we ignore it.  */
 
-#ifdef INSN_SETS_ARE_DELAYED
   if (mark_type != MARK_SRC_DEST_CALL
  && INSN_SETS_ARE_DELAYED (as_a <rtx_insn *> (x)))
return;
-#endif
 
   x = PATTERN (x);
   if (GET_CODE (x) != USE && GET_CODE (x) != CLOBBER)
-- 
2.3.0.80.g18d0fec.dirty
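
The effect of the defaults.h hunk in this patch can be sketched in isolation as follows. This is only an illustration: can_track_resources is a hypothetical caller loosely modelled on the early exits in redundant_insn, and insns are plain ints rather than rtx_insn pointers.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the defaults.h hunk: a target that cares defines the macro
   first; every other target gets a constant false.  */
#ifndef INSN_SETS_ARE_DELAYED
#define INSN_SETS_ARE_DELAYED(INSN) false
#endif

/* Hypothetical caller: return 0 ("give up") for insns with delayed sets,
   1 otherwise.  No #ifdef is needed around the test, and on targets using
   the default the branch is trivially dead.  */
static int
can_track_resources (int insn)
{
  assert (insn >= 0);
  if (INSN_SETS_ARE_DELAYED (insn))
    return 0;
  return 1;
}
```

The point of the design is that the call sites are always compiled, so a typo in them shows up on every host rather than only on the few targets that define the macro.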



Re: [AArch64][PR65139] use clobber with match_scratch for aarch64_lshr_sisd_or_int_3

2015-04-21 Thread Richard Earnshaw
On 18/04/15 19:17, Maxim Kuvyrkov wrote:
>> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw 
>>  wrote:
>>
>> On 18/04/15 16:13, Jakub Jelinek wrote:
>>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>>> You need to ensure that your scratch register cannot overlap op1, since
>>>> the scratch is written before op1 is read.
>>>
>>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>>>
>>> incremental diff should ensure that, right?
>>>
>>> Jakub
>>>
>>
>>
>> Sorry, where in the patch is that hunk?
>>
>> I see just:
>>
>> +   (clobber (match_scratch:QI 3 "=X,w,X"))]
> 
> Jakub's suggestion is an incremental patch on top of Kugan's.
> 

Ah, sorry, I thought he was implying it was already in the patch somewhere.

>>
>> And why would early clobbering the scratch be notably better than the
>> original?
>>
> 
> It will still be better.  With this patch we want to allow RA freedom to 
> optimally handle both of the following cases:
> 
> 1. operand[1] dies after the instruction.  In this case we want operand[0] 
> and operand[1] to be assigned to the same reg, and operand[3] to be assigned 
> to a different register to provide a temporary.  In this case we don't care 
> whether operand[3] is early-clobber or not.  This case is not optimally 
> handled with current insn patterns.
> 
> 2. operand[1] lives on after the instruction.  In this case we want 
> operand[0] and operand[3] to be assigned to the same reg, and not clobber 
> operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is 
> in a different register from what operand[0] and operand[3] were assigned to. 
>  This case should be handled equally well before and after the patch.
> 
> My understanding is that Kugan's patch with Jakub's fix on top satisfies both 
> of these cases.
>  

I still don't think it handles all cases efficiently.  If we really want
the result in a different register from both of the inputs, then we now
need two registers: one for the result and another for the temporary.
In that case we could have used the result register as the scratch, but
now we can't.

Maybe we can provide two alternatives, one that early-clobbers the
result register but doesn't need a scratch and one that doesn't
early-clobber the result, but does need a scratch.

So something like

(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
  [(set (match_operand:GPI 0 "register_operand" "=w,&w,w,r")
 (lshiftrt:GPI
   (match_operand:GPI 1 "register_operand" "w,w,w,r")
   (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>"
  "Us,w,w,rUs")))
   (clobber (match_scratch:QI 3 "=X,X,w,X"))]

... but I haven't tested any of that.
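
To make the overlap hazard concrete, here is a toy C model of a shift done in two steps through a scratch register. It is not GCC code and not the actual split sequence; the register file array and the name lshr_via_scratch are invented for illustration, and the only property it is meant to show is that the scratch is written before the source is read.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: regs[] is a small register file, and dst, src, amt and scratch
   are register numbers.  The shift amount is negated into the scratch first,
   then the source is shifted right by the (re-negated) scratch value.  */
static uint32_t
lshr_via_scratch (uint32_t regs[], int dst, int src, int amt, int scratch)
{
  assert (regs[amt] < 32);
  /* Step 1: write the scratch.  The source has not been read yet.  */
  regs[scratch] = 0u - regs[amt];
  /* Step 2: only now read the source.  */
  regs[dst] = regs[src] >> (0u - regs[scratch]);
  return regs[dst];
}
```

In this model, giving the scratch the same register as the destination is harmless, while giving it the same register as the source destroys the input before it is read. That is the property an early-clobber on the scratch is meant to enforce.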

I would also note the conversation in
https://gcc.gnu.org/ml/gcc/2015-04/msg00240.html.  That seems to suggest
we should be wary of using scratch sequences since the register
allocator doesn't account for them properly.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
> 



[PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition.
* reorg.c (redundant_insn): Remove ifdef
INSN_REFERENCES_ARE_DELAYED.
* resource.c (mark_referenced_resources): Likewise.
---
 gcc/defaults.h | 4 
 gcc/reorg.c| 4 
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 79cb599..cafcb1e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1205,6 +1205,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define INSN_SETS_ARE_DELAYED(INSN) false
 #endif
 
+#ifndef INSN_REFERENCES_ARE_DELAYED
+#define INSN_REFERENCES_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index ae77f0a..d8d8ab69 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1558,10 +1558,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
return 0;
-#endif
 
  /* See if any of the insns in the delay slot match, updating
 resource requirements as we go.  */
@@ -1658,10 +1656,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
  if (INSN_SETS_ARE_DELAYED (control))
return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
  if (INSN_REFERENCES_ARE_DELAYED (control))
return 0;
-#endif
 
  if (JUMP_P (control))
annul_p = INSN_ANNULLED_BRANCH_P (control);
diff --git a/gcc/resource.c b/gcc/resource.c
index 5af9376..26d9fca 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -392,11 +392,9 @@ mark_referenced_resources (rtx x, struct resources *res,
  include_delayed_effects
  ? MARK_SRC_DEST_CALL : MARK_SRC_DEST);
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
   if (! include_delayed_effects
  && INSN_REFERENCES_ARE_DELAYED (as_a <rtx_insn *> (x)))
return;
-#endif
 
   /* No special processing, just speed up.  */
   mark_referenced_resources (PATTERN (x), res, include_delayed_effects);
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* df-scan.c (df_get_entry_block_def_set): Remove #ifdef
PIC_OFFSET_TABLE_REGNUM.
---
 gcc/df-scan.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 69332a8..4232ec8 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3589,10 +3589,6 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
   /* These registers are live everywhere.  */
   if (!reload_completed)
 {
-#ifdef PIC_OFFSET_TABLE_REGNUM
-  unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
-#endif
-
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   /* Pseudos with argument area equivalences may require
 reloading via the argument pointer.  */
@@ -3600,13 +3596,12 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
bitmap_set_bit (entry_block_defs, ARG_POINTER_REGNUM);
 #endif
 
-#ifdef PIC_OFFSET_TABLE_REGNUM
   /* Any constant, or pseudo with constant equivalences, may
 require reloading from memory using the pic register.  */
+  unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
   if (picreg != INVALID_REGNUM
  && fixed_regs[picreg])
bitmap_set_bit (entry_block_defs, picreg);
-#endif
 }
 
 #ifdef INCOMING_RETURN_ADDR_RTX
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 10/12] remove more ifdefs for HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* combine.c (can_combine_p): Likewise.
(try_combine): Likewise.
(distribute_notes): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* reorg.c (try_merge_delay_insns): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
---
 gcc/caller-save.c |  4 +---
 gcc/cfgcleanup.c  | 26 --
 gcc/combine.c | 54 +-
 gcc/df-problems.c |  5 +
 gcc/final.c   | 29 ++---
 gcc/gcse.c| 24 +---
 gcc/ira.c |  5 +
 gcc/reorg.c   | 26 +++---
 gcc/sched-deps.c  |  6 +++---
 9 files changed, 69 insertions(+), 110 deletions(-)

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index fc575eb..76c3a7e 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -1400,18 +1400,16 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat)
   rtx_insn *insn = chain->insn;
   struct insn_chain *new_chain;
 
-#if HAVE_cc0
   /* If INSN references CC0, put our insns in front of the insn that sets
  CC0.  This is always safe, since the only way we could be passed an
  insn that references CC0 is for a restore, and doing a restore earlier
  isn't a problem.  We do, however, assume here that CALL_INSNs don't
  reference CC0.  Guard against non-INSN's like CODE_LABEL.  */
 
-  if ((NONJUMP_INSN_P (insn) || JUMP_P (insn))
+  if (HAVE_cc0 && (NONJUMP_INSN_P (insn) || JUMP_P (insn))
   && before_p
   && reg_referenced_p (cc0_rtx, PATTERN (insn)))
 chain = chain->prev, insn = chain->insn;
-#endif
 
   new_chain = new_insn_chain ();
   if (before_p)
diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 58d235e..e5c4747 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -1416,12 +1416,11 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1,
   i2 = PREV_INSN (i2);
 }
 
-#if HAVE_cc0
   /* Don't allow the insn after a compare to be shared by
  cross-jumping unless the compare is also shared.  */
-  if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1))
+  if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1)
+  && ! sets_cc0_p (last1))
 last1 = afterlast1, last2 = afterlast2, last_dir = afterlast_dir, ninsns--;
-#endif
 
   /* Include preceding notes and labels in the cross-jump.  One,
  this may bring us to the head of the blocks as requested above.
@@ -1539,12 +1538,11 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f
   i2 = NEXT_INSN (i2);
 }
 
-#if HAVE_cc0
   /* Don't allow a compare to be shared by cross-jumping unless the insn
  after the compare is also shared.  */
-  if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1))
+  if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1)
+  && sets_cc0_p (last1))
 last1 = beforelast1, last2 = beforelast2, ninsns--;
-#endif
 
   if (ninsns)
 {
@@ -2330,11 +2328,9 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
 {
-#if HAVE_cc0
-  if (reg_mentioned_p (cc0_rtx, jump))
+  if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
   else
-#endif
move_before = jump;
 }
 
@@ -2499,11 +2495,9 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
{
-#if HAVE_cc0
- if (reg_mentioned_p (cc0_rtx, jump))
+ if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
  else
-#endif
move_before = jump;
}
 }
@@ -2522,12 +2516,10 @@ try_head_merge_bb (basic_block bb)
  /* Try again, using a different insertion point.  */
  move_before = jump;
 
-#if HAVE_cc0
  /* Don't try moving before a cc0 user, as that may invalidate
 the cc0.  */
- if (reg_mentioned_p (cc0_rtx, jump))
+ if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump))
break;
-#endif
 
  continue;
}
@@ -2582,12 +2574,10 @@ try_head_merge_bb (basic_block bb)
  /* For the unmerged insns, try a different insertion point.  */
  move_before = jump;
 
-#if HAVE_cc0
  /* Don't try moving before a cc0 user,

[PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* alias.c (init_alias_target): Remove ifdef
HARD_FRAME_POINTER_IS_FRAME_POINTER.
* df-scan.c (df_insn_refs_collect): Likewise.
(df_get_regular_block_artificial_uses): Likewise.
(df_get_eh_block_artificial_uses): Likewise.
(df_get_entry_block_def_set): Likewise.
(df_get_exit_block_use_set): Likewise.
* emit-rtl.c (gen_rtx_REG): Likewise.
* ira.c (ira_setup_eliminable_regset): Likewise.
* reginfo.c (init_reg_sets_1): Likewise.
* regrename.c (rename_chains): Likewise.
* reload1.c (reload): Likewise.
(eliminate_regs_in_insn): Likewise.
* resource.c (mark_referenced_resources): Likewise.
(init_resource_info): Likewise.
---
 gcc/alias.c |  7 +++
 gcc/df-scan.c   | 35 +--
 gcc/emit-rtl.c  |  6 +++---
 gcc/ira.c   | 23 ---
 gcc/reginfo.c   |  5 ++---
 gcc/regrename.c |  5 ++---
 gcc/reload1.c   | 10 --
 gcc/resource.c  | 11 +--
 8 files changed, 48 insertions(+), 54 deletions(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index a7160f3..8f48660 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2765,10 +2765,9 @@ init_alias_target (void)
 = unique_base_value (UNIQUE_BASE_VALUE_ARGP);
   static_reg_base_value[FRAME_POINTER_REGNUM]
 = unique_base_value (UNIQUE_BASE_VALUE_FP);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  static_reg_base_value[HARD_FRAME_POINTER_REGNUM]
-= unique_base_value (UNIQUE_BASE_VALUE_HFP);
-#endif
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+static_reg_base_value[HARD_FRAME_POINTER_REGNUM]
+  = unique_base_value (UNIQUE_BASE_VALUE_HFP);
 }
 
 /* Set MEMORY_MODIFIED when X modifies DATA (that is assumed
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index b2e2e5d..69332a8 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3247,12 +3247,11 @@ df_insn_refs_collect (struct df_collection_rec *collection_rec,
  regno_reg_rtx[FRAME_POINTER_REGNUM],
  NULL, bb, insn_info,
  DF_REF_REG_USE, 0);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  df_ref_record (DF_REF_BASE, collection_rec,
- regno_reg_rtx[HARD_FRAME_POINTER_REGNUM],
- NULL, bb, insn_info,
- DF_REF_REG_USE, 0);
-#endif
+ if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   df_ref_record (DF_REF_BASE, collection_rec,
+  regno_reg_rtx[HARD_FRAME_POINTER_REGNUM],
+  NULL, bb, insn_info,
+  DF_REF_REG_USE, 0);
   break;
 default:
   break;
@@ -3442,9 +3441,9 @@ df_get_regular_block_artificial_uses (bitmap regular_block_artificial_uses)
 reference of the frame pointer.  */
   bitmap_set_bit (regular_block_artificial_uses, FRAME_POINTER_REGNUM);
 
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  bitmap_set_bit (regular_block_artificial_uses, HARD_FRAME_POINTER_REGNUM);
-#endif
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   bitmap_set_bit (regular_block_artificial_uses,
+   HARD_FRAME_POINTER_REGNUM);
 
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   /* Pseudos with argument area equivalences may require
@@ -3494,9 +3493,9 @@ df_get_eh_block_artificial_uses (bitmap eh_block_artificial_uses)
   if (frame_pointer_needed)
{
  bitmap_set_bit (eh_block_artificial_uses, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
- bitmap_set_bit (eh_block_artificial_uses, HARD_FRAME_POINTER_REGNUM);
-#endif
+ if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   bitmap_set_bit (eh_block_artificial_uses,
+   HARD_FRAME_POINTER_REGNUM);
}
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   if (fixed_regs[ARG_POINTER_REGNUM])
@@ -3580,11 +3579,11 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
   /* Any reference to any pseudo before reload is a potential
 reference of the frame pointer.  */
   bitmap_set_bit (entry_block_defs, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+
   /* If they are different, also mark the hard frame pointer as live.  */
-  if (!LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM))
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER
+ && !LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM))
bitmap_set_bit (entry_block_defs, HARD_FRAME_POINTER_REGNUM);
-#endif
 }
 
   /* These registers are live everywhere.  */
@@ -3718,11 +3717,11 @@ df_get_exit_block_use_set (bitmap exit_block_uses)
   if ((!reload_completed) || frame_pointer_needed)
 {
   bitmap_set_bit (exit_block_uses, FRAME_POINTER_REGNUM);
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+
   /* If they are different, also mark the hard frame pointer as live.  */
-  if (

[PATCH 07/12] provide default for MASK_RETURN_ADDR

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (MASK_RETURN_ADDR): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
MASK_RETURN_ADDR.
---
 gcc/defaults.h | 4 
 gcc/except.c   | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)
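
A small stand-alone sketch of the default-plus-runtime-test pattern used below.  It is not the GCC code: the mask is modelled as a pointer to an integer rather than an rtx, NULL stands in for NULL_RTX, and mask_return_addr is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Default used when the target does not provide a mask.  NULL plays
   the role that NULL_RTX plays in the patch.  */
#ifndef MASK_RETURN_ADDR
#define MASK_RETURN_ADDR NULL
#endif

/* Hypothetical helper: strip unwanted bits from ADDR if the target
   supplied a mask, otherwise return ADDR unchanged.  */
uintptr_t
mask_return_addr (uintptr_t addr)
{
  /* Read the macro once into a local, since a target may define it
     as an arbitrary expression.  */
  const uintptr_t *mask = MASK_RETURN_ADDR;

  if (mask)
    addr &= *mask;

  return addr;
}
```

With the default in place the function is the identity, and callers no longer need their own #ifdef around the masking step.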

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 767901a..843d7e2 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -388,6 +388,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define RETURN_ADDR_OFFSET 0
 #endif
 
+#ifndef MASK_RETURN_ADDR
+#define MASK_RETURN_ADDR NULL_RTX
+#endif
+
 /* If we have named section and we support weak symbols, then use the
.jcr section for recording java classes which need to be registered
at program start-up time.  */
diff --git a/gcc/except.c b/gcc/except.c
index c98163d..5b24006 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2184,9 +2184,9 @@ expand_builtin_extract_return_addr (tree addr_tree)
 }
 
   /* First mask out any unwanted bits.  */
-#ifdef MASK_RETURN_ADDR
-  expand_and (Pmode, addr, MASK_RETURN_ADDR, addr);
-#endif
+  rtx mask = MASK_RETURN_ADDR;
+  if (mask)
+expand_and (Pmode, addr, mask, addr);
 
   /* Then adjust to find the real return address.  */
   if (RETURN_ADDR_OFFSET)
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 06/12] provide default for RETURN_ADDR_OFFSET

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* defaults.h (RETURN_ADDR_OFFSET): New definition.
* except.c (expand_builtin_extract_return_addr): Remove ifdef
RETURN_ADDR_OFFSET.
(expand_builtin_frob_return_addr): Likewise.
---
 gcc/defaults.h |  5 +
 gcc/except.c   | 14 +++---
 2 files changed, 12 insertions(+), 7 deletions(-)
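
The effect of the default can be sketched outside GCC as follows.  This is an illustration only: addresses are plain long values instead of rtx, and the two function names are hypothetical stand-ins for the builtin expanders touched below.

```c
/* Default offset when the target does not define one.  */
#ifndef RETURN_ADDR_OFFSET
#define RETURN_ADDR_OFFSET 0
#endif

/* Hypothetical stand-in: adjust ADDR to the real return address.  */
long
extract_return_addr (long addr)
{
  if (RETURN_ADDR_OFFSET)
    addr += RETURN_ADDR_OFFSET;
  return addr;
}

/* Hypothetical stand-in: undo the adjustment made above.  */
long
frob_return_addr (long addr)
{
  if (RETURN_ADDR_OFFSET)
    addr -= RETURN_ADDR_OFFSET;
  return addr;
}
```

One detail visible in the diff: the old #ifdef form ran the adjustment whenever the macro was defined, even as 0, while the new form skips it for a zero offset.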

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 911c2f8..767901a 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -383,6 +383,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
 #endif
 
+/* Offset between the eh handler address and entry in eh tables.  */
+#ifndef RETURN_ADDR_OFFSET
+#define RETURN_ADDR_OFFSET 0
+#endif
+
 /* If we have named section and we support weak symbols, then use the
.jcr section for recording java classes which need to be registered
at program start-up time.  */
diff --git a/gcc/except.c b/gcc/except.c
index 7573c88..c98163d 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2189,9 +2189,8 @@ expand_builtin_extract_return_addr (tree addr_tree)
 #endif
 
   /* Then adjust to find the real return address.  */
-#if defined (RETURN_ADDR_OFFSET)
-  addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
 
   return addr;
 }
@@ -2207,10 +2206,11 @@ expand_builtin_frob_return_addr (tree addr_tree)
 
   addr = convert_memory_address (Pmode, addr);
 
-#ifdef RETURN_ADDR_OFFSET
-  addr = force_reg (Pmode, addr);
-  addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+{
+  addr = force_reg (Pmode, addr);
+  addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
+}
 
   return addr;
 }
-- 
2.3.0.80.g18d0fec.dirty



[PATCH 04/12] always define HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* genconfig.c (main): Always define HAVE_cc0.
* caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if
HAVE_cc0.
* cfgcleanup.c (flow_find_cross_jump): Likewise.
(flow_find_head_matching_sequence): Likewise.
(try_head_merge_bb): Likewise.
* cfgrtl.c (rtl_merge_blocks): Likewise.
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (do_SUBST_MODE): Likewise.
(insn_a_feeds_b): Likewise.
(combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
(find_split_point): Likewise.
(subst): Likewise.
(simplify_set): Likewise.
(distribute_notes): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* final.c (final): Likewise.
(final_scan_insn): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* gcse.c (insert_insn_end_basic_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* optabs.c (prepare_cmp_insn): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
(find_reloads_address_1): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
(relax_delay_slots): Likewise.
* sched-deps.c (sched_analyze_2): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.
---
 gcc/caller-save.c |  2 +-
 gcc/cfgcleanup.c  | 12 ++--
 gcc/cfgrtl.c  |  6 +++---
 gcc/combine.c | 36 ++--
 gcc/cprop.c   |  2 +-
 gcc/cse.c |  2 +-
 gcc/df-problems.c |  4 ++--
 gcc/final.c   | 14 +++---
 gcc/function.c|  2 +-
 gcc/gcse.c|  2 +-
 gcc/genconfig.c   |  1 +
 gcc/haifa-sched.c |  2 +-
 gcc/ira.c |  4 ++--
 gcc/loop-invariant.c  |  2 +-
 gcc/lra-constraints.c |  2 +-
 gcc/optabs.c  |  2 +-
 gcc/postreload.c  |  2 +-
 gcc/reload.c  |  6 +++---
 gcc/reorg.c   | 30 +++---
 gcc/sched-deps.c  |  2 +-
 gcc/sched-rgn.c   |  2 +-
 21 files changed, 69 insertions(+), 68 deletions(-)
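
Why every #ifdef has to become #if once the macro is always defined can be shown with a tiny sketch.  The value 0 below is an assumption about what a target without cc0 would get; the function names are hypothetical.

```c
/* Assumed value for a target without cc0, for this sketch only.  */
#define HAVE_cc0 0

/* Tests whether the macro is defined at all.  */
int
ifdef_sees_cc0 (void)
{
#ifdef HAVE_cc0
  return 1;	/* Taken: the macro is defined, even though it is 0.  */
#else
  return 0;
#endif
}

/* Tests the value of the macro.  */
int
if_sees_cc0 (void)
{
#if HAVE_cc0
  return 1;
#else
  return 0;	/* Taken: the value of the macro is 0.  */
#endif
}
```

Any #ifdef HAVE_cc0 left behind would therefore silently enable the cc0 code on every target, which is why the conversion below touches all of them in one commit.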

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index 3b01941..fc575eb 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -1400,7 +1400,7 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat)
   rtx_insn *insn = chain->insn;
   struct insn_chain *new_chain;
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* If INSN references CC0, put our insns in front of the insn that sets
  CC0.  This is always safe, since the only way we could be passed an
  insn that references CC0 is for a restore, and doing a restore earlier
diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index cee152e..58d235e 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -1416,7 +1416,7 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1,
   i2 = PREV_INSN (i2);
 }
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* Don't allow the insn after a compare to be shared by
  cross-jumping unless the compare is also shared.  */
   if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1))
@@ -1539,7 +1539,7 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f
   i2 = NEXT_INSN (i2);
 }
 
-#ifdef HAVE_cc0
+#if HAVE_cc0
   /* Don't allow a compare to be shared by cross-jumping unless the insn
  after the compare is also shared.  */
   if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1))
@@ -2330,7 +2330,7 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
 {
-#ifdef HAVE_cc0
+#if HAVE_cc0
   if (reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
   else
@@ -2499,7 +2499,7 @@ try_head_merge_bb (basic_block bb)
   cond = get_condition (jump, &move_before, true, false);
   if (cond == NULL_RTX)
{
-#ifdef HAVE_cc0
+#if HAVE_cc0
  if (reg_mentioned_p (cc0_rtx, jump))
move_before = prev_nonnote_nondebug_insn (jump);
  else
@@ -2522,7 +2522,7 @@ try_head_merge_bb

[PATCH 03/12] more removal of ifdef HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
that is trivially dead on non cc0 targets.
(simplify_set): Likewise.
(mark_used_regs_combine): Likewise.
* cse.c (new_basic_block): Likewise.
(fold_rtx): Likewise.
(cse_insn): Likewise.
(cse_extended_basic_block): Likewise.
(set_live_p): Likewise.
* rtlanal.c (canonicalize_condition): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.
---
 gcc/combine.c  |  6 --
 gcc/cse.c  | 18 --
 gcc/rtlanal.c  |  2 --
 gcc/simplify-rtx.c |  5 ++---
 4 files changed, 2 insertions(+), 29 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 46cd6db..0a35b8f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -686,7 +686,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
   rtx *result;
   struct insn_link *link;
 
-#ifdef HAVE_cc0
   if (dest == cc0_rtx)
 {
   next = NEXT_INSN (insn);
@@ -699,7 +698,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
*ploc = next;
   return result;
 }
-#endif
 
   if (!REG_P (dest))
 return 0;
@@ -6724,7 +6722,6 @@ simplify_set (rtx x)
   src = SET_SRC (x), dest = SET_DEST (x);
 }
 
-#ifdef HAVE_cc0
   /* If we have (set (cc0) (subreg ...)), we try to remove the subreg
  in SRC.  */
   if (dest == cc0_rtx
@@ -6744,7 +6741,6 @@ simplify_set (rtx x)
  src = SET_SRC (x);
}
 }
-#endif
 
 #ifdef LOAD_EXTEND_OP
   /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this
@@ -13193,11 +13189,9 @@ mark_used_regs_combine (rtx x)
 case ADDR_VEC:
 case ADDR_DIFF_VEC:
 case ASM_INPUT:
-#ifdef HAVE_cc0
 /* CC0 must die in the insn after it is set, so we don't need to take
special note of it here.  */
 case CC0:
-#endif
   return;
 
 case CLOBBER:
diff --git a/gcc/cse.c b/gcc/cse.c
index 2a33827..d184d27 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -281,7 +281,6 @@ struct qty_table_elem
 /* The table of all qtys, indexed by qty number.  */
 static struct qty_table_elem *qty_table;
 
-#ifdef HAVE_cc0
 /* For machines that have a CC0, we do not record its value in the hash
table since its use is guaranteed to be the insn immediately following
its definition and any other insn is presumed to invalidate it.
@@ -293,7 +292,6 @@ static struct qty_table_elem *qty_table;
 
 static rtx this_insn_cc0, prev_insn_cc0;
 static machine_mode this_insn_cc0_mode, prev_insn_cc0_mode;
-#endif
 
 /* Insn being scanned.  */
 
@@ -884,9 +882,7 @@ new_basic_block (void)
}
 }
 
-#ifdef HAVE_cc0
   prev_insn_cc0 = 0;
-#endif
 }
 
 /* Say that register REG contains a quantity in mode MODE not in any
@@ -3166,10 +3162,8 @@ fold_rtx (rtx x, rtx_insn *insn)
 case EXPR_LIST:
   return x;
 
-#ifdef HAVE_cc0
 case CC0:
   return prev_insn_cc0;
-#endif
 
 case ASM_OPERANDS:
   if (insn)
@@ -3223,7 +3217,6 @@ fold_rtx (rtx x, rtx_insn *insn)
const_arg = folded_arg;
break;
 
-#ifdef HAVE_cc0
  case CC0:
/* The cc0-user and cc0-setter may be in different blocks if
   the cc0-setter potentially traps.  In that case PREV_INSN_CC0
@@ -3247,7 +3240,6 @@ fold_rtx (rtx x, rtx_insn *insn)
const_arg = equiv_constant (folded_arg);
  }
break;
-#endif
 
  default:
folded_arg = fold_rtx (folded_arg, insn);
@@ -4522,11 +4514,9 @@ cse_insn (rtx_insn *insn)
 sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
 
   this_insn = insn;
-#ifdef HAVE_cc0
   /* Records what this insn does to set CC0.  */
   this_insn_cc0 = 0;
   this_insn_cc0_mode = VOIDmode;
-#endif
 
   /* Find all regs explicitly clobbered in this insn,
  to ensure they are not replaced with any other regs
@@ -5541,7 +5531,6 @@ cse_insn (rtx_insn *insn)
}
}
 
-#ifdef HAVE_cc0
   /* If setting CC0, record what it was set to, or a constant, if it
 is equivalent to a constant.  If it is being set to a floating-point
 value, make a COMPARE with the appropriate constant of 0.  If we
@@ -5556,7 +5545,6 @@ cse_insn (rtx_insn *insn)
this_insn_cc0 = gen_rtx_COMPARE (VOIDmode, this_insn_cc0,
 CONST0_RTX (mode));
}
-#endif
 }
 
   /* Now enter all non-volatile source expressions in the hash table
@@ -6604,11 +6592,9 @@ cse_extended_basic_block (struct cse_basic_block_data *ebb_data)
  record_jump_equiv (insn, taken);
}
 
-#ifdef HAVE_cc0
   /* Clear the CC0-tracking related insns, they can't provide
 useful information across basic block boundaries.  */
   prev_insn_cc0 = 0;
-#endif
 }
 
   gcc_assert (next_qty <= max_qty);
@@ -6859,21 +6845,17 @@ static bool
 set_live_p (rtx set, rtx_in

[PATCH 05/12] make some HAVE_cc0 code always compiled

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0).
(try_redirect_by_replacing_jump): Likewise.
(rtl_tidy_fallthru_edge): Likewise.
* combine.c (insn_a_feeds_b): Likewise.
(find_split_point): Likewise.
(simplify_set): Likewise.
* cprop.c (cprop_jump): Likewise.
* cse.c (cse_extended_basic_block): Likewise.
* df-problems.c (can_move_insns_across): Likewise.
* function.c (emit_use_return_register_into_block): Likewise.
* haifa-sched.c (sched_init): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* loop-invariant.c (find_invariant_insn): Likewise.
* lra-constraints.c (curr_insn_transform): Likewise.
* postreload.c (reload_combine_recognize_const_pattern):
Likewise.
* reload.c (find_reloads): Likewise.
* reorg.c (delete_scheduled_jump): Likewise.
(steal_delay_list_from_target): Likewise.
(steal_delay_list_from_fallthrough): Likewise.
(redundant_insn): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(delete_computation): Likewise.
* sched-rgn.c (add_branch_dependences): Likewise.
---
 gcc/cfgrtl.c  | 12 +++-
 gcc/combine.c | 10 ++
 gcc/cprop.c   |  4 +---
 gcc/cse.c |  4 +---
 gcc/df-problems.c |  4 +---
 gcc/function.c|  5 ++---
 gcc/haifa-sched.c |  3 +--
 gcc/ira.c |  5 ++---
 gcc/loop-invariant.c  |  4 +---
 gcc/lra-constraints.c |  6 ++
 gcc/postreload.c  |  4 +---
 gcc/reload.c  | 10 +++---
 gcc/reorg.c   | 32 
 gcc/sched-rgn.c   |  4 +---
 14 files changed, 29 insertions(+), 78 deletions(-)
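
The conversion below folds the preprocessor test into the condition itself.  A minimal sketch of that shape, assuming HAVE_cc0 is 0 on the target being built; sets_cc0_stub is a made-up predicate standing in for helpers such as only_sets_cc0_p, and insns are plain ints.

```c
/* Assumed value for a target without cc0, for this sketch only.  */
#define HAVE_cc0 0

/* Counts how often the predicate below actually runs.  */
static int calls;

/* Made-up stand-in for a cc0 predicate; not GCC's code.  */
static int
sets_cc0_stub (int insn)
{
  calls++;
  return insn < 0;
}

/* HAVE_cc0 is tested first, so the predicate is never called when it
   is 0, yet the call is still compiled and type-checked.  */
int
needs_extra_delete (int insn)
{
  if (HAVE_cc0 && sets_cc0_stub (insn))
    return 1;
  return 0;
}

/* Report how many times the predicate ran.  */
int
predicate_calls (void)
{
  return calls;
}
```

This form only works if the predicate is declared on every target, which is what the earlier patches in the series arrange.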

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 4c1708f..d93a49e 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -893,10 +893,9 @@ rtl_merge_blocks (basic_block a, basic_block b)
 
   del_first = a_end;
 
-#if HAVE_cc0
   /* If this was a conditional jump, we need to also delete
 the insn that set cc0.  */
-  if (only_sets_cc0_p (prev))
+  if (HAVE_cc0 && only_sets_cc0_p (prev))
{
  rtx_insn *tmp = prev;
 
@@ -905,7 +904,6 @@ rtl_merge_blocks (basic_block a, basic_block b)
prev = BB_HEAD (a);
  del_first = tmp;
}
-#endif
 
   a_end = PREV_INSN (del_first);
 }
@@ -1064,11 +1062,9 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
   /* In case we zap a conditional jump, we'll need to kill
  the cc0 setter too.  */
   kill_from = insn;
-#if HAVE_cc0
-  if (reg_mentioned_p (cc0_rtx, PATTERN (insn))
+  if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, PATTERN (insn))
   && only_sets_cc0_p (PREV_INSN (insn)))
 kill_from = PREV_INSN (insn);
-#endif
 
   /* See if we can create the fallthru edge.  */
   if (in_cfglayout || can_fallthru (src, target))
@@ -1825,12 +1821,10 @@ rtl_tidy_fallthru_edge (edge e)
  delete_insn (table);
}
 
-#if HAVE_cc0
   /* If this was a conditional jump, we need to also delete
 the insn that set cc0.  */
-  if (any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q)))
+  if (HAVE_cc0 && any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q)))
q = PREV_INSN (q);
-#endif
 
   q = PREV_INSN (q);
 }
diff --git a/gcc/combine.c b/gcc/combine.c
index 430084e..d71f863 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1141,10 +1141,8 @@ insn_a_feeds_b (rtx_insn *a, rtx_insn *b)
   FOR_EACH_LOG_LINK (links, b)
 if (links->insn == a)
   return true;
-#if HAVE_cc0
-  if (sets_cc0_p (a))
+  if (HAVE_cc0 && sets_cc0_p (a))
 return true;
-#endif
   return false;
 }
 
@@ -4816,7 +4814,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
   break;
 
 case SET:
-#if HAVE_cc0
   /* If SET_DEST is CC0 and SET_SRC is not an operand, a COMPARE, or a
 ZERO_EXTRACT, the most likely reason why this doesn't match is that
 we need to put the operand into a register.  So split at that
@@ -4829,7 +4826,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
  && ! (GET_CODE (SET_SRC (x)) == SUBREG
&& OBJECT_P (SUBREG_REG (SET_SRC (x)
return &SET_SRC (x);
-#endif
 
   /* See if we can split SET_SRC as it stands.  */
   split = find_split_point (&SET_SRC (x), insn, true);
@@ -6582,13 +6578,12 @@ simplify_set (rtx x)
   else
compare_mode = SELECT_CC_MODE (new_code, op0, op1);
 
-#if !HAVE_cc0
   /* If the mode changed, we have to change SET_DEST, the mode in the
 compare, and the mode in the place SET_DEST is used.  If SET_DEST is
 a hard register, just build new versions with the proper mode.  If it
 is a pseudo, we lose unless it is only time we set the pseudo, in
 

[PATCH 02/12] remove some ifdef HAVE_cc0

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-04-21  Trevor Saunders  

* conditions.h: Define macros even if HAVE_cc0 is undefined.
* emit-rtl.c: Define functions even if HAVE_cc0 is undefined.
* final.c: Likewise.
* jump.c: Likewise.
* recog.c: Likewise.
* recog.h: Declare functions even when HAVE_cc0 is undefined.
* sched-deps.c (sched_analyze_2): Always compile case for cc0.
---
 gcc/conditions.h | 6 --
 gcc/emit-rtl.c   | 2 --
 gcc/final.c  | 2 --
 gcc/jump.c   | 3 ---
 gcc/recog.c  | 2 --
 gcc/recog.h  | 2 --
 gcc/sched-deps.c | 5 +++--
 7 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/gcc/conditions.h b/gcc/conditions.h
index 2308bfc..7cd1e1c 100644
--- a/gcc/conditions.h
+++ b/gcc/conditions.h
@@ -20,10 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_CONDITIONS_H
 #define GCC_CONDITIONS_H
 
-/* None of the things in the files exist if we don't use CC0.  */
-
-#ifdef HAVE_cc0
-
 /* The variable cc_status says how to interpret the condition code.
It is set by output routines for an instruction that sets the cc's
and examined by output routines for jump instructions.
@@ -117,6 +113,4 @@ extern CC_STATUS cc_status;
  (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0,  \
   CC_STATUS_MDEP_INIT)
 
-#endif
-
 #endif /* GCC_CONDITIONS_H */
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 483eacb..c1974bb 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn)
   return insn;
 }
 
-#ifdef HAVE_cc0
 /* Return the next insn that uses CC0 after INSN, which is assumed to
set it.  This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter
applied to the result of this function should yield INSN).
@@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn)
 
   return insn;
 }
-#endif
 
 #ifdef AUTO_INC_DEC
 /* Find a RTX_AUTOINC class rtx which matches DATA.  */
diff --git a/gcc/final.c b/gcc/final.c
index 1fa93d9..41f6bd9 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0;
 
 static int insn_counter = 0;
 
-#ifdef HAVE_cc0
 /* This variable contains machine-dependent flags (defined in tm.h)
set and examined by output routines
that describe how to interpret the condition codes properly.  */
@@ -202,7 +201,6 @@ CC_STATUS cc_status;
from before the insn.  */
 
 CC_STATUS cc_prev_status;
-#endif
 
 /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen.  */
 
diff --git a/gcc/jump.c b/gcc/jump.c
index 34b3b7b..bc91550 100644
--- a/gcc/jump.c
+++ b/gcc/jump.c
@@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn)
  && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn)));
 }
 
-#ifdef HAVE_cc0
-
 /* Return nonzero if X is an RTX that only sets the condition codes
and has no side effects.  */
 
@@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x)
 }
   return 0;
 }
-#endif
 
 /* Find all CODE_LABELs referred to in X, and increment their use
counts.  If INSN is a JUMP_INSN and there is at least one
diff --git a/gcc/recog.c b/gcc/recog.c
index a9d3b1f..c3ad86f 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn)
   return ((num_changes_pending () > 0) && (apply_change_group () > 0));
 }
 
-#ifdef HAVE_cc0
 /* Return 1 if the insn using CC0 set by INSN does not contain
any ordered tests applied to the condition codes.
EQ and NE tests do not count.  */
@@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn)
   return (INSN_P (next)
  && ! inequality_comparisons_p (PATTERN (next)));
 }
-#endif
 
 /* Return 1 if OP is a valid general operand for machine mode MODE.
This is either a register reference, a memory reference,
diff --git a/gcc/recog.h b/gcc/recog.h
index 45ea671..8a38b26 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx);
 extern void validate_replace_src_group (rtx, rtx, rtx);
 extern bool validate_simplify_insn (rtx insn);
 extern int num_changes_pending (void);
-#ifdef HAVE_cc0
 extern int next_insn_tests_no_inequality (rtx);
-#endif
 extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode);
 
 extern int offsettable_memref_p (rtx);
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 5434831..31de6be 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn)
 
   return;
 
-#ifdef HAVE_cc0
 case CC0:
+#ifdef HAVE_cc0
+  gcc_unreachable ();
+#endif
   /* User of CC0 depends on immediately preceding insn.  */
   SCHED_GROUP_P (insn) = 1;
/* Don't move CC0 setter to another block (it can set up the
@@ -2620,7 +2622,6 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn)
sched_deps_info->finish_rhs ();
 
   return;
-#endif
 
 case R

[PATCH 00/12] Reduce conditional compilation

2015-04-21 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

This is a first round of patches to reduce the amount of code within #if /
#ifdef.  This makes it incrementally easier to avoid breaking configs other
than the one being built, and moves things slightly closer to using target
hooks for everything.
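
The general pattern across the series can be sketched as follows.  TARGET_HAS_WIDGET and count_saved_regs are hypothetical names used only for illustration; nothing here is taken from the patches themselves.

```c
/* Hypothetical target macro.  A port that has the feature would
   define it to 1 in its own header; everyone else gets the default.  */
#ifndef TARGET_HAS_WIDGET
#define TARGET_HAS_WIDGET 0
#endif

/* Count the registers to save, starting from BASE.  Because the test
   is an ordinary if, the guarded statement is parsed and type-checked
   in every configuration, so a typo in it breaks the build everywhere
   instead of only on the ports that define the macro.  */
int
count_saved_regs (int base)
{
  int n = base;

  if (TARGET_HAS_WIDGET)
    n += 1;

  return n;
}
```

The compiler is expected to fold the constant condition, so the generated code should match what the #ifdef version produced.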

Each commit was bootstrapped and regtested on x86_64-linux-gnu without
regression, and the whole patch set was run through config-list.mk without
issue.  OK?

Trevor Saunders (12):
  add default definition of EH_RETURN_DATA_REGNO
  remove some ifdef HAVE_cc0
  more HAVE_cc0
  always define HAVE_cc0
  make some HAVE_cc0 code always compiled
  provide default for RETURN_ADDR_OFFSET
  provide default for MASK_RETURN_ADDR
  reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER
  remove #if for PIC_OFFSET_TABLE_REGNUM
  remove more ifdefs for HAVE_cc0
  provide default for INSN_SETS_ARE_DELAYED
  add default for INSN_REFERENCES_ARE_DELAYED

 gcc/alias.c   |  7 ++---
 gcc/builtins.c|  2 --
 gcc/caller-save.c |  4 +--
 gcc/cfgcleanup.c  | 26 +---
 gcc/cfgrtl.c  | 12 ++--
 gcc/combine.c | 84 ++-
 gcc/conditions.h  |  6 
 gcc/cprop.c   |  4 +--
 gcc/cse.c | 22 +-
 gcc/defaults.h| 23 ++
 gcc/df-problems.c |  9 ++
 gcc/df-scan.c | 46 +++-
 gcc/emit-rtl.c|  8 ++---
 gcc/except.c  | 26 ++--
 gcc/final.c   | 43 --
 gcc/function.c|  5 ++-
 gcc/gcse.c| 24 ---
 gcc/genconfig.c   |  1 +
 gcc/haifa-sched.c |  5 +--
 gcc/ira-lives.c   |  2 --
 gcc/ira.c | 33 +---
 gcc/jump.c|  3 --
 gcc/loop-invariant.c  |  4 +--
 gcc/lra-constraints.c |  6 ++--
 gcc/lra-lives.c   |  2 --
 gcc/optabs.c  |  2 +-
 gcc/postreload.c  |  4 +--
 gcc/recog.c   |  2 --
 gcc/recog.h   |  2 --
 gcc/reginfo.c |  5 ++-
 gcc/regrename.c   |  5 ++-
 gcc/reload.c  | 12 +++-
 gcc/reload1.c | 10 +++---
 gcc/reorg.c   | 68 ++---
 gcc/resource.c| 15 +++--
 gcc/rtlanal.c |  2 --
 gcc/sched-deps.c  |  5 +--
 gcc/sched-rgn.c   |  4 +--
 gcc/simplify-rtx.c|  5 ++-
 39 files changed, 199 insertions(+), 349 deletions(-)

-- 
2.3.0.80.g18d0fec.dirty


