Re: [PATCH] [4/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Jan Beulich
>>> On 17.07.15 at 06:26,  wrote:
> Which imported packages use configure.in?  I'm happy to submit patches
> for those, too.

The answer to this may not even matter - consuming components
(as gcc is with respect to binutils) shouldn't assume only the newer
name is used: it should remain possible to build with older
versions. I.e. you always have to check for both .ac and .in when
looking for a file.

Jan
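
The "check for both names" rule above can be sketched in C. This is only an
illustration under stated assumptions: find_configure_input is a hypothetical
helper (nothing in gcc or binutils is claimed to look like this), it assumes a
POSIX system for access(), and it prefers configure.ac while still accepting
the older configure.in.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not taken from gcc or binutils.
   Write the path of the Autoconf input file found in DIR into BUF.
   The newer name configure.ac is tried first, then configure.in.
   Return BUF on success, or NULL if neither file is readable or the
   path does not fit into SIZE bytes.  */
const char *
find_configure_input (const char *dir, char *buf, size_t size)
{
  static const char *const names[] = { "configure.ac", "configure.in" };

  for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
    {
      int n = snprintf (buf, size, "%s/%s", dir, names[i]);
      if (n < 0 || (size_t) n >= size)
        return NULL;            /* Formatting failed or path truncated.  */
      if (access (buf, R_OK) == 0)
        return buf;             /* First readable candidate wins.  */
    }
  return NULL;
}
```

A caller would pass the component's source directory and a buffer, and treat
a NULL result as "no Autoconf input here" rather than assuming either name.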



[PATCH, rtl-optimization]: Fix PR 66891, ICE in expand_call, at calls.c

2015-07-16 Thread Uros Bizjak
Hello!

When moving precompute_register_parameters, I didn't notice that the
call was wrapped with NO_DEFER_POP/OK_DEFER_POP.

Attached patch fixes this oversight.

2015-07-17  Uros Bizjak  

PR rtl-optimization/66891
* calls.c (expand_call): Wrap precompute_register_parameters with
NO_DEFER_POP/OK_DEFER_POP to prevent deferred pops.

testsuite/ChangeLog:

2015-07-17  Uros Bizjak  

PR target/66891
* gcc.target/i386/pr66891.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32},
committed to mainline SVN as obvious.

Uros.
Index: calls.c
===
--- calls.c (revision 225857)
+++ calls.c (working copy)
@@ -3144,6 +3144,10 @@ expand_call (tree exp, rtx target, int ignore)
 
   compute_argument_addresses (args, argblock, num_actuals);
 
+  /* Stack is properly aligned, pops can't safely be deferred during
+the evaluation of the arguments.  */
+  NO_DEFER_POP;
+
   /* Precompute all register parameters.  It isn't safe to compute
 anything once we have started filling any specific hard regs.
 TLS symbols sometimes need a call to resolve.  Precompute
@@ -3151,6 +3155,8 @@ expand_call (tree exp, rtx target, int ignore)
 to avoid unaligned stack in the called function.  */
   precompute_register_parameters (num_actuals, args, &reg_parm_seen);
 
+  OK_DEFER_POP;
+
   /* Perform stack alignment before the first push (the last arg).  */
   if (argblock == 0
   && adjusted_args_size.constant > reg_parm_stack_space
Index: testsuite/gcc.target/i386/pr66891.c
===
--- testsuite/gcc.target/i386/pr66891.c (revision 0)
+++ testsuite/gcc.target/i386/pr66891.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2" } */
+
+__attribute__((__stdcall__)) void fn1();
+
+int a;
+
+static void fn2() {
+  for (;;)
+;
+}
+
+void fn3() {
+  fn1(0);
+  fn2(a == 0);
+}
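
The effect of wrapping a step in NO_DEFER_POP/OK_DEFER_POP can be illustrated
with a toy model. This is a sketch, not GCC code: every toy_ name below is
hypothetical, and the model assumes only that the guard is a nesting counter
and that a call's argument pop is left pending when the counter is zero.

```c
/* Toy model only; all toy_ names are hypothetical and not part of GCC.  */
static int toy_inhibit_defer_pop;     /* > 0: pops must not be deferred.  */
static int toy_pending_stack_adjust;  /* Argument bytes not yet popped.  */

#define TOY_NO_DEFER_POP (toy_inhibit_defer_pop += 1)
#define TOY_OK_DEFER_POP (toy_inhibit_defer_pop -= 1)

/* Put the model back into its initial state.  */
void
toy_reset (void)
{
  toy_inhibit_defer_pop = 0;
  toy_pending_stack_adjust = 0;
}

/* Model the end of a call that pushed ARG_BYTES of arguments.  With the
   guard active the pop happens immediately, so nothing stays pending.  */
void
toy_finish_call (int arg_bytes)
{
  if (toy_inhibit_defer_pop == 0)
    toy_pending_stack_adjust += arg_bytes;  /* Pop deferred until later.  */
}

/* Precompute a parameter whose evaluation needs a nested call, without
   the guard.  Return the bytes left pending afterwards.  */
int
toy_precompute_without_guard (int nested_arg_bytes)
{
  toy_finish_call (nested_arg_bytes);
  return toy_pending_stack_adjust;
}

/* The same step wrapped in the guard, mirroring the shape of the patch.  */
int
toy_precompute_with_guard (int nested_arg_bytes)
{
  TOY_NO_DEFER_POP;
  toy_finish_call (nested_arg_bytes);
  TOY_OK_DEFER_POP;
  return toy_pending_stack_adjust;
}
```

In this model the unguarded variant leaves the nested call's bytes pending
while the outer call is being set up, and the guarded variant leaves none.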


[PATCH PR66388]Compute use with cand of smaller precision by further exercising scev overflow info.

2015-07-16 Thread Bin Cheng
Hi,
This patch is to fix PR66388.  It's an old issue but it recently became worse
after my scev overflow change.  IVOPT can currently only compute an iv use
with a candidate that has at least the same type precision.  See the code below:

  if (TYPE_PRECISION (utype) > TYPE_PRECISION (ctype))
{
  /* We do not have a precision to express the values of use.  */
  return infinite_cost;
}

This is not always true.  It's possible to compute with a candidate of
smaller precision if it has enough stepping periods to express the iv use,
just as the code in iv_elimination does.  Since we now have iv no_overflow
information, we can use that to prove it's safe.  Actually I am thinking
about improving iv elimination with overflow information too.  So this patch
relaxes the constraint to allow computation of uses with smaller precision
candidates.
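
A small C sketch may help show why a narrower candidate can be enough.  It is
an illustration only, not the IVOPT implementation: the function names are
hypothetical, the use is assumed to be 64 bits wide, and the candidate is
assumed to be an 8-bit counter starting at 0 with step 1.  As long as the
counter does not wrap within the loop, the wide use can be recomputed from it.

```c
#include <stdint.h>

/* Value of the wide use after I iterations, computed directly in the
   use's own 64-bit type: UBASE + I * USTEP.  */
int64_t
use_direct (int64_t ubase, int64_t ustep, uint32_t i)
{
  return ubase + (int64_t) i * ustep;
}

/* The same use recomputed from a hypothetical 8-bit candidate that starts
   at 0 and steps by 1.  After I iterations the candidate holds I truncated
   to 8 bits, so it wraps once I exceeds 255.  */
int64_t
use_from_narrow_cand (int64_t ubase, int64_t ustep, uint32_t i)
{
  uint8_t cand = (uint8_t) i;
  return ubase + (int64_t) cand * ustep;
}

/* Return 1 if the narrow candidate expresses the use in every one of the
   iterations 0 .. NITER-1, and 0 as soon as one iteration disagrees.  */
int
narrow_cand_is_safe (int64_t ubase, int64_t ustep, uint32_t niter)
{
  for (uint32_t i = 0; i < niter; i++)
    if (use_direct (ubase, ustep, i) != use_from_narrow_cand (ubase, ustep, i))
      return 0;
  return 1;
}
```

With at most 256 iterations the 8-bit counter never wraps and both
computations agree; one more iteration makes the counter wrap and the
recomputed use wrong, which is the case the precision check guards against.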

Benchmark data shows several cases in spec2k6 are obviously improved on
aarch64:
400.perlbench    2.32%
445.gobmk        0.86%
456.hmmer       11.72%
464.h264ref      1.93%
473.astar        0.75%
433.milc        -1.49%
436.cactusADM    6.61%
444.namd        -0.76%
I looked into the assembly code of 456.hmmer and 436.cactusADM, and can
confirm that the hot loops are reduced.  Perf data also confirms the
improvement in 456.hmmer.
I looked into 433.milc and found that most hot functions are not affected by
this patch.  But I do observe two kinds of regressions, described below:
A)  For some loops, auto-increment addressing mode is generated before this
patch, but "base + index<

PR tree-optimization/66388
* tree-ssa-loop-ivopts.c (dump_iv): Dump no_overflow info.
(add_candidate_1): New parameter.  Use unsigned type when iv
overflows.  Pass no_overflow to alloc_iv.
(add_autoinc_candidates, add_candidate): New parameter.
Pass no_overflow to add_candidate_1.
(add_candidate): Ditto.
(add_iv_candidate_for_biv, add_iv_candidate_for_use): Pass iv's
no_overflow info to add_candidate and add_candidate_1.
(get_computation_aff, get_computation_cost_at): Handle candidate
with smaller precision than iv use.

gcc/testsuite/ChangeLog
2015-07-16  Bin Cheng  

PR tree-optimization/66388
* gcc.dg/tree-ssa/pr66388.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr66388.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr66388.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr66388.c (revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts" } */
+
+int flag;
+int arr[144];
+
+void bar(int a, int b);
+int foo (int t)
+{
+  int step = t;
+
+  do
+{
+  if (!flag)
+   bar(t, 0);
+
+  t += step;
+}
+  while (arr[t] != 0);
+
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "t.\[0-9_\]* = PHI <.*, " 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-not "ivtmp.\[0-9_\]* = PHI <" "ivopts"} } */
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 225859)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -529,6 +529,9 @@ dump_iv (FILE *file, struct iv *iv, bool dump_name
 
   if (iv->biv_p)
 fprintf (file, "  is a biv\n");
+
+  if (iv->no_overflow)
+fprintf (file, "  iv doesn't overflow wrto loop niter\n");
 }
 
 /* Dumps information about the USE to FILE.  */
@@ -2598,21 +2601,23 @@ find_depends (tree *expr_p, int *ws ATTRIBUTE_UNUS
 /* Adds a candidate BASE + STEP * i.  Important field is set to IMPORTANT and
position to POS.  If USE is not NULL, the candidate is set as related to
it.  If both BASE and STEP are NULL, we add a pseudocandidate for the
-   replacement of the final value of the iv by a direct computation.  */
+   replacement of the final value of the iv by a direct computation.
+   NO_OVERFLOW is TRUE means the iv doesn't overflow with respect to loop's
+   niter information.  */
 
 static struct iv_cand *
 add_candidate_1 (struct ivopts_data *data,
 tree base, tree step, bool important, enum iv_position pos,
-struct iv_use *use, gimple incremented_at)
+struct iv_use *use, gimple incremented_at,
+bool no_overflow = false)
 {
   unsigned i;
   struct iv_cand *cand = NULL;
   tree type, orig_type;
 
-  /* For non-original variables, make sure their values are computed in a type
- that does not invoke undefined behavior on overflows (since in general,
- we cannot prove that these induction variables are non-wrapping).  */
-  if (pos != IP_ORIGINAL)
+  /* For non-original variables, compute their values in a type that does
+ not invoke undefined behavior on overflows if the iv might overflow.  */
+  if (pos != IP_ORIGINAL && !no_overflow)
 {
   orig_type = TREE_TYPE (base);
   type = generic_type_

Re: [PATCH] enable loop fusion with ISL scheduler

2015-07-16 Thread Tobias Grosser

On 07/17/2015 12:35 AM, Sebastian Pop wrote:

gcc/ChangeLog:

2015-07-16  Aditya Kumar  
 Sebastian Pop  

 * common.opt (floop-fuse): New.
 * doc/invoke.texi (floop-fuse): Documented.
 * graphite-optimize-isl.c (optimize_isl): Use
 ISL_SCHEDULE_FUSE_MAX when using flag_loop_fuse.
 * graphite-poly.c (apply_poly_transforms): Call optimize_isl when
 using flag_loop_fuse.
 * graphite.c (gate_graphite_transforms): Enable graphite with
 flag_loop_fuse.


LGTM.

Tobias
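
For readers without the full patch at hand, the transformation that the
-floop-fuse documentation describes can be written out in C.  This is a
hand-written sketch with hypothetical function names, not output of the ISL
scheduler, and it assumes the three arrays do not overlap.

```c
#include <stddef.h>

/* Two separate loops over the same range, as in the "before" example.
   A, B and C are assumed not to overlap.  */
void
add_separate (double *a, const double *b, const double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    a[i] = a[i] + b[i];
  for (size_t i = 0; i < n; i++)
    a[i] = a[i] + c[i];
}

/* The fused form: one pass over A that applies both updates, so each
   element of A is loaded and stored once instead of twice.  */
void
add_fused (double *a, const double *b, const double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    a[i] = a[i] + b[i] + c[i];
}
```

Both functions add the elements in the same order, (a + b) + c, so for
non-overlapping arrays they are expected to produce identical results.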


gcc/testsuite/ChangeLog:

2015-07-16  Aditya Kumar  
 Sebastian Pop  

 * gcc.dg/graphite/fuse-1.c: New test.
 * gcc.dg/graphite/fuse-2.c: New test.
---
  gcc/common.opt |  4 
  gcc/doc/invoke.texi| 23 +++-
  gcc/graphite-optimize-isl.c|  5 -
  gcc/graphite-poly.c|  2 +-
  gcc/graphite.c |  3 ++-
  gcc/testsuite/gcc.dg/graphite/fuse-1.c | 32 
  gcc/testsuite/gcc.dg/graphite/fuse-2.c | 38 ++
  7 files changed, 103 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/fuse-1.c
  create mode 100644 gcc/testsuite/gcc.dg/graphite/fuse-2.c

diff --git a/gcc/common.opt b/gcc/common.opt
index dd49ae3..200ecc1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1365,6 +1365,10 @@ floop-nest-optimize
  Common Report Var(flag_loop_optimize_isl) Optimization
  Enable the ISL based loop nest optimizer

+floop-fuse
+Common Report Var(flag_loop_fuse) Optimization
+Enable loop fusion
+
  fstrict-volatile-bitfields
  Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
  Force bitfield accesses to match their type width
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b99ab1c..7cc8bb9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}.
  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
  -flive-range-shrinkage @gol
  -floop-block -floop-interchange -floop-strip-mine @gol
--floop-unroll-and-jam -floop-nest-optimize @gol
+-floop-unroll-and-jam -floop-nest-optimize -floop-fuse @gol
  -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
  -flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants 
@gol
  -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
@@ -8796,6 +8796,27 @@ optimizer based on the Pluto optimization algorithms.  
It calculates a loop
  structure optimized for data-locality and parallelism.  This option
  is experimental.

+@item -floop-fuse
+@opindex floop-fuse
+Enable loop fusion.  This option is experimental.
+
+For example, given a loop like:
+@smallexample
+DO I = 1, N
+  A(I) = A(I) + B(I)
+ENDDO
+DO I = 1, N
+  A(I) = A(I) + C(I)
+ENDDO
+@end smallexample
+@noindent
+loop fusion transforms the loop as if it were written:
+@smallexample
+DO I = 1, N
+  A(I) = A(I) + B(I) + C(I)
+ENDDO
+@end smallexample
+
  @item -floop-unroll-and-jam
  @opindex floop-unroll-and-jam
  Enable unroll and jam for the ISL based loop nest optimizer.  The unroll
diff --git a/gcc/graphite-optimize-isl.c b/gcc/graphite-optimize-isl.c
index 624cc87..c016461 100644
--- a/gcc/graphite-optimize-isl.c
+++ b/gcc/graphite-optimize-isl.c
@@ -599,7 +599,10 @@ optimize_isl (scop_p scop)

isl_options_set_schedule_max_constant_term (scop->ctx, CONSTANT_BOUND);
isl_options_set_schedule_maximize_band_depth (scop->ctx, 1);
-  isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MIN);
+  if (flag_loop_fuse)
+isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MAX);
+  else
+isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MIN);
isl_options_set_on_error (scop->ctx, ISL_ON_ERROR_CONTINUE);

  #ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index 4407dc5..4808fbe 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -272,7 +272,7 @@ apply_poly_transforms (scop_p scop)

/* This pass needs to be run at the final stage, as it does not
   update the lst.  */
-  if (flag_loop_optimize_isl || flag_loop_unroll_jam)
+  if (flag_loop_optimize_isl || flag_loop_unroll_jam || flag_loop_fuse)
  transform_done |= optimize_isl (scop);

return transform_done;
diff --git a/gcc/graphite.c b/gcc/graphite.c
index ba8029a..51af1a2a 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -342,7 +342,8 @@ gate_graphite_transforms (void)
|| flag_graphite_identity
|| flag_loop_parallelize_all
|| flag_loop_optimize_isl
-  || flag_loop_unroll_jam)
+  || flag_loop_unroll_jam
+  || flag_loop_fuse)
  flag_graphite = 1;

return flag_graphite != 0;
diff --git a/gcc/testsuite/gcc.dg/graphite/fuse-1.c 
b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
new file mode 100644
index 000

Re: [PATCH] [graphite] fix pr61929

2015-07-16 Thread Tobias Grosser

On 07/17/2015 12:23 AM, Sebastian Pop wrote:

This fixes bootstrap of GCC with BOOT_CFLAGS="-g -O2 -fgraphite-identity
-floop-nest-optimize -floop-block -floop-interchange -floop-strip-mine".
It passes regstrap on amd64-linux.

Ok to commit to trunk?


Very nice.

But this seems to include multiple fixes and refactorings at the same 
time. I personally would prefer to at least separate the functional and 
non-functional changes, such that the actual bug fixed becomes clear.
If splitting the patch is difficult, can you at least point out in the 
commit message which of your changes actually fixed the bootstrap?


Otherwise, LGTM.

Best,
Tobias


Thanks.

2015-07-15  Aditya Kumar  
Sebastian Pop  

PR middle-end/61929
* graphite-dependences.c (add_pdr_constraints): Renamed
pdr->extent to pdr->subscript_sizes.
* graphite-interchange.c (build_linearized_memory_access): Add
back all gcc_assert's that the "isl_int to isl_val conversion"
patch has removed.  Refactored.
(pdr_stride_in_loop): Renamed pdr->extent to pdr->subscript_sizes.
* graphite-poly.c (new_poly_dr): Same.
(free_poly_dr): Same.
* graphite-poly.h (struct poly_dr): Same.
* graphite-scop-detection.c (stmt_has_simple_data_refs_p): Ignore
all data references other than ARRAY_REF and MEM_REF.
* graphite-scop-detection.h: Fix space.
* graphite-sese-to-poly.c (build_pbb_scattering_polyhedrons): Add
back all gcc_assert's removed by a previous patch.
(wrap): Remove the_isl_ctx global variable that the same patch has
added.
(build_loop_iteration_domains): Same.
(add_param_constraints): Same.
(pdr_add_data_dimensions): Same.  Refactored.
(build_poly_dr): Renamed extent to subscript_sizes.

testsuite/
PR middle-end/61929
* gcc.dg/graphite/pr61929.c: New.
---
  gcc/graphite-dependences.c  |  4 +--
  gcc/graphite-interchange.c  | 55 +
  gcc/graphite-poly.c |  6 ++--
  gcc/graphite-poly.h |  2 +-
  gcc/graphite-scop-detection.c   | 22 +
  gcc/graphite-scop-detection.h   |  2 +-
  gcc/graphite-sese-to-poly.c | 54 
  gcc/testsuite/gcc.dg/graphite/pr61929.c | 19 
  8 files changed, 97 insertions(+), 67 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/pr61929.c

diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index 50fe73e..af18ecb 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -88,13 +88,13 @@ constrain_domain (isl_map *map, isl_set *s)
return isl_map_intersect_domain (map, s);
  }

-/* Constrain pdr->accesses with pdr->extent and pbb->domain.  */
+/* Constrain pdr->accesses with pdr->subscript_sizes and pbb->domain.  */

  static isl_map *
  add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
  {
isl_map *x = isl_map_intersect_range (isl_map_copy (pdr->accesses),
-   isl_set_copy (pdr->extent));
+   isl_set_copy (pdr->subscript_sizes));
x = constrain_domain (x, isl_set_copy (pbb->domain));
return x;
  }
diff --git a/gcc/graphite-interchange.c b/gcc/graphite-interchange.c
index aee51a8..03c2c63 100644
--- a/gcc/graphite-interchange.c
+++ b/gcc/graphite-interchange.c
@@ -79,37 +79,40 @@ extern "C" {
  static isl_constraint *
  build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
  {
-  isl_constraint *res;
isl_local_space *ls = isl_local_space_from_space (isl_map_get_space (map));
-  unsigned offset, nsubs;
-  int i;
-  isl_ctx *ctx;
+  isl_constraint *res = isl_equality_alloc (ls);
+  isl_val *size = isl_val_int_from_ui (isl_map_get_ctx (map), 1);

-  isl_val *size, *subsize, *size1;
-
-  res = isl_equality_alloc (ls);
-  ctx = isl_local_space_get_ctx (ls);
-  size = isl_val_int_from_ui (ctx, 1);
-
-  nsubs = isl_set_dim (pdr->extent, isl_dim_set);
+  unsigned nsubs = isl_set_dim (pdr->subscript_sizes, isl_dim_set);
/* -1 for the already included L dimension.  */
-  offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
+  unsigned offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
res = isl_constraint_set_coefficient_si (res, isl_dim_out, offset + nsubs, 
-1);
-  /* Go through all subscripts from last to first.  First dimension
+  /* Go through all subscripts from last to first.  The dimension "i=0"
   is the alias set, ignore it.  */
-  for (i = nsubs - 1; i >= 1; i--)
+  for (int i = nsubs - 1; i >= 1; i--)
  {
-  isl_space *dc;
-  isl_aff *aff;
-
-  size1 = isl_val_copy (size);
-  res = isl_constraint_set_coefficient_val (res, isl_dim_out, offset + i, 
size);
-  dc = isl_set_get_space (pdr->extent);
-  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
-  aff = isl_aff

Re: [PATCH] [4/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
> Since all configure files are generated from them, this patch must be
> checked in first.
> But some of them are imported and some imported packages still use 
> configure.in,
> not configure.ac.
>
> What is the real value of changing "configure.in" in comments/messages to
> "configure.ac" when both are used in packages?

Before binutils commit 35eafcc7 a year ago, I'm pretty sure everything
in binutils-gdb used the .in extension.  And, around that time, I'm
pretty sure everything in gcc also used the .in extension.
Binutils-gdb partially moved over to the .ac extension, and gcc
completely moved over to the .ac extension.

This left a lot of references pointing to the wrong extension.
Allowing both extensions, even if made to work now, will break again
someday.  I think having comments, messages, and documentation point
semi-randomly to one or the other is inviting future confusion.

I think the only way to permanently fix this is to complete the
(almost complete) transition to the .ac extension.  I would have
probably personally left everything as a .in extension, since for now
there's no real difference, but I think the transition should either
be complete or not there at all.  Since the conversion already
started, I think all references anywhere to the .in extension should
be updated.  (Unless in a historical context like a ChangeLog.)

Which imported packages use configure.in?  I'm happy to submit patches
for those, too.


Re: [PATCH] [4/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread H.J. Lu
On Thu, Jul 16, 2015 at 8:09 PM, Michael Darling  wrote:
> (I was requested by binutils to split my May 24 and Jul 16 patches
> into separate files for each binutils-gdb main subdirectory.)
>
> Combined builds have been broken for about 10 months, because some binutils
> configure.in files were renamed to configure.ac, but gcc's references to them
> were not updated.  Fixing gcc's references to them is much easier by renaming
> the few straggling configure.in files to configure.ac. gcc's configure.in
> files were entirely renamed to configure.ac some time ago. There are
> corresponding patches submitted to gcc, which update all references to
> binutils-gdb configure.in files to configure.ac, which is what ultimately
> fixes combined builds.
>
> See PR binutils-gdb/binutils/18450 and gcc/other/66259.
>
> Signed-off by: Michael Darling 
> ---
>  config/ChangeLog  | 8 
>  config/gettext.m4 | 4 ++--
>  config/po.m4  | 4 ++--
>  config/stdint.m4  | 2 +-
>  config/tcl.m4 | 4 ++--
>  5 files changed, 15 insertions(+), 7 deletions(-)

Since all configure files are generated from them, this patch must be
checked in first.
But some of them are imported and some imported packages still use configure.in,
not configure.ac.

What is the real value of changing "configure.in" in comments/messages to
"configure.ac" when both are used in packages?

-- 
H.J.


Re: [PATCH][doc][13/14] Document AArch64 target attributes and pragmas

2015-07-16 Thread Sandra Loosemore

On 07/16/2015 09:21 AM, Kyrill Tkachov wrote:

Hi all,

This patch adds the documentation for the AArch64 target attributes and
pragmas.

Ok for trunk?


The content looks OK, but I have a bunch of nit-picky comments about 
grammar, typos, markup, etc.



+The following target-specific function attributes are available for
+the AArch64 target and for the most part mirror the behavior of similar
+command line options, but on a per-function basis:


s/command line option/command-line option/g

It would be good to add a cross-reference to the section where the 
command-line options are documented.  I recommend splitting the 
introductory sentence into two, like:


The following target-specific function attributes are available for the 
AArch64 target.  For the most part, these options mirror the behavior of
similar command-line options (@pxref{AArch64 Options}), but on a 
per-function basis.



+
+@table @code
+@item general-regs-only
+@cindex @code{general-regs-only} function attribute, AArch64
+Indicates that no floating point or AdvancedSIMD registers should be


s/floating point/floating-point/
s/AdvancedSIMD/Advanced SIMD/


+used when generating code for this function.  If the function explicitly
+uses floating point code, then the compiler will give an error.  This is


s/floating point code/floating-point code/
s/will give/gives/


+the same behavior as that of the command line option
+@code{-mgeneral-regs-only}.


Please use @option markup instead of @code on option names throughout 
this patch.



+@item cmodel=
+@cindex @code{cmodel=} function attribute, AArch64
+Indicates that code should be generated for a particular code model for
+this function.  The behaviour and permissible arguments are the same as


s/behaviour/behavior/

(We prefer to consistently use American spellings throughout the GCC 
documentation.)



+@item strict-align
+@cindex @code{strit-align} function attribute, AArch64


s/strit-align/strict-align/


+The above target attributes can be specified as follows:
+
+@smallexample
+__attribute__((target("<attr-string>")))
+int
+f (int a)
+@{
+  return a + 5;
+@}
+@end smallexample
+
+where @code{<attr-string>} is one of the attribute strings specified above.


s/<attr-string>/@var{attr-string}/g


+In this example @code{target("+crc+nocrypto")} will enable the @code{crc}
+extension and disable the @code{crypto} extension for the function @code{foo}


s/will enable/enables/
s/disable/disables/


+is valid and will compile function @code{foo} for ARMv8-A with @code{crc}
+and @code{crypto} extensions and tune it for @code{cortex-a53}.


s/will compile/compiles/
s/tune/tunes/


+@code{-mcpu=} optio or the @code{cpu=} attribute conflicts with the

s/optio/option/


@@ -18159,6 +18299,19 @@ for further explanation.
 * Loop-Specific Pragmas::
 @end menu

+@node AArch64 Pragmas
+@subsection AArch64 Pragmas
+
+The pragmas defined by the AArch64 target correspond to the AArch64
+target function attributes.  They can be specified as below:
+@smallexample
+#pragma GCC target("<string>")
+@end smallexample
+
+where @code{<string>} can be any string accepted as an AArch64 target
+attribute.  @xref{AArch64 Function Attributes} for more details
+on the permissible values of @code{<string>}.


s/<string>/@var{string}/g

-Sandra




[PATCH] [14/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 sim/ChangeLog | 5 +
 sim/configure | 2 +-
 sim/testsuite/ChangeLog   | 6 ++
 sim/testsuite/Makefile.in | 2 +-
 sim/testsuite/configure   | 2 +-
 5 files changed, 14 insertions(+), 3 deletions(-)


0001-14-14-Completes-renaming-of-configure.in-files-to-co.patch
Description: Binary data


[PATCH] [13/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 readline/ChangeLog.gdb  |  10 +
 readline/INSTALL|   6 +-
 readline/MANIFEST   |   4 +-
 readline/Makefile.in|   2 +-
 readline/aclocal.m4 |  12 +-
 readline/configure  |   2 +-
 readline/configure.ac   | 304 +
 readline/configure.in   | 304 -
 readline/doc/ChangeLog.gdb  |   5 +
 readline/doc/texi2html  |   2 +-
 readline/examples/rlfe/ChangeLog|   6 +
 readline/examples/rlfe/Makefile.in  |   4 +-
 readline/examples/rlfe/configure.ac | 442 
 readline/examples/rlfe/configure.in | 442 
 14 files changed, 783 insertions(+), 762 deletions(-)
 create mode 100644 readline/configure.ac
 delete mode 100644 readline/configure.in
 create mode 100644 readline/examples/rlfe/configure.ac
 delete mode 100644 readline/examples/rlfe/configure.in


0001-13-14-Completes-renaming-of-configure.in-files-to-co.patch
Description: Binary data


[PATCH] [12/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 opcodes/ChangeLog | 5 +
 opcodes/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-12-14-Completes-renaming-of-configure.in-files-to-co.patch
Description: Binary data


[PATCH] [11/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 ld/ChangeLog | 5 +
 ld/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-11-14-Completes-renaming-of-configure.in-files-to-co.patch
Description: Binary data


[PATCH] [10/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 intl/ChangeLog | 5 +
 intl/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-10-14-Completes-renaming-of-configure.in-files-to-co.patch
Description: Binary data


[PATCH] [8/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 gold/ChangeLog | 5 +
 gold/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-8-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [7/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 gdb/ChangeLog | 7 +++
 gdb/acinclude.m4  | 2 +-
 gdb/config/djgpp/fnchange.lst | 2 +-
 gdb/configure | 2 +-
 gdb/gdbserver/ChangeLog   | 5 +
 gdb/gdbserver/acinclude.m4| 2 +-
 gdb/testsuite/ChangeLog   | 5 +
 gdb/testsuite/configure   | 2 +-
 8 files changed, 22 insertions(+), 5 deletions(-)


0001-7-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [9/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was requested by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references to them is much easier by renaming
the few straggling configure.in files to configure.ac. gcc's configure.in
files were entirely renamed to configure.ac some time ago. There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac, which is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 gprof/ChangeLog | 5 +
 gprof/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-9-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [4/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 config/ChangeLog  | 8 
 config/gettext.m4 | 4 ++--
 config/po.m4  | 4 ++--
 config/stdint.m4  | 2 +-
 config/tcl.m4 | 4 ++--
 5 files changed, 15 insertions(+), 7 deletions(-)


0001-4-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [6/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 gas/ChangeLog | 5 +
 gas/configure | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)


0001-6-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [5/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 etc/ChangeLog|  6 ++
 etc/Makefile.in  |  2 +-
 etc/configure.ac | 27 +++
 etc/configure.in | 27 ---
 4 files changed, 34 insertions(+), 28 deletions(-)
 create mode 100644 etc/configure.ac
 delete mode 100644 etc/configure.in


0001-5-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [3/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 binutils/ChangeLog   | 6 ++
 binutils/MAINTAINERS | 2 +-
 binutils/configure   | 4 ++--
 3 files changed, 9 insertions(+), 3 deletions(-)


0001-3-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [1/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 ChangeLog  | 8 
 config-ml.in   | 6 +++---
 configure  | 8 
 configure.ac   | 8 
 src-release.sh | 2 +-
 5 files changed, 20 insertions(+), 12 deletions(-)


0001-1-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[PATCH] [2/14] Completes renaming of configure.in files to .ac

2015-07-16 Thread Michael Darling
(I was asked by binutils to split my May 24 and Jul 16 patches
into separate files for each binutils-gdb main subdirectory.)

Combined builds have been broken for about 10 months, because some binutils
configure.in files were renamed to configure.ac, but gcc's references to them
were not updated.  Fixing gcc's references is much easier if the few
straggling configure.in files are renamed to configure.ac as well.  gcc's own
configure.in files were all renamed to configure.ac some time ago.  There are
corresponding patches submitted to gcc, which update all references to
binutils-gdb configure.in files to configure.ac; that is what ultimately
fixes combined builds.

See PR binutils-gdb/binutils/18450 and gcc/other/66259.

Signed-off by: Michael Darling 
---
 bfd/ChangeLog | 6 ++
 bfd/configure | 4 ++--
 bfd/configure.com | 6 +++---
 3 files changed, 11 insertions(+), 5 deletions(-)


0001-2-14-Completes-renaming-of-configure.in-files-to-con.patch
Description: Binary data


[gomp4.1] handle undeclared sink variables gracefully

2015-07-16 Thread Aldy Hernandez

The following:

#pragma omp ordered depend(sink:asdf)

...where asdf is undeclared, is ICEing in *finish_omp_clauses because we 
create a clause with NULL, and we're expecting a TREE_LIST.


In the attached patch, I have opted to avoid creating the ordered depend 
clause if we have a parse error.


I also noticed that we gave up after one undeclared sink variable.  We 
can do better.  We can keep parsing and generate error messages 
appropriately, but avoid adding the undeclared variables to the 
TREE_LIST.  As an alternative, we could avoid generating ANY sink clause 
(even for declared variables) if we encounter any problem, but so far we 
fail gracefully later, so I haven't done this.


OK for branch?
commit 6ec528841cee875cfd0bcac0e35f5a6db1df0f6b
Author: Aldy Hernandez 
Date:   Thu Jul 16 16:38:19 2015 -0700

c/
* c-parser.c (c_parser_omp_clause_depend_sink): Handle multiple
undeclared sink variables gracefully.
cp/
* parser.c (cp_parser_omp_clause_depend_sink): Handle multiple
undeclared sink variables gracefully.
testsuite/
* c-c++-common/gomp/sink-3.c: New test.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 0909223..2d43cd7 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11878,55 +11878,55 @@ c_parser_omp_clause_depend_sink (c_parser *parser, location_t clause_loc,
 
   c_parser_consume_token (parser);
 
-  if (t != error_mark_node)
+  bool neg;
+  if (c_parser_next_token_is (parser, CPP_MINUS))
+   neg = true;
+  else if (c_parser_next_token_is (parser, CPP_PLUS))
+   neg = false;
+  else
{
- bool neg;
-
- if (c_parser_next_token_is (parser, CPP_MINUS))
-   neg = true;
- else if (c_parser_next_token_is (parser, CPP_PLUS))
-   neg = false;
- else
-   {
- addend = integer_zero_node;
- goto add_to_vector;
-   }
- c_parser_consume_token (parser);
+ addend = integer_zero_node;
+ goto add_to_vector;
+   }
+  c_parser_consume_token (parser);
 
- if (c_parser_next_token_is_not (parser, CPP_NUMBER))
-   {
- c_parser_error (parser, "expected integer");
- return list;
-   }
+  if (c_parser_next_token_is_not (parser, CPP_NUMBER))
+   {
+ c_parser_error (parser, "expected integer");
+ return list;
+   }
 
- addend = c_parser_peek_token (parser)->value;
- if (TREE_CODE (addend) != INTEGER_CST)
-   {
- c_parser_error (parser, "expected integer");
- return list;
-   }
- if (neg)
-   {
- bool overflow;
- wide_int offset = wi::neg (addend, &overflow);
- addend = wide_int_to_tree (TREE_TYPE (addend), offset);
- if (overflow)
-   warning_at (c_parser_peek_token (parser)->location,
-   OPT_Woverflow,
-   "overflow in implicit constant conversion");
-   }
- c_parser_consume_token (parser);
+  addend = c_parser_peek_token (parser)->value;
+  if (TREE_CODE (addend) != INTEGER_CST)
+   {
+ c_parser_error (parser, "expected integer");
+ return list;
+   }
+  if (neg)
+   {
+ bool overflow;
+ wide_int offset = wi::neg (addend, &overflow);
+ addend = wide_int_to_tree (TREE_TYPE (addend), offset);
+ if (overflow)
+   warning_at (c_parser_peek_token (parser)->location,
+   OPT_Woverflow,
+   "overflow in implicit constant conversion");
+   }
+  c_parser_consume_token (parser);
 
-   add_to_vector:
- vec = tree_cons (addend, t, vec);
+add_to_vector:
+  if (t != error_mark_node)
+   vec = tree_cons (addend, t, vec);
 
- if (c_parser_next_token_is_not (parser, CPP_COMMA))
-   break;
+  if (c_parser_next_token_is_not (parser, CPP_COMMA))
+   break;
 
- c_parser_consume_token (parser);
-   }
+  c_parser_consume_token (parser);
 }
 
+  if (vec == NULL_TREE)
+return list;
+
   tree u = build_omp_clause (clause_loc, OMP_CLAUSE_DEPEND);
   OMP_CLAUSE_DEPEND_KIND (u) = OMP_CLAUSE_DEPEND_SINK;
   OMP_CLAUSE_DECL (u) = nreverse (vec);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 2b6ed0a..3e1b167 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29483,61 +29483,61 @@ cp_parser_omp_clause_depend_sink (cp_parser *parser, location_t clause_loc,
 id_loc);
}
 
-  if (t != error_mark_node)
+  bool neg;
+  if (cp_lexer_next_token_is (parser->lexer, CPP_MINUS))
+   neg = true;
+  else if (cp_lexer_next_token_is (parser->lexer, CPP_PLUS))
+   neg = false;
+  else
{
- bool neg;
-
- if (cp_lexer_next_token_is (parser->lexer, CPP_MINUS))
- 

Re: [PATCH] Fixes accidental renaming of gdb.py file (i.e. libstdc++.so.6.0.22-gdb.py)

2015-07-16 Thread Michael Darling
Ping.  I don't have write access.

The attached patch still works with current trunk source, only failing
on the ChangeLog due to submissions since then.

On Fri, Jul 3, 2015 at 9:50 PM, Michael Darling  wrote:
> The addition of libstdc++fs broke an inexact and fragile method in the
> libstdc++-v3/python makefile, so it mis-names a python script after
> libstdc++fs rather than libstdc++.
>
> With DESTDIR /usr/lib, toolexeclibdir ../lib, and the .so version of
> 6.0.21, this makefile used to install the python script to
> /usr/lib/libstdc++.so.6.0.21-gdb.py.
>
> Once libstdc++fs was added, this makefile installs the python script
> to /usr/lib/libstdc++fs.a-gdb.py.
>
> This makefile examines files named libstdc++* in
> DESTDIR/toolexeclibdir, excluding: symlinks; *.la files; and previous
> *-gdb.py files.  Its comments report it is done this way because
> "libtool hides the real names from us".
>
> This patch changes the makefile so it examines files named libstdc++.*
> (notice the addition of the dot.)  Although this is still not an
> optimum method, it at least puts the makefile on the right track
> again.  Adding the dot is more future-proof than excluding files
> starting with libstdc++fs, because of the possibility of future
> additions of similarly named libraries.
>
> The patch below is also an attachment to this email.
>
>
>
> Index: libstdc++-v3/ChangeLog
> ===
> --- libstdc++-v3/ChangeLog(revision 225409)
> +++ libstdc++-v3/ChangeLog(working copy)
> @@ -1,3 +1,9 @@
> +2015-07-03  Michael Darling  
> +
> +* python/Makefile.am: python script name based off libstdc++.* rather
> +than libstdc++*, to avoid being mis-named after libstdc++fs.
> +* python/Makefile.in: Regenerate.
> +
>  2015-07-03  Jonathan Wakely  
>
>  * doc/xml/manual/status_cxx2017.xml: Update status table.
> Index: libstdc++-v3/python/Makefile.am
> ===
> --- libstdc++-v3/python/Makefile.am(revision 225409)
> +++ libstdc++-v3/python/Makefile.am(working copy)
> @@ -45,11 +45,11 @@
>  @$(mkdir_p) $(DESTDIR)$(toolexeclibdir)
>  ## We want to install gdb.py as SOMETHING-gdb.py.  SOMETHING is the
>  ## full name of the final library.  We want to ignore symlinks, the
> -## .la file, and any previous -gdb.py file.  This is inherently
> -## fragile, but there does not seem to be a better option, because
> -## libtool hides the real names from us.
> +## .la file, any previous -gdb.py file, and libstdc++fs*.  This is
> +## inherently fragile, but there does not seem to be a better option,
> +## because libtool hides the real names from us.
>  @here=`pwd`; cd $(DESTDIR)$(toolexeclibdir); \
> -  for file in libstdc++*; do \
> +  for file in libstdc++.*; do \
>  case $$file in \
>*-gdb.py) ;; \
>*.la) ;; \
> Index: libstdc++-v3/python/Makefile.in
> ===
> --- libstdc++-v3/python/Makefile.in(revision 225409)
> +++ libstdc++-v3/python/Makefile.in(working copy)
> @@ -547,7 +547,7 @@
>  install-data-local: gdb.py
>  @$(mkdir_p) $(DESTDIR)$(toolexeclibdir)
>  @here=`pwd`; cd $(DESTDIR)$(toolexeclibdir); \
> -  for file in libstdc++*; do \
> +  for file in libstdc++.*; do \
>  case $$file in \
>*-gdb.py) ;; \
>*.la) ;; \


gcc.libstdc++-v3.python.dot.fix.patch
Description: Binary data


Re: [PATCH] Fix-up for PR fortran/66724 and fortran/66725

2015-07-16 Thread FX
> 2015-07-16  Steven G. Kargl  
> 
>   * io.c (is_char_type): Call gfc_resolve_expr().
>   (match_open_element, match_dt_element, match_inquire_element): Fix
>   ASYNCHRONOUS case.

OK to commit


[PATCH] enable loop fusion with ISL scheduler

2015-07-16 Thread Sebastian Pop
gcc/ChangeLog:

2015-07-16  Aditya Kumar  
Sebastian Pop  

* common.opt (floop-fuse): New.
* doc/invoke.texi (floop-fuse): Documented.
* graphite-optimize-isl.c (optimize_isl): Use
ISL_SCHEDULE_FUSE_MAX when using flag_loop_fuse.
* graphite-poly.c (apply_poly_transforms): Call optimize_isl when
using flag_loop_fuse.
* graphite.c (gate_graphite_transforms): Enable graphite with
flag_loop_fuse.

gcc/testsuite/ChangeLog:

2015-07-16  Aditya Kumar  
Sebastian Pop  

* gcc.dg/graphite/fuse-1.c: New test.
* gcc.dg/graphite/fuse-2.c: New test.
---
 gcc/common.opt |  4 
 gcc/doc/invoke.texi| 23 +++-
 gcc/graphite-optimize-isl.c|  5 -
 gcc/graphite-poly.c|  2 +-
 gcc/graphite.c |  3 ++-
 gcc/testsuite/gcc.dg/graphite/fuse-1.c | 32 
 gcc/testsuite/gcc.dg/graphite/fuse-2.c | 38 ++
 7 files changed, 103 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/fuse-1.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/fuse-2.c

diff --git a/gcc/common.opt b/gcc/common.opt
index dd49ae3..200ecc1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1365,6 +1365,10 @@ floop-nest-optimize
 Common Report Var(flag_loop_optimize_isl) Optimization
 Enable the ISL based loop nest optimizer
 
+floop-fuse
+Common Report Var(flag_loop_fuse) Optimization
+Enable loop fusion
+
 fstrict-volatile-bitfields
 Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
 Force bitfield accesses to match their type width
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b99ab1c..7cc8bb9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}.
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
 -flive-range-shrinkage @gol
 -floop-block -floop-interchange -floop-strip-mine @gol
--floop-unroll-and-jam -floop-nest-optimize @gol
+-floop-unroll-and-jam -floop-nest-optimize -floop-fuse @gol
 -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
@@ -8796,6 +8796,27 @@ optimizer based on the Pluto optimization algorithms.  It calculates a loop
 structure optimized for data-locality and parallelism.  This option
 is experimental.
 
+@item -floop-fuse
+@opindex floop-fuse
+Enable loop fusion.  This option is experimental.
+
+For example, given a loop like:
+@smallexample
+DO I = 1, N
+  A(I) = A(I) + B(I)
+ENDDO
+DO I = 1, N
+  A(I) = A(I) + C(I)
+ENDDO
+@end smallexample
+@noindent
+loop fusion transforms the loop as if it were written:
+@smallexample
+DO I = 1, N
+  A(I) = A(I) + B(I) + C(I)
+ENDDO
+@end smallexample
+
 @item -floop-unroll-and-jam
 @opindex floop-unroll-and-jam
 Enable unroll and jam for the ISL based loop nest optimizer.  The unroll 
diff --git a/gcc/graphite-optimize-isl.c b/gcc/graphite-optimize-isl.c
index 624cc87..c016461 100644
--- a/gcc/graphite-optimize-isl.c
+++ b/gcc/graphite-optimize-isl.c
@@ -599,7 +599,10 @@ optimize_isl (scop_p scop)
 
   isl_options_set_schedule_max_constant_term (scop->ctx, CONSTANT_BOUND);
   isl_options_set_schedule_maximize_band_depth (scop->ctx, 1);
-  isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MIN);
+  if (flag_loop_fuse)
+isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MAX);
+  else
+isl_options_set_schedule_fuse (scop->ctx, ISL_SCHEDULE_FUSE_MIN);
   isl_options_set_on_error (scop->ctx, ISL_ON_ERROR_CONTINUE);
 
 #ifdef HAVE_ISL_SCHED_CONSTRAINTS_COMPUTE_SCHEDULE
diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index 4407dc5..4808fbe 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -272,7 +272,7 @@ apply_poly_transforms (scop_p scop)
 
   /* This pass needs to be run at the final stage, as it does not
  update the lst.  */
-  if (flag_loop_optimize_isl || flag_loop_unroll_jam)
+  if (flag_loop_optimize_isl || flag_loop_unroll_jam || flag_loop_fuse)
 transform_done |= optimize_isl (scop);
 
   return transform_done;
diff --git a/gcc/graphite.c b/gcc/graphite.c
index ba8029a..51af1a2a 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -342,7 +342,8 @@ gate_graphite_transforms (void)
   || flag_graphite_identity
   || flag_loop_parallelize_all
   || flag_loop_optimize_isl
-  || flag_loop_unroll_jam)
+  || flag_loop_unroll_jam
+  || flag_loop_fuse)
 flag_graphite = 1;
 
   return flag_graphite != 0;
diff --git a/gcc/testsuite/gcc.dg/graphite/fuse-1.c b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
new file mode 100644
index 000..f368f47
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
@@ -0,0 +1,32 @@
+/* Check that the tw

[PATCH] [graphite] fix pr61929

2015-07-16 Thread Sebastian Pop
This fixes bootstrap of GCC with BOOT_CFLAGS="-g -O2 -fgraphite-identity
-floop-nest-optimize -floop-block -floop-interchange -floop-strip-mine".
It passes regstrap on amd64-linux.

Ok to commit to trunk?
Thanks.

2015-07-15  Aditya Kumar  
Sebastian Pop  

PR middle-end/61929
* graphite-dependences.c (add_pdr_constraints): Renamed
pdr->extent to pdr->subscript_sizes.
* graphite-interchange.c (build_linearized_memory_access): Add
back all gcc_assert's that the "isl_int to isl_val conversion"
patch has removed.  Refactored.
(pdr_stride_in_loop): Renamed pdr->extent to pdr->subscript_sizes.
* graphite-poly.c (new_poly_dr): Same.
(free_poly_dr): Same.
* graphite-poly.h (struct poly_dr): Same.
* graphite-scop-detection.c (stmt_has_simple_data_refs_p): Ignore
all data references other than ARRAY_REF and MEM_REF.
* graphite-scop-detection.h: Fix space.
* graphite-sese-to-poly.c (build_pbb_scattering_polyhedrons): Add
back all gcc_assert's removed by a previous patch.
(wrap): Remove the_isl_ctx global variable that the same patch has
added.
(build_loop_iteration_domains): Same.
(add_param_constraints): Same.
(pdr_add_data_dimensions): Same.  Refactored.
(build_poly_dr): Renamed extent to subscript_sizes.

testsuite/
PR middle-end/61929
* gcc.dg/graphite/pr61929.c: New.
---
 gcc/graphite-dependences.c  |  4 +--
 gcc/graphite-interchange.c  | 55 +
 gcc/graphite-poly.c |  6 ++--
 gcc/graphite-poly.h |  2 +-
 gcc/graphite-scop-detection.c   | 22 +
 gcc/graphite-scop-detection.h   |  2 +-
 gcc/graphite-sese-to-poly.c | 54 
 gcc/testsuite/gcc.dg/graphite/pr61929.c | 19 
 8 files changed, 97 insertions(+), 67 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/pr61929.c

diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index 50fe73e..af18ecb 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -88,13 +88,13 @@ constrain_domain (isl_map *map, isl_set *s)
   return isl_map_intersect_domain (map, s);
 }
 
-/* Constrain pdr->accesses with pdr->extent and pbb->domain.  */
+/* Constrain pdr->accesses with pdr->subscript_sizes and pbb->domain.  */
 
 static isl_map *
 add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
 {
   isl_map *x = isl_map_intersect_range (isl_map_copy (pdr->accesses),
-   isl_set_copy (pdr->extent));
+   isl_set_copy (pdr->subscript_sizes));
   x = constrain_domain (x, isl_set_copy (pbb->domain));
   return x;
 }
diff --git a/gcc/graphite-interchange.c b/gcc/graphite-interchange.c
index aee51a8..03c2c63 100644
--- a/gcc/graphite-interchange.c
+++ b/gcc/graphite-interchange.c
@@ -79,37 +79,40 @@ extern "C" {
 static isl_constraint *
 build_linearized_memory_access (isl_map *map, poly_dr_p pdr)
 {
-  isl_constraint *res;
   isl_local_space *ls = isl_local_space_from_space (isl_map_get_space (map));
-  unsigned offset, nsubs;
-  int i;
-  isl_ctx *ctx;
+  isl_constraint *res = isl_equality_alloc (ls);
+  isl_val *size = isl_val_int_from_ui (isl_map_get_ctx (map), 1);
 
-  isl_val *size, *subsize, *size1;
-
-  res = isl_equality_alloc (ls);
-  ctx = isl_local_space_get_ctx (ls);
-  size = isl_val_int_from_ui (ctx, 1);
-
-  nsubs = isl_set_dim (pdr->extent, isl_dim_set);
+  unsigned nsubs = isl_set_dim (pdr->subscript_sizes, isl_dim_set);
   /* -1 for the already included L dimension.  */
-  offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
+  unsigned offset = isl_map_dim (map, isl_dim_out) - 1 - nsubs;
   res = isl_constraint_set_coefficient_si (res, isl_dim_out, offset + nsubs, -1);
-  /* Go through all subscripts from last to first.  First dimension
+  /* Go through all subscripts from last to first.  The dimension "i=0"
  is the alias set, ignore it.  */
-  for (i = nsubs - 1; i >= 1; i--)
+  for (int i = nsubs - 1; i >= 1; i--)
 {
-  isl_space *dc;
-  isl_aff *aff;
-
-  size1 = isl_val_copy (size);
-  res = isl_constraint_set_coefficient_val (res, isl_dim_out, offset + i, size);
-  dc = isl_set_get_space (pdr->extent);
-  aff = isl_aff_zero_on_domain (isl_local_space_from_space (dc));
-  aff = isl_aff_set_coefficient_si (aff, isl_dim_in, i, 1);
-  subsize = isl_set_max_val (pdr->extent, aff);
-  isl_aff_free (aff);
-  size = isl_val_mul (size1, subsize);
+  isl_aff *extract_dim;
+  res = isl_constraint_set_coefficient_val (res, isl_dim_out, offset + i,
+   isl_val_copy (size));
+  isl_space *dc = isl_set_get_space (pdr->subscript_sizes);
+  extract_dim = isl_aff_zero_on_domain (isl_local_space_from_s

Re: *Ping* Re: [Patch, fortran] PR61831 side-effect deallocation of variable components

2015-07-16 Thread Steve Kargl
On Fri, Jul 10, 2015 at 06:35:30PM +0200, Mikael Morin wrote:
> Ping: https://gcc.gnu.org/ml/fortran/2015-06/msg00075.html
> 

Patch looks ok to me.

-- 
Steve


Re: [Bug fortran/52846] [F2008] Support submodules - part 2/3 - redux

2015-07-16 Thread Steve Kargl
See any dictionary for the definition of 'old'. :-)

The last time I used VMS was in 1990, and upgrading to
newer versions of software on the systems required
clearing a few hurdles.

On Thu, Jul 16, 2015 at 02:13:54PM -0700, Douglas B Rupp wrote:
> VMS hasn't had a 3 character filename extension limit since before 
> VAX/VMS version 4.0 was released in 1984.
> 
> On 07/16/2015 01:41 PM, Steve Kargl wrote:
> > On Thu, Jul 16, 2015 at 05:08:50PM +0200, Paul Richard Thomas wrote:
> >>
> >> Please find attached a new version of the patch that fixes the
> >> inconsistency with the standard, pointed out by Reinhold. It is weird
> >> but I read the appropriate part of the standard several times and
> >> simply did not pick up the critical information :-)
> >>
> >> Note that the delimiter used for submodule file names is '@', whereas
> >> the one used for internal identifiers is '.'.
> >>
> >> I have added a procedure to clean up submodules produced by the
> >> testsuite and used it in submodule_[1-8].f90. Submodule_8.f90
> >> tests the resolution of the spurious error found by Reinhold.
> >>
> >> Bootstraps and regtests on FC_21/x86_64 - OK for trunk?
> >>
> >
> > Patch looks ok to me.  One item I wonder if we need to care
> > about is old filesystems (FAT) or operating systems (VMS) with
> > a 3 character file extension limit.
> >

-- 
Steve


Re: [PR64164] drop copyrename, integrate into expand

2015-07-16 Thread Alexandre Oliva
On Jul 16, 2015, Richard Biener  wrote:

>> Is this ok to install?

> Yes.

So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
there are issues with split complex parms that caused go and fortran
libs to fail the build.

I will refrain from installing this for now, and I'll post a followup as
soon as I sort that out.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [Bug fortran/52846] [F2008] Support submodules - part 2/3 - redux

2015-07-16 Thread Douglas B Rupp
VMS hasn't had a 3 character filename extension limit since before 
VAX/VMS version 4.0 was released in 1984.


On 07/16/2015 01:41 PM, Steve Kargl wrote:

On Thu, Jul 16, 2015 at 05:08:50PM +0200, Paul Richard Thomas wrote:


Please find attached a new version of the patch that fixes the
inconsistency with the standard, pointed out by Reinhold. It is weird
but I read the appropriate part of the standard several times and
simply did not pick up the critical information :-)

Note that the delimiter used for submodule file names is '@', whereas
the one used for internal identifiers is '.'.

I have added a procedure to clean up submodules produced by the
testsuite and used it in submodule_[1-8].f90. Submodule_8.f90
tests the resolution of the spurious error found by Reinhold.

Bootstraps and regtests on FC_21/x86_64 - OK for trunk?



Patch looks ok to me.  One item I wonder if we need to care
about is old filesystems (FAT) or operating systems (VMS) with
a 3 character file extension limit.



Re: [Bug fortran/52846] [F2008] Support submodules - part 2/3 - redux

2015-07-16 Thread Steve Kargl
On Thu, Jul 16, 2015 at 05:08:50PM +0200, Paul Richard Thomas wrote:
> 
> Please find attached a new version of the patch that fixes the
> inconsistency with the standard, pointed out by Reinhold. It is weird
> but I read the appropriate part of the standard several times and
> simply did not pick up the critical information :-)
> 
> Note that the delimiter used for submodule file names is '@', whereas
> the one used for internal identifiers is '.'.
> 
> I have added a procedure to clean up submodules produced by the
> testsuite and used it in submodule_[1-8].f90. Submodule_8.f90
> tests the resolution of the spurious error found by Reinhold.
> 
> Bootstraps and regtests on FC_21/x86_64 - OK for trunk?
> 

Patch looks ok to me.  One item I wonder if we need to care
about is old filesystems (FAT) or operating systems (VMS) with
a 3 character file extension limit.

-- 
Steve


constify target offload data

2015-07-16 Thread Nathan Sidwell

Jakub, Ilya,
this patch against trunk constifies the offload target data.  I'm having 
difficulty building an intelmic toolchain, so the changes there aren't tested. 
Ilya, if you could check them, that'd be great.


nathan
2015-07-16  Nathan Sidwell  

	gcc/
	* config/nvptx/mkoffload.c (process): Constify target data.
	* config/i386/intelmic-mkoffload.c (generate_target_descr_file):
	Constify target data.
	(generate_target_offloadend_file): Likewise.

	libgomp/
	* target.c (struct offload_image_descr): Constify target_data.
	(gomp_offload_image_to_device): Likewise.
	(GOMP_offload_register): Likewise.
	(GOMP_offload_unregister): Likewise.

Index: libgomp/target.c
===
--- libgomp/target.c	(revision 225897)
+++ libgomp/target.c	(working copy)
@@ -58,7 +58,7 @@ static gomp_mutex_t register_lock;
 struct offload_image_descr {
   enum offload_target_type type;
   void *host_table;
-  void *target_data;
+  const void *target_data;
 };
 
 /* Array of descriptors of offload images.  */
@@ -642,7 +642,7 @@ gomp_update (struct gomp_device_descr *d
 
 static void
 gomp_offload_image_to_device (struct gomp_device_descr *devicep,
-			  void *host_table, void *target_data,
+			  void *host_table, const void *target_data,
 			  bool is_register_lock)
 {
   void **host_func_table = ((void ***) host_table)[0];
@@ -731,7 +731,7 @@ gomp_offload_image_to_device (struct gom
 
 void
 GOMP_offload_register (void *host_table, enum offload_target_type target_type,
-		   void *target_data)
+		   const void *target_data)
 {
   int i;
   gomp_mutex_lock (&register_lock);
@@ -765,7 +765,7 @@ GOMP_offload_register (void *host_table,
 
 void
 GOMP_offload_unregister (void *host_table, enum offload_target_type target_type,
-			 void *target_data)
+			 const void *target_data)
 {
   void **host_func_table = ((void ***) host_table)[0];
   void **host_funcs_end  = ((void ***) host_table)[1];
Index: libgomp/plugin/plugin-nvptx.c
===
--- libgomp/plugin/plugin-nvptx.c	(revision 225897)
+++ libgomp/plugin/plugin-nvptx.c	(working copy)
@@ -334,7 +334,7 @@ struct ptx_event
 
 struct ptx_image_data
 {
-  void *target_data;
+  const void *target_data;
   CUmodule module;
   struct ptx_image_data *next;
 };
@@ -1633,7 +1633,7 @@ typedef struct nvptx_tdata
 } nvptx_tdata_t;
 
 int
-GOMP_OFFLOAD_load_image (int ord, void *target_data,
+GOMP_OFFLOAD_load_image (int ord, const void *target_data,
 			 struct addr_pair **target_table)
 {
   CUmodule module;
@@ -1641,7 +1641,7 @@ GOMP_OFFLOAD_load_image (int ord, void *
   unsigned int fn_entries, var_entries, i, j;
   CUresult r;
   struct targ_fn_descriptor *targ_fns;
-  nvptx_tdata_t const *img_header = (nvptx_tdata_t const *) target_data;
+  const nvptx_tdata_t *img_header = (const nvptx_tdata_t *) target_data;
   struct ptx_image_data *new_image;
 
   GOMP_OFFLOAD_init_device (ord);
@@ -1704,9 +1704,10 @@ GOMP_OFFLOAD_load_image (int ord, void *
 }
 
 void
-GOMP_OFFLOAD_unload_image (int tid __attribute__((unused)), void *target_data)
+GOMP_OFFLOAD_unload_image (int tid __attribute__((unused)),
+			   const void *target_data)
 {
-  void **img_header = (void **) target_data;
+  const void *const *img_header = (const void *const *) target_data;
   struct targ_fn_descriptor *targ_fns
 = (struct targ_fn_descriptor *) img_header[0];
   struct ptx_image_data *image, *prev = NULL, *newhd = NULL;
Index: libgomp/plugin/plugin-host.c
===
--- libgomp/plugin/plugin-host.c	(revision 225897)
+++ libgomp/plugin/plugin-host.c	(working copy)
@@ -111,7 +111,7 @@ GOMP_OFFLOAD_fini_device (int n __attrib
 
 STATIC int
 GOMP_OFFLOAD_load_image (int n __attribute__ ((unused)),
-			 void *i __attribute__ ((unused)),
+			 const void *t __attribute__ ((unused)),
 			 struct addr_pair **r __attribute__ ((unused)))
 {
   return 0;
@@ -119,7 +119,7 @@ GOMP_OFFLOAD_load_image (int n __attribu
 
 STATIC void
 GOMP_OFFLOAD_unload_image (int n __attribute__ ((unused)),
-			   void *i __attribute__ ((unused)))
+			   const void *t __attribute__ ((unused)))
 {
 }
 
Index: libgomp/libgomp.h
===
--- libgomp/libgomp.h	(revision 225897)
+++ libgomp/libgomp.h	(working copy)
@@ -748,8 +748,8 @@ struct gomp_device_descr
   int (*get_num_devices_func) (void);
   void (*init_device_func) (int);
   void (*fini_device_func) (int);
-  int (*load_image_func) (int, void *, struct addr_pair **);
-  void (*unload_image_func) (int, void *);
+  int (*load_image_func) (int, const void *, struct addr_pair **);
+  void (*unload_image_func) (int, const void *);
   void *(*alloc_func) (int, size_t);
   void (*free_func) (int, void *);
   void *(*dev2host_func) (int, void *, const void *, size_t);
Index: liboffloadmic/plugin/libgomp-plugin-intelmic.cpp

[PATCH] Fix-up for PR fortran/66724 and fortran/66725

2015-07-16 Thread Steve Kargl
The attached patch is needed to fully address PR fortran/66724 and
fortran/66725.  In is_char_type(), we need to call gfc_resolve_expr
to properly resolve the tag, e.g., ASYNCHRONOUS="Y"//"E"//"S".
At the same time, I forgot to call is_char_type for ASYNCHRONOUS.
Regression tested on trunk.  OK to commit?

2015-07-16  Steven G. Kargl  

* io.c (is_char_type): Call gfc_resolve_expr().
(match_open_element, match_dt_element, match_inquire_element): Fix
ASYNCHRONOUS case.

-- 
Steve
Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c	(revision 225843)
+++ gcc/fortran/io.c	(working copy)
@@ -1260,6 +1260,8 @@ check_char_variable (gfc_expr *e)
 static bool
 is_char_type (const char *name, gfc_expr *e)
 {
+  gfc_resolve_expr (e);
+
   if (e->ts.type != BT_CHARACTER)
 {
   gfc_error ("%s requires a scalar-default-char-expr at %L",
@@ -1580,6 +1582,8 @@ match_open_element (gfc_open *open)
   match m;
 
   m = match_etag (&tag_e_async, &open->asynchronous);
+  if (m == MATCH_YES && !is_char_type ("ASYNCHRONOUS", open->asynchronous))
+return MATCH_ERROR;
   if (m != MATCH_NO)
 return m;
   m = match_etag (&tag_unit, &open->unit);
@@ -2752,6 +2756,8 @@ match_dt_element (io_kind k, gfc_dt *dt)
 }
 
   m = match_etag (&tag_e_async, &dt->asynchronous);
+  if (m == MATCH_YES && !is_char_type ("ASYNCHRONOUS", dt->asynchronous))
+return MATCH_ERROR;
   if (m != MATCH_NO)
 return m;
   m = match_etag (&tag_e_blank, &dt->blank);
@@ -3986,6 +3992,8 @@ match_inquire_element (gfc_inquire *inqu
   RETM m = match_vtag (&tag_write, &inquire->write);
   RETM m = match_vtag (&tag_readwrite, &inquire->readwrite);
   RETM m = match_vtag (&tag_s_async, &inquire->asynchronous);
+  if (m == MATCH_YES && !is_char_type ("ASYNCHRONOUS", inquire->asynchronous))
+return MATCH_ERROR;
   RETM m = match_vtag (&tag_s_delim, &inquire->delim);
   RETM m = match_vtag (&tag_s_decimal, &inquire->decimal);
   RETM m = match_out_tag (&tag_size, &inquire->size);


[PATCH, committed] jit: Add guide for submitting patches to jit docs

2015-07-16 Thread David Malcolm
Committed to trunk as r225905.

gcc/jit/ChangeLog:
* docs/internals/index.rst (Overview of code structure): Add note
that the implementation is in C++, despite the .c extension.
(Submitting patches): New subsection.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
---
 gcc/jit/docs/internals/index.rst | 98 
 1 file changed, 98 insertions(+)

diff --git a/gcc/jit/docs/internals/index.rst b/gcc/jit/docs/internals/index.rst
index d0852f9..6f28762 100644
--- a/gcc/jit/docs/internals/index.rst
+++ b/gcc/jit/docs/internals/index.rst
@@ -287,6 +287,9 @@ For example:
 Overview of code structure
 --
 
+The library is implemented in C++.  The source files have the ``.c``
+extension for legacy reasons.
+
 * ``libgccjit.c`` implements the API entrypoints.  It performs error
   checking, then calls into classes of the gcc::jit::recording namespace
   within ``jit-recording.c`` and ``jit-recording.h``.
@@ -335,3 +338,98 @@ should be rejected via additional checking.  The checking ideally should
 be within the libgccjit API entrypoints in libgccjit.c, since this is as
 close as possible to the error; failing that, a good place is within
 ``recording::context::validate ()`` in jit-recording.c.
+
+Submitting patches
+--
+Please read the contribution guidelines for gcc at
+https://gcc.gnu.org/contribute.html.
+
+Patches for the jit should be sent to both the
+gcc-patches@gcc.gnu.org and j...@gcc.gnu.org mailing lists,
+with "jit" and "PATCH" in the Subject line.
+
+You don't need to do a full bootstrap for code that just touches the
+``jit`` and ``testsuite/jit.dg`` subdirectories.  However, please run
+``make check-jit`` before submitting the patch, and mention the results
+in your email (along with the host triple that the tests were run on).
+
+A good patch should contain the information listed in the
+gcc contribution guide linked to above; for a ``jit`` patch, the patch
+shold contain:
+
+  * the code itself (for example, a new API entrypoint will typically
+touch ``libgccjit.h`` and ``.c``, along with support code in
+``jit-recording.[ch]`` and ``jit-playback.[ch]`` as appropriate)
+
+  * test coverage
+
+  * documentation for the C API
+
+  * documentation for the C++ API
+
+A patch that adds new API entrypoints should also contain:
+
+  * a feature macro in ``libgccjit.h`` so that client code that doesn't
+use a "configure" mechanism can still easily detect the presence of
+the entrypoint.  See e.g. ``LIBGCCJIT_HAVE_SWITCH_STATEMENTS`` (for
+a category of entrypoints) and
+``LIBGCCJIT_HAVE_gcc_jit_context_set_bool_allow_unreachable_blocks``
+(for an individual entrypoint).
+
+  * a new ABI tag containing the new symbols (in ``libgccjit.map``), so
+that we can detect client code that uses them
+
+  * Support for :c:func:`gcc_jit_context_dump_reproducer_to_file`.  Most
+jit testcases attempt to dump their contexts to a .c file; ``jit.exp``
+then sanity-checks the generated c by compiling them (though
+not running them).   A new API entrypoint
+needs to "know" how to write itself back out to C (by implementing
+``gcc::jit::recording::memento::write_reproducer`` for the appropriate
+``memento`` subclass).
+
+  * C++ bindings for the new entrypoints (see ``libgccjit++.h``); ideally
+with test coverage, though the C++ API test coverage is admittedly
+spotty at the moment
+
+  * documentation for the new C entrypoints
+
+  * documentation for the new C++ entrypoints
+
+  * documentation for the new ABI tag (see ``topics/compatibility.rst``).
+
+Depending on the patch you can either extend an existing test case, or
+add a new test case.  If you add an entirely new testcase: ``jit.exp``
+expects jit testcases to begin with ``test-``, or ``test-error-`` (for a
+testcase that generates an error on a :c:type:`gcc_jit_context`).
+
+Every new testcase that doesn't generate errors should also touch
+``gcc/testsuite/jit.dg/all-non-failing-tests.h``:
+
+  * Testcases that don't generate errors should ideally be added to the
+``testcases`` array in that file; this means that, in addition
+to being run standalone, they also get run within
+``test-combination.c`` (which runs all successful tests inside one
+big :c:type:`gcc_jit_context`), and ``test-threads.c`` (which runs all
+successful tests in one process, each one running in a different
+thread on a different :c:type:`gcc_jit_context`).
+
+.. note::
+
+   Given that exported functions within a :c:type:`gcc_jit_context`
+   must have unique names, and most testcases are run within
+   ``test-combination.c``, this means that every jit-compiled test
+   function typically needs a name that's unique across the entire
+   test suite.
+
+  * Testcases that aren't to be added to the ``testcases`` array should
+instead add a comment to the file clarifying why the

Re: [PATCH][4/n] Remove GENERIC stmt combining from SCCVN

2015-07-16 Thread Andrew MacLeod

On 07/16/2015 07:54 AM, Andrew MacLeod wrote:
> On 07/16/2015 03:27 AM, Richard Biener wrote:
>> On Wed, 15 Jul 2015, Andrew MacLeod wrote:
>>> admittedly neither situation is very common I suspect, but it does seem
>>> like a hidden gotchya waiting to happen.
>>
>> I guess we either want to checking-assert that we never hit that
>> special marker or handle it appropriately.  Or even better avoid
>> it in the first place (not sure why we have it - I suppose to allow
>> modifying immediate uses of the current stmt from inside
>> FOR_EACH_IMM_USE_STMT).
>>
>> For me single_imm_use_1 crashed on the NULL USE_STMT at
>>
>>  if (!is_gimple_debug (USE_STMT (ptr)))
>>
>> so I presume all was fine until debug stmts were introduced
>> (well, fine as in not crashing, not as in giving correct answers).
>
> yes, It was probably still wrong, we just erred reporting that a real
> single_use statement wasn't.
>
> The marker is unique in that the STMT field is NULL, which can't
> happen otherwise.
>
> I'll think about how to efficiently get this right
>
> Andrew

We need to keep the marker because we need it in order to update a list
that is being iterated over.

The affected routines are has_zero_uses, has_single_use, single_imm_use,
and num_imm_uses.  They can all be impacted by the presence of the marker.

To fix the issue, we need to check USE_STMT for NULL before checking
is_gimple_debug () in all cases.

It seemed to me that the shortcutting I was doing to check the simple
cases first may not be saving us much: the executed code path ought to be
pretty similar to a simple loop, with a similar number of compares,
instructions and jumps.  So I implemented has_zero_uses () and
has_single_use () directly with a loop, and compared the generated code
at -O.

Sure enough, the code path was pretty close.  In fact, the routines ought
to be generally faster: without the shortcut tests, the cases with 2 or
more entries no longer require a function call, and they get a 2-entry
head start into the list, since those first 2 iterations were basically
duplicated by the shortcut code.

num_imm_uses is slightly slower, but it's only called from one or two
places, and it's not that much slower.

Finally, given the extra work that single_imm_use does, it seemed
prudent to mostly leave it alone and just add the check.

Want to give this a try and make sure it resolves the issue?

It bootstraps on x86_64-unknown-linux-gnu and shows no new regressions.
OK for trunk?

Andrew

PS, assembly at -O2 for:

// New version
bool tryit1(tree t)
{
  return has_single_use (t);   // new implementation
}
<..>
    movq    48(%rdi), %rdx
    leaq    40(%rdi), %rsi
    xorl    %eax, %eax
    cmpq    %rdx, %rsi
    je      .L4154
    .p2align 4,,10
    .p2align 3
.L4152:
    movq    16(%rdx), %rcx
    testq   %rcx, %rcx
    je      .L4150
    cmpb    $2, (%rcx)
    je      .L4150
    testb   %al, %al
    jne     .L4155
    movl    $1, %eax
.L4150:
    movq    8(%rdx), %rdx
    cmpq    %rdx, %rsi
    jne     .L4152
    rep ret

// Old version
bool tryit2(tree t)
{
  return has_single_use2 (t);  // old implementation
}

<...>
    movq    %rdi, %rax
    movq    48(%rax), %rax
    leaq    40(%rdi), %rdi
    cmpq    %rax, %rdi
    je      .L4168
    cmpq    8(%rax), %rdi
    je      .L4171
    xorl    %edx, %edx
    xorl    %esi, %esi
    jmp     _Z16single_imm_use_1PK17ssa_use_operand_tPPS_PP21gimple_statement_base@PLT
    .p2align 4,,10
    .p2align 3
.L4171:
    movq    16(%rax), %rax
    testq   %rax, %rax
    je      .L4168
    cmpb    $2, (%rax)
    setne   %al
    ret
    .p2align 4,,10
    .p2align 3
.L4168:
    xorl    %eax, %eax
    ret



	* ssa-iterators.h (has_zero_uses, has_single_use): Implement as
	straight loops.
	(single_imm_use): Check for iterator node.
	(num_imm_uses): Likewise.
	* tree-ssa-operands.c (has_zero_uses_1): Delete.
	(single_imm_use_1): Check for iterator node.
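
For readers without the tree at hand, the straight-loop approach described in
the message above can be sketched roughly as follows.  This is not the code
from the patch: the types and function names ending in _sketch are
hypothetical stand-ins for ssa_use_operand_t, the statement type,
has_zero_uses and has_single_use.  The only behaviour assumed from the
discussion is that the iterator marker is a list node whose statement is
NULL, and that debug statements do not count as uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement; only "is it a debug stmt"
   matters for this sketch.  */
struct stmt_sketch
{
  bool is_debug;
};

/* Hypothetical stand-in for a node of the circular immediate-use list.
   STMT is NULL only for the list head and for the iterator marker.  */
struct use_node_sketch
{
  struct use_node_sketch *prev;
  struct use_node_sketch *next;
  struct stmt_sketch *stmt;
};

/* Make HEAD an empty circular list.  */
static void
use_list_init (struct use_node_sketch *head)
{
  head->prev = head->next = head;
  head->stmt = NULL;
}

/* Link NODE, referring to STMT, at the tail of HEAD's list.  */
static void
use_list_append (struct use_node_sketch *head, struct use_node_sketch *node,
                 struct stmt_sketch *stmt)
{
  node->stmt = stmt;
  node->prev = head->prev;
  node->next = head;
  head->prev->next = node;
  head->prev = node;
}

/* A node is a real use only if it has a statement (so it is not the
   marker) and that statement is not a debug statement.  The NULL check
   comes first, as the message above requires.  */
static bool
is_real_use (const struct use_node_sketch *p)
{
  return p->stmt != NULL && !p->stmt->is_debug;
}

/* Return true if the list contains no real uses.  */
static bool
has_zero_uses_sketch (const struct use_node_sketch *head)
{
  for (const struct use_node_sketch *p = head->next; p != head; p = p->next)
    if (is_real_use (p))
      return false;
  return true;
}

/* Return true if the list contains exactly one real use.  */
static bool
has_single_use_sketch (const struct use_node_sketch *head)
{
  bool single = false;
  for (const struct use_node_sketch *p = head->next; p != head; p = p->next)
    if (is_real_use (p))
      {
        if (single)
          return false;   /* A second real use was found.  */
        single = true;
      }
  return single;
}
```

Because the marker and debug statements are skipped inside the loop, no
separate shortcut path or out-of-line helper is needed for these two
predicates.
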

Index: ssa-iterators.h
===
*** ssa-iterators.h	(revision 225871)
--- ssa-iterators.h	(working copy)
*** struct imm_use_iterator
*** 114,120 
  
  
  
- extern bool has_zero_uses_1 (const ssa_use_operand_t *head);
  extern bool single_imm_use_1 (const ssa_use_operand_t *head,
  			  use_operand_p *use_p, gimple *stmt);
  
--- 114,119 
*** next_readonly_imm_use (imm_use_iterator
*** 379,420 
  static inline bool
  has_zero_uses (const_tree var)
  {
!   const ssa_use_operand_t *const ptr = &(SSA_NAME_IMM_USE_NODE (var));
! 
!   /* A single use_operand means there is no items in the list.  */
!   if (ptr == ptr->next)
! return true;
  
!   /* If there are debug stmts, we have to look at each use and see
!  whether there are any nondebug uses.  */
!   if (!MAY_HAVE_DEBUG_STMTS)
! return false;
  
!   return ha

Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov


On 16/07/15 19:28, Segher Boessenkool wrote:
> On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:
>>> If you always want to simplify first, does it work to move this whole big
>>> block behind the simplify just following it?  Or do you want to simplify
>>> after the transform as well?
>>
>> You mean move this hunk outside the "if (BINARY_P (x)...)" block it's in?
>> I think it would work, but I'm not sure if it would affect other cases.
>> I was also conscious that simplify_rtx might not be a cheap function to call
>> so frequently (or is it? I didn't profile it), so I tried to avoid calling
>> it unless I need for the transformation in question here.
>
> I mean move the whole "if (BINARY_P ..." block to after the existing
> simplify calls, to just before the "First see if we can apply" comment,
> and not do a new simplify_rtx call at all.  Does that work?

Yes, it does the transformation I want :) if it's combined (pardon the pun)
with the simplify-rtx.c patch at
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01433.html

> Which brings the question why it wasn't there in the first place, hrm.

Dunno, I'll test this approach more thoroughly tomorrow, check the impact on
some codebases and propose a patch if it all works out.

Thanks for the help,
Kyrill

> Segher

Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Segher Boessenkool
On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:
> >If you always want to simplify first, does it work to move this whole big
> >block behind the simplify just following it?  Or do you want to simplify
> >after the transform as well?
> 
> You mean move this hunk outside the "if (BINARY_P (x)...)" block it's in?
> I think it would work, but I'm not sure if it would affect other cases.
> I was also conscious that simplify_rtx might not be a cheap function to call
> so frequently (or is it? I didn't profile it), so I tried to avoid calling
> it unless I need for the transformation in question here.

I mean move the whole "if (BINARY_P ..." block to after the existing
simplify calls, to just before the "First see if we can apply" comment,
and not do a new simplify_rtx call at all.  Does that work?

Which brings the question why it wasn't there in the first place, hrm.


Segher


Re: [PATCH] libgcc: fix build with older make

2015-07-16 Thread Ian Lance Taylor
"Jan Beulich"  writes:

> 2015-07-16  Jan Beulich  
>
>   * config/t-softfp: Split up "else ifneq".

This is OK.

Thanks.

Ian


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov


On 16/07/15 19:13, Segher Boessenkool wrote:
> On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This is an attempt to solve the problem in the thread starting at
>> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
>> in a generic way after some pointers from Segher and Andrew.
>>
>> The problem I got was that combine_simplify_rtx was trying to
>> do some special handling of unary operations applied to if_then_else
>> but ended up exiting early due to:
>>
>>    enum rtx_code cond_code = simplify_comparison (NE, &cond, &cop1);
>>
>>    if (cond_code == NE && COMPARISON_P (cond))
>>  return x;
>>
>> I tried removing that, but that led to regressions in SPEC2006.
>> The solution that worked for me led to two patches.
>>
>> In this first patch we add a simplification step to the rtx before trying
>> any substitutions.
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 574f874..40d2231 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
>>   {
>> rtx cond, true_rtx, false_rtx;
>>   
>> +  /* If some simplification is possible from the start, try it now.  */
>> +  temp = simplify_rtx (x);
>> +
>> +  if (temp)
>> +   {
>> + x = temp;
>> + code = GET_CODE (x);
>> + mode = GET_MODE (x);
>> + op0_mode = VOIDmode;
>> +   }
>> +
>> cond = if_then_else_cond (x, &true_rtx, &false_rtx);
>> if (cond != 0
>>   /* If everything is a comparison, what we have is highly unlikely
>
> If you always want to simplify first, does it work to move this whole big
> block behind the simplify just following it?  Or do you want to simplify
> after the transform as well?

You mean move this hunk outside the "if (BINARY_P (x)...)" block it's in?
I think it would work, but I'm not sure if it would affect other cases.
I was also conscious that simplify_rtx might not be a cheap function to call
so frequently (or is it? I didn't profile it), so I tried to avoid calling
it unless I need it for the transformation in question here.

Kyrill

> Segher

Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Segher Boessenkool
On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> This is an attempt to solve the problem in the thread starting at
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
> in a generic way after some pointers from Segher and Andrew.
> 
> The problem I got was that combine_simplify_rtx was trying to
> do some special handling of unary operations applied to if_then_else
> but ended up exiting early due to:
> 
>   enum rtx_code cond_code = simplify_comparison (NE, &cond, &cop1);
> 
>   if (cond_code == NE && COMPARISON_P (cond))
> return x;
> 
> I tried removing that, but that led to regressions in SPEC2006.
> The solution that worked for me led to two patches.
> 
> In this first patch we add a simplification step to the rtx before trying 
> any substitutions.

> diff --git a/gcc/combine.c b/gcc/combine.c
> index 574f874..40d2231 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
>  {
>rtx cond, true_rtx, false_rtx;
>  
> +  /* If some simplification is possible from the start, try it now.  */
> +  temp = simplify_rtx (x);
> +
> +  if (temp)
> + {
> +   x = temp;
> +   code = GET_CODE (x);
> +   mode = GET_MODE (x);
> +   op0_mode = VOIDmode;
> + }
> +
>cond = if_then_else_cond (x, &true_rtx, &false_rtx);
>if (cond != 0
> /* If everything is a comparison, what we have is highly unlikely

If you always want to simplify first, does it work to move this whole big
block behind the simplify just following it?  Or do you want to simplify
after the transform as well?


Segher


Re: [GOMP] a struct for offload target data

2015-07-16 Thread Nathan Sidwell

On 07/13/15 10:33, Bernd Schmidt wrote:
> On 07/13/2015 03:57 PM, Nathan Sidwell wrote:
>> this patch changes the offload target data type from an array of void *,
>> to a struct, which is somewhat easier to deal with than remembering
>> numeric indices and type casts.
>>
>> This is step 1 in reworking the launch API.
>
> Looks fine.


This is the version applied to trunk.

nathan

2015-07-16  Nathan Sidwell  

	libgomp/
	* plugin/plugin-nvptx.c (link_ptx): Constify string argument.
	Workaround driver library const error.
	(struct nvptx_tdata, nvptx_tdata_t): New.
	(GOMP_OFFLOAD_load_image): Use struct for target_data's real
	type.

	gcc/
	* config/nvptx/mkoffload.c (process): Constify mapping variables.
	Define target data struct and initialize it.
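
As a rough illustration of the change described above, the sketch below
contrasts nothing more than the struct-based access with what indexing an
array of void * used to require.  The struct layout is copied from the patch
that follows; the sample mapping tables, the PTX placeholder string and the
two helper functions are hypothetical and exist only to make the sketch
self-contained.

```c
#include <stddef.h>

/* Layout as introduced by the patch: data emitted by mkoffload.  */
typedef struct nvptx_tdata
{
  const char *ptx_src;

  const char *const *var_names;
  size_t var_num;

  const char *const *fn_names;
  size_t fn_num;
} nvptx_tdata_t;

/* Hypothetical tables of the kind mkoffload might emit.  */
static const char *const sketch_var_mappings[] = { "var_a" };
static const char *const sketch_func_mappings[] = { "kernel_a", "kernel_b" };

/* Hypothetical target data instance; the PTX text is a placeholder.  */
static const nvptx_tdata_t sketch_target_data = {
  "// placeholder PTX text",
  sketch_var_mappings,
  sizeof (sketch_var_mappings) / sizeof (sketch_var_mappings[0]),
  sketch_func_mappings,
  sizeof (sketch_func_mappings) / sizeof (sketch_func_mappings[0])
};

/* Hypothetical helper: total number of table entries an image describes.
   The plugin only sees an opaque pointer, so one cast recovers the real
   type and the fields are then accessed by name.  */
static size_t
sketch_count_entries (const void *target_data)
{
  const nvptx_tdata_t *img_header = (const nvptx_tdata_t *) target_data;
  return img_header->fn_num + img_header->var_num;
}

/* Hypothetical helper: name of kernel I, or NULL if I is out of range.  */
static const char *
sketch_fn_name (const void *target_data, size_t i)
{
  const nvptx_tdata_t *img_header = (const nvptx_tdata_t *) target_data;
  return i < img_header->fn_num ? img_header->fn_names[i] : NULL;
}
```

With the struct, there is a single cast at the entry point and no numeric
indices such as img_header[3] to remember.
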

Index: libgomp/plugin/plugin-nvptx.c
===
--- libgomp/plugin/plugin-nvptx.c	(revision 225885)
+++ libgomp/plugin/plugin-nvptx.c	(working copy)
@@ -804,7 +804,7 @@ nvptx_get_num_devices (void)
 
 
 static void
-link_ptx (CUmodule *module, char *ptx_code)
+link_ptx (CUmodule *module, const char *ptx_code)
 {
   CUjit_option opts[7];
   void *optvals[7];
@@ -874,7 +874,8 @@ link_ptx (CUmodule *module, char *ptx_co
 			 cuda_error (r));
 }
 
-  r = cuLinkAddData (linkstate, CU_JIT_INPUT_PTX, ptx_code,
+  /* cuLinkAddData's 'data' argument erroneously omits the const qualifier.  */
+  r = cuLinkAddData (linkstate, CU_JIT_INPUT_PTX, (char *)ptx_code,
   strlen (ptx_code) + 1, 0, 0, 0, 0);
   if (r != CUDA_SUCCESS)
 {
@@ -1618,23 +1619,36 @@ GOMP_OFFLOAD_fini_device (int n)
   pthread_mutex_unlock (&ptx_dev_lock);
 }
 
+/* Data emitted by mkoffload.  */
+
+typedef struct nvptx_tdata
+{
+  const char *ptx_src;
+
+  const char *const *var_names;
+  size_t var_num;
+
+  const char *const *fn_names;
+  size_t fn_num;
+} nvptx_tdata_t;
+
 int
 GOMP_OFFLOAD_load_image (int ord, void *target_data,
 			 struct addr_pair **target_table)
 {
   CUmodule module;
-  char **fn_names, **var_names;
+  const char *const *fn_names, *const *var_names;
   unsigned int fn_entries, var_entries, i, j;
   CUresult r;
   struct targ_fn_descriptor *targ_fns;
-  void **img_header = (void **) target_data;
+  nvptx_tdata_t const *img_header = (nvptx_tdata_t const *) target_data;
   struct ptx_image_data *new_image;
 
   GOMP_OFFLOAD_init_device (ord);
 
   nvptx_attach_host_thread_to_device (ord);
 
-  link_ptx (&module, img_header[0]);
+  link_ptx (&module, img_header->ptx_src);
 
   pthread_mutex_lock (&ptx_image_lock);
   new_image = GOMP_PLUGIN_malloc (sizeof (struct ptx_image_data));
@@ -1644,22 +1658,14 @@ GOMP_OFFLOAD_load_image (int ord, void *
   ptx_images = new_image;
   pthread_mutex_unlock (&ptx_image_lock);
 
-  /* The mkoffload utility emits a table of pointers/integers at the start of
- each offload image:
-
- img_header[0] -> ptx code
- img_header[1] -> number of variables
- img_header[2] -> array of variable names (pointers to strings)
- img_header[3] -> number of kernels
- img_header[4] -> array of kernel names (pointers to strings)
-
- The array of kernel names and the functions addresses form a
- one-to-one correspondence.  */
-
-  var_entries = (uintptr_t) img_header[1];
-  var_names = (char **) img_header[2];
-  fn_entries = (uintptr_t) img_header[3];
-  fn_names = (char **) img_header[4];
+  /* The mkoffload utility emits a struct of pointers/integers at the
+ start of each offload image.  The array of kernel names and the
+ functions addresses form a one-to-one correspondence.  */
+
+  var_entries = img_header->var_num;
+  var_names = img_header->var_names;
+  fn_entries = img_header->fn_num;
+  fn_names = img_header->fn_names;
 
   *target_table = GOMP_PLUGIN_malloc (sizeof (struct addr_pair)
   * (fn_entries + var_entries));
Index: gcc/config/nvptx/mkoffload.c
===
--- gcc/config/nvptx/mkoffload.c	(revision 225885)
+++ gcc/config/nvptx/mkoffload.c	(working copy)
@@ -842,7 +842,6 @@ process (FILE *in, FILE *out)
 {
   const char *input = read_file (in);
   Token *tok = tokenize (input);
-  unsigned int nvars = 0, nfuncs = 0;
 
   do
 tok = parse_file (tok);
@@ -853,19 +852,30 @@ process (FILE *in, FILE *out)
   write_stmts (out, rev_stmts (vars));
   write_stmts (out, rev_stmts (fns));
   fprintf (out, ";\n\n");
-  fprintf (out, "static const char *var_mappings[] = {\n");
-  for (id_map *id = var_ids; id; id = id->next, nvars++)
+
+  fprintf (out, "static const char *const var_mappings[] = {\n");
+  for (id_map *id = var_ids; id; id = id->next)
 fprintf (out, "\t\"%s\"%s\n", id->ptx_name, id->next ? "," : "");
   fprintf (out, "};\n\n");
-  fprintf (out, "static const char *func_mappings[] = {\n");
-  for (id_map *id = func_ids; id; id = id->next, nfuncs++)
+  fprintf (out, "static const char *const func_mappings[] = {\n");
+  for (id_map *id = func_ids; id; id = id->next)
 fprintf

Re: [gomp4] Remove device-specific filtering during parsing for OpenACC

2015-07-16 Thread Nathan Sidwell

On 07/16/15 11:32, Julian Brown wrote:
> Hi,
>
> This patch removes the device-specific filtering (for NVidia PTX) from
> the parsing stages of the host compiler (for the device_type clause --
> separately for C, C++ and Fortran) in favour of fully parsing the
> device_type clauses, but not actually implementing anything for them
> (device_type support is a feature that we're not planning to implement
> just yet: the existing "support" is something of a red herring).
>
> With this patch, the parsed device_type clauses will be ready at OMP
> lowering time whenever we choose to do something with them (e.g.
> transforming them into a representation that can be streamed out and
> re-read by the appropriate offload compiler). The representation is
> more-or-less the same for all supported languages, modulo
> clause ordering.
>
> I've altered the dtype-*.* tests to account for the new behaviour (and
> to not use e.g. mixed-case "nVidia" or "acc_device_nvidia" names, which
> are contrary to the recommendations in the spec).
>
> OK to apply, or any comments?


thanks!


--
Nathan Sidwell


[gomp] Fix PTX worker spill/fill

2015-07-16 Thread Nathan Sidwell
I've committed this patch to fix a bug in the worker spill/fill code.  We ended 
up not incrementing the pointer, resulting in the stack frame being filled with 
the same value.


Thanks to Jim for finding the failure.

nathan
2015-07-16  Nathan Sidwell  

	* config/nvptx/nvptx.c (nvptx_gen_wcast): Fix typo accessing reg's
	mode for pointer increment.

Index: config/nvptx/nvptx.c
===
--- config/nvptx/nvptx.c	(revision 225831)
+++ config/nvptx/nvptx.c	(working copy)
@@ -1257,7 +1257,7 @@ nvptx_gen_wcast (rtx reg, propagate_mask
 	
 	emit_insn (res);
 	emit_insn (gen_adddi3 (data->ptr, data->ptr,
-   GEN_INT (GET_MODE_SIZE (GET_MODE (res)))));
+   GEN_INT (GET_MODE_SIZE (GET_MODE (reg)))));
 	res = get_insns ();
 	end_sequence ();
 	  }


[gomp4.1] Handle #pragma omp target {simd,parallel{, for{, simd}}

2015-07-16 Thread Jakub Jelinek
Hi!

This patch adds support for 4 new combined constructs and fixes various
issues in the clause splitting code and other issues I found on the
new testcases.

2015-07-16  Jakub Jelinek  

gcc/
* omp-low.c (expand_omp_build_assign): Add prototype.  Add AFTER
argument, if true emit statements after *GSI_P and continue linking.
(expand_parallel_call): Use expand_omp_build_assign.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Handle new 4 combined constructs.
Handle OMP_CLAUSE_DEFAULTMAP.  Document 2 missing combined constructs.
Fix up OMP_CLAUSE_FIRSTPRIVATE handling on #pragma omp distribute simd.
Fix up shared/default handling.  Add ENABLE_CHECKING verification.
gcc/c/
* c-parser.c (c_parser_omp_parallel): Allow parsing
#pragma omp target parallel.
(c_parser_omp_target): Allow parsing #pragma omp target simd
and #pragma omp target parallel{, for{, simd}}.
gcc/cp/
* parser.c (cp_parser_omp_clause_priority): Fix typo.
(cp_parser_omp_parallel): Allow parsing
#pragma omp target parallel.
(cp_parser_omp_target): Allow parsing #pragma omp target simd
and #pragma omp target parallel{, for{, simd}}.
gcc/testsuite/
* c-c++-common/gomp/clauses-1.c: New test.
libgomp/
* testsuite/libgomp.c/for-2.h (OMPTGT, OMPTO, OMPFROM): Define
if not already defined.
(N(f0), N(f1), N(f2), N(f3), N(f4), N(f5), N(f6), N(f7), N(f8),
N(f9), N(f10), N(f11), N(f12), N(f13), N(f14)): Use OMPTGT macro.
(N(test)): Use OMPTO and OMPFROM macros.
* testsuite/libgomp.c/for-5.c: New test.
* testsuite/libgomp.c/for-6.c: New test.
* testsuite/libgomp.c++/for-13.C: New test.
* testsuite/libgomp.c++/for-14.C: New test.

--- gcc/omp-low.c.jj   2015-07-15 13:00:32.0 +0200
+++ gcc/omp-low.c   2015-07-16 16:28:36.429055322 +0200
@@ -5637,6 +5637,8 @@ gimple_build_cond_empty (tree cond)
   return gimple_build_cond (pred_code, lhs, rhs, NULL_TREE, NULL_TREE);
 }
 
+static void expand_omp_build_assign (gimple_stmt_iterator *, tree, tree,
+bool = false);
 
 /* Build the function calls to GOMP_parallel_start etc to actually
generate the parallel operation.  REGION is the parallel region
@@ -5754,13 +5756,12 @@ expand_parallel_call (struct omp_region
  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
 
  gsi = gsi_start_bb (then_bb);
- stmt = gimple_build_assign (tmp_then, val);
- gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+ expand_omp_build_assign (&gsi, tmp_then, val, true);
 
  gsi = gsi_start_bb (else_bb);
- stmt = gimple_build_assign
-  (tmp_else, build_int_cst (unsigned_type_node, 1));
- gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+ expand_omp_build_assign (&gsi, tmp_else,
+  build_int_cst (unsigned_type_node, 1),
+  true);
 
  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
  make_edge (cond_bb, else_bb, EDGE_FALSE_VALUE);
@@ -6221,16 +6222,21 @@ expand_omp_regimplify_p (tree *tp, int *
   return NULL_TREE;
 }
 
-/* Prepend TO = FROM assignment before *GSI_P.  */
+/* Prepend or append TO = FROM assignment before or after *GSI_P.  */
 
 static void
-expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
+expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from,
+bool after)
 {
   bool simple_p = DECL_P (to) && TREE_ADDRESSABLE (to);
   from = force_gimple_operand_gsi (gsi_p, from, simple_p, NULL_TREE,
-  true, GSI_SAME_STMT);
+  !after, after ? GSI_CONTINUE_LINKING
+: GSI_SAME_STMT);
   gimple stmt = gimple_build_assign (to, from);
-  gsi_insert_before (gsi_p, stmt, GSI_SAME_STMT);
+  if (after)
+gsi_insert_after (gsi_p, stmt, GSI_CONTINUE_LINKING);
+  else
+gsi_insert_before (gsi_p, stmt, GSI_SAME_STMT);
   if (walk_tree (&from, expand_omp_regimplify_p, NULL, NULL)
   || walk_tree (&to, expand_omp_regimplify_p, NULL, NULL))
 {
--- gcc/c-family/c-omp.c.jj 2015-07-15 13:02:31.0 +0200
+++ gcc/c-family/c-omp.c    2015-07-16 15:11:12.725828803 +0200
@@ -684,28 +684,34 @@ c_finish_omp_for (location_t locus, enum
 }
 }
 
-/* Right now we have 15 different combined/composite constructs, this
+/* Right now we have 21 different combined/composite constructs, this
function attempts to split or duplicate clauses for combined
constructs.  CODE is the innermost construct in the combined construct,
and MASK allows to determine which constructs are combined together,
as every construct has at least one clause that no other construct
has (except for OMP_SECTIONS, but that can be only combined with par

Re: [PATCH][AArch64][4/14] Create TARGET_FIX_ERR_A53_835769 and use that instead of aarch64_fix_a53_err835769

2015-07-16 Thread Kyrill Tkachov

Sorry, had sent out the wrong version.
This is the right patch.

Thanks,
Kyrill

On 16/07/15 16:20, Kyrill Tkachov wrote:
> Hi all,
>
> This patch transforms the Cortex-A53 erratum 835769 workaround checks into a
> macro.
> This way we don't have to override aarch64_fix_a53_err835769 in the default case
> and this allows us to keep track of when the user doesn't specify this option,
> which may come in handy later on when we decide the inlining rules.
>
> This patch also makes TARGET_FIX_ERR_A53_835769_DEFAULT unconditionally defined
> to 0 or 1, so that we don't have to check it in #ifdefs.
>
> Bootstrapped and tested as part of series on aarch64.
> Checked that the workaround is applied as previously.
>
> Ok for trunk?
>
> 2015-07-16  Kyrylo Tkachov  
>
>   * config/aarch64/aarch64.h (TARGET_FIX_ERR_A53_835769_DEFAULT): Always
>   define to 0 or 1.
>   (TARGET_FIX_ERR_A53_835769): New macro.
>   * config/aarch64/aarch64.c (aarch64_override_options_internal): Remove
>   handling of opts->x_aarch64_fix_a53_err835769.
>   (aarch64_madd_needs_nop): Check for TARGET_FIX_ERR_A53_835769 rather
>   than aarch64_fix_a53_err835769.
>   * config/aarch64/aarch64-elf-raw.h: Update for above changes.
>   * config/aarch64/aarch64-linux.h: Likewise.
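
The arrangement described in the message above can be sketched in isolation
as follows.  This is an illustration only, not the macro from the patch: all
SKETCH_ and sketch_ names are hypothetical, and the convention that the value
2 means "option not specified by the user" is inferred from the code the
patch removes from aarch64_override_options_internal.

```c
/* Hypothetical build-time default.  It is always defined to 0 or 1, so
   it can be tested with #if rather than #ifdef.  */
#ifndef SKETCH_FIX_ERR_DEFAULT
#define SKETCH_FIX_ERR_DEFAULT 0
#endif

/* Hypothetical option variable: 0 = disabled, 1 = enabled,
   2 = not specified by the user.  It is never overwritten with the
   default, so "unspecified" stays observable later on.  */
static int sketch_fix_err_option = 2;

/* Resolve the effective setting at the point of use.  */
#define SKETCH_FIX_ERR \
  ((sketch_fix_err_option == 2) \
   ? SKETCH_FIX_ERR_DEFAULT : sketch_fix_err_option)

/* Because the default is always 0 or 1, a plain #if selects the spec
   string (the string contents here are placeholders).  */
#if SKETCH_FIX_ERR_DEFAULT
#define SKETCH_LINKER_SPEC "--sketch-fix-enabled-by-default"
#else
#define SKETCH_LINKER_SPEC ""
#endif

/* Hypothetical query used by the rest of the code.  */
static int
sketch_workaround_enabled (void)
{
  return SKETCH_FIX_ERR;
}
```

Resolving the default inside the macro rather than by assigning to the option
variable is what keeps the "user did not specify" state available.
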


commit 2785b56070cd21c41ecca5a1f2e93bb8c400b1b4
Author: Kyrylo Tkachov 
Date:   Thu May 21 09:49:12 2015 +0100

[AArch64][4/N] Create TARGET_FIX_ERR_A53_835769 and use that instead of aarch64_fix_a53_err835769

diff --git a/gcc/config/aarch64/aarch64-elf-raw.h b/gcc/config/aarch64/aarch64-elf-raw.h
index bd5e51c..66b4c8b 100644
--- a/gcc/config/aarch64/aarch64-elf-raw.h
+++ b/gcc/config/aarch64/aarch64-elf-raw.h
@@ -27,7 +27,7 @@
   " crtend%O%s crtn%O%s " \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}"
 
-#ifdef TARGET_FIX_ERR_A53_835769_DEFAULT
+#if TARGET_FIX_ERR_A53_835769_DEFAULT
 #define CA53_ERR_835769_SPEC \
   " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
 #else
diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 1600a32..028ef98 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -44,7 +44,7 @@
%{mbig-endian:-EB} %{mlittle-endian:-EL} \
-maarch64linux%{mabi=ilp32:32}%{mbig-endian:b}"
 
-#ifdef TARGET_FIX_ERR_A53_835769_DEFAULT
+#if TARGET_FIX_ERR_A53_835769_DEFAULT
 #define CA53_ERR_835769_SPEC \
   " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
 #else
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5ea65e3..aff23d6 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7552,15 +7552,6 @@ aarch64_override_options_internal (struct gcc_options *opts)
   if (opts->x_flag_strict_volatile_bitfields < 0 && abi_version_at_least (2))
 opts->x_flag_strict_volatile_bitfields = 1;
 
-  if (opts->x_aarch64_fix_a53_err835769 == 2)
-{
-#ifdef TARGET_FIX_ERR_A53_835769_DEFAULT
-  opts->x_aarch64_fix_a53_err835769 = 1;
-#else
-  opts->x_aarch64_fix_a53_err835769 = 0;
-#endif
-}
-
   /* -mgeneral-regs-only sets a mask in target_flags, make sure that
  aarch64_isa_flags does not contain the FP/SIMD/Crypto feature flags
  in case some code tries reading aarch64_isa_flags directly to check if
@@ -9004,7 +8995,7 @@ aarch64_madd_needs_nop (rtx_insn* insn)
   rtx_insn *prev;
   rtx body;
 
-  if (!aarch64_fix_a53_err835769)
+  if (!TARGET_FIX_ERR_A53_835769)
 return false;
 
   if (recog_memoized (insn) < 0)
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 2a097af..d2d1ebf 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -233,6 +233,20 @@ extern unsigned long aarch64_isa_flags;
 /* CRC instructions that can be enabled through +crc arch extension.  */
 #define TARGET_CRC32 (AARCH64_ISA_CRC)
 
+/* Make sure this is always defined so we don't have to check for ifdefs
+   but rather use normal ifs.  */
+#ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
+#define TARGET_FIX_ERR_A53_835769_DEFAULT 0
+#else
+#undef TARGET_FIX_ERR_A53_835769_DEFAULT
+#define TARGET_FIX_ERR_A53_835769_DEFAULT 1
+#endif
+
+/* Apply the workaround for Cortex-A53 erratum 835769.  */
+#define TARGET_FIX_ERR_A53_835769	\
+  ((aarch64_fix_a53_err835769 == 2)	\
+  ? TARGET_FIX_ERR_A53_835769_DEFAULT : aarch64_fix_a53_err835769)
+
 /* Standard register usage.  */
 
 /* 31 64-bit general purpose registers R0-R30:


[PATCH][AArch64] Use cinc for if_then_else of plus-immediates

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch improves codegen for expressions of the form:
(x ? y + c1 : y + c2) when |c1 - c2| == 1

It matches the if_then_else of the two plus-immediates,
performs one of them, then generates a conditional increment
operation.

Thus, for the code in the testcase we generate a single add, compare
and cinc instruction rather than two adds, a compare and a csel.
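
The arithmetic identity behind the transformation can be sketched in plain C. This is only a model of the value computed, not of the RTL pattern; the function names are made up for illustration:

```c
#include <assert.h>

/* Straightforward form: two additions and a select (csel-like).  */
long
select_two_adds (int cond, long y, long c1, long c2)
{
  return cond ? y + c1 : y + c2;
}

/* Rewritten form, valid only when c1 and c2 differ by exactly 1:
   one addition of the smaller constant, then a conditional increment
   (cinc-like) taken on whichever arm held the larger constant.  */
long
select_add_then_inc (int cond, long y, long c1, long c2)
{
  assert (c1 - c2 == 1 || c2 - c1 == 1);
  long base = y + (c1 < c2 ? c1 : c2);
  /* If the "then" arm had the larger constant, increment when cond holds;
     otherwise increment when it does not (the reversed condition).  */
  int inc = (c1 > c2) ? (cond != 0) : (cond == 0);
  return base + inc;
}
```

For any inputs satisfying the precondition, both functions return the same value; the second one needs only a single add of an immediate.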

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*csel_plus6):
New define_insn_and_split.
(*csinc2_insn): Rename to...
(csinc2_insn): ... This.

2015-07-16  Kyrylo Tkachov  

* gcc.target/aarch64/cinc_common_1.c: New test.
commit a2dca37d3227ef4c9d3a8cc8277dd31529df74fd
Author: Kyrylo Tkachov 
Date:   Tue Jul 14 10:33:04 2015 +0100

[AArch64] Use cinc for if_then_else of plus-immediates

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 381bb1d..39282b7 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2843,6 +2843,48 @@ (define_expand "cmov6"
   "
 )
 
+/* Catch cases where we do:
+   add x2, x1, #n
+   add x3, x1, #(n + 1)
+   csel x4, x2, x3, cond
+   and transform it into:
+   add x2, x1, #n
+   cinc x4, x2, !cond.  */
+
+(define_insn_and_split "*csel_plus6"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(if_then_else:GPI
+	 (match_operator 1 "aarch64_comparison_operator"
+	  [(match_operand 2 "cc_register" "") (const_int 0)])
+	 (plus:GPI (match_operand:GPI 3 "aarch64_reg_or_imm" "r")
+		   (match_operand:GPI 4 "aarch64_plus_immediate" "n"))
+	 (plus:GPI (match_dup 3)
+		   (match_operand:GPI 5 "aarch64_plus_immediate" "n"]
+  "optimize > 0
+   && ((INTVAL (operands[4]) == INTVAL (operands[5]) + 1)
+	|| (INTVAL (operands[5]) == INTVAL (operands[4]) + 1))"
+  "#"
+  "&& !reload_completed"
+  [(const_int 0)]
+  {
+bool swap_p = INTVAL (operands[5]) > INTVAL (operands[4]);
+enum rtx_code code = GET_CODE (operands[1]);
+if (swap_p)
+  {
+	code = REVERSE_CONDITION (code, GET_MODE (operands[2]));
+	std::swap (operands[4], operands[5]);
+  }
+
+rtx tmp = gen_reg_rtx (mode);
+emit_insn (gen_add3 (tmp, operands[3], operands[5]));
+rtx comp = gen_rtx_fmt_ee (code, VOIDmode, operands[2], const0_rtx);
+
+emit_insn (gen_csinc2_insn (operands[0], tmp, comp));
+DONE;
+  }
+)
+
+
 (define_insn "*cmov_insn"
   [(set (match_operand:ALLI 0 "register_operand" "=r,r,r,r,r,r,r")
 	(if_then_else:ALLI
@@ -3030,7 +3072,7 @@ (define_insn "aarch64_"
   [(set_attr "type" "crc")]
 )
 
-(define_insn "*csinc2_insn"
+(define_insn "csinc2_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 (plus:GPI (match_operand 2 "aarch64_comparison_operation" "")
   (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/cinc_common_1.c b/gcc/testsuite/gcc.target/aarch64/cinc_common_1.c
new file mode 100644
index 000..d041263
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cinc_common_1.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+
+extern void abort (void);
+
+int
+foosi (int x)
+{
+  return x > 100 ? x - 2 : x - 1;
+}
+
+int
+barsi (int x)
+{
+  return x > 100 ? x + 4 : x + 3;
+}
+
+long
+foodi (long x)
+{
+  return x > 100 ? x - 2 : x - 1;
+}
+
+long
+bardi (long x)
+{
+  return x > 100 ? x + 4 : x + 3;
+}
+
+/* { dg-final { scan-assembler-times "cs?inc\tw\[0-9\]*" 2 } } */
+/* { dg-final { scan-assembler-times "cs?inc\tx\[0-9\]*" 2 } } */
+
+int
+main (void)
+{
+  if (foosi (105) != 103)
+abort ();
+
+  if (foosi (95) != 94)
+abort ();
+
+  if (barsi (105) != 109)
+abort ();
+
+  if (barsi (95) != 98)
+abort ();
+
+  if (foodi (105) != 103)
+abort ();
+
+  if (foodi (95) != 94)
+abort ();
+
+  if (bardi (105) != 109)
+abort ();
+
+  if (bardi (95) != 98)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "csel\tx\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */


[gomp4] Remove device-specific filtering during parsing for OpenACC

2015-07-16 Thread Julian Brown
Hi,

This patch removes the device-specific filtering (for NVidia PTX) from
the parsing stages of the host compiler (for the device_type clause --
separately for C, C++ and Fortran) in favour of fully parsing the
device_type clauses, but not actually implementing anything for them
(device_type support is a feature that we're not planning to implement
just yet: the existing "support" is something of a red herring).

With this patch, the parsed device_type clauses will be ready at OMP
lowering time whenever we choose to do something with them (e.g.
transforming them into a representation that can be streamed out and
re-read by the appropriate offload compiler). The representation is
more-or-less the same for all supported languages, modulo
clause ordering.

I've altered the dtype-*.* tests to account for the new behaviour (and
to not use e.g. mixed-case "nVidia" or "acc_device_nvidia" names, which
are contrary to the recommendations in the spec).

OK to apply, or any comments?

Thanks,

Julian

ChangeLog

gcc/
* gimplify.c (gimplify_scan_omp_clauses): Handle
OMP_CLAUSE_DEVICE_TYPE.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
(expand_omp_target): Add "sorry" for device_type support.
* tree-pretty-print.c (dump_omp_clause): Add device_type support.
* tree.c (walk_tree_1): Likewise.

gcc/c/
* c-parser.c (c_parser_oacc_all_clauses): Don't call
c_oacc_filter_device_types.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_DEVICE_TYPE.

gcc/cp/
* parser.c (cp_parser_oacc_all_clauses): Don't call
c_oacc_filter_device_types.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_DEVICE_TYPE.
* semantics.c (finish_omp_clauses): Likewise.

gcc/fortran/
* gfortran.h (gfc_omp_clauses): Change "dtype" int field to
"device_types" gfc_expr_list.
* openmp.c (gfc_match_omp_clauses): Remove scan_dtype variable (add
OMP_CLAUSE_DEVICE_TYPE directly to appropriate bitmasks). Parse all
device_type clauses without filtering.
(OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK)
(OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK)
(OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK)
(OACC_ROUTINE_CLAUSE_DEVICE_TYPE_MASK)
(OACC_UPDATE_CLAUSE_DEVICE_TYPE_MASK): Add OMP_CLAUSE_DEVICE_TYPE.
* trans-openmp.c (gfc_trans_omp_clauses): Translate device_type
clauses, and split old body into...
(gfc_trans_omp_clauses_1): New function.

gcc/testsuite/
* c-c++-common/goacc/dtype-1.c: Update test for new behaviour.
* c-c++-common/goacc/dtype-2.c: Likewise.
* c-c++-common/goacc/dtype-3.c: Likewise.
* c-c++-common/goacc/dtype-4.c: Likewise.
* gfortran.dg/goacc/dtype-1.f95: Likewise.
* gfortran.dg/goacc/dtype-2.f95: Likewise.
* gfortran.dg/goacc/dtype-3.f: Likewise.

commit 123298186bb8ce87f84b6a3a72743939d4fdae11
Author: Julian Brown 
Date:   Thu Jul 16 08:06:01 2015 -0700

Fix device_type parsing, add sorry() for missing implementation of remainder.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 1c65abf..d90c18e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -12439,10 +12439,7 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
   c_parser_skip_to_pragma_eol (parser);
 
   if (finish_p)
-{
-  clauses = c_oacc_filter_device_types (clauses);
-  return c_finish_omp_clauses (clauses, true);
-}
+return c_finish_omp_clauses (clauses, true);
 
   return clauses;
 }
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 98b8e3d..dcc246c 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12568,6 +12568,10 @@ c_finish_omp_clauses (tree clauses, bool oacc)
 	  pc = &OMP_CLAUSE_CHAIN (c);
 	  continue;
 
+case OMP_CLAUSE_DEVICE_TYPE:
+	  pc = &OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
+	  continue;
+
 	case OMP_CLAUSE_INBRANCH:
 	case OMP_CLAUSE_NOTINBRANCH:
 	  if (branch_seen)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 28f0048..80aabed 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29879,10 +29879,7 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
   cp_parser_skip_to_pragma_eol (parser, pragma_tok);
 
   if (finish_p)
-{
-  clauses = c_oacc_filter_device_types (clauses);
-  return finish_omp_clauses (clauses, true);
-}
+return finish_omp_clauses (clauses, true);
 
   return clauses;
 }
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 205dc30..056b2c1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13666,6 +13666,7 @@ tsubst_omp_clauses (tree clauses, bool declare_simd,
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 	default:
 	  gcc_unreachable ();
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 8935eb6..1ce1dfa 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5951,6 +5951,7 @@ finish_omp_clauses (tree clauses, bool oacc)
 	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE_TILE:
+	ca

[PATCH][ARM][testsuite][committed] Do not override -mcpu in no-volatile-in-it.c

2015-07-16 Thread Kyrill Tkachov

Hi all,

This scan-assembler test was failing for me when testing with an explicit
/-march=armv7-a variant, because that option clashed with the test's
-mcpu=cortex-m7 and overrode it.

This patch skips the test if the user forces an incompatible -march or -mcpu option.
The test now appears as UNSUPPORTED in these conditions and PASSes normally otherwise.

Applied as obvious with r225892.

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* gcc.target/arm/no-volatile-in-it.c: Skip if -mcpu is overridden.
commit 1c90fa1c2853644fb67999e726761c6add649c39
Author: Kyrylo Tkachov 
Date:   Wed Jul 8 11:40:33 2015 +0100

[ARM][testsuite] Do not override -mcpu in no-volatile-in-it.c

diff --git a/gcc/testsuite/gcc.target/arm/no-volatile-in-it.c b/gcc/testsuite/gcc.target/arm/no-volatile-in-it.c
index 206afdb..6f3664d 100644
--- a/gcc/testsuite/gcc.target/arm/no-volatile-in-it.c
+++ b/gcc/testsuite/gcc.target/arm/no-volatile-in-it.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "do not override -mcpu" { *-*-* } { "-march=*" "-mcpu=*" } { "-mcpu=cortex-m7" } } */
 /* { dg-options "-Os -mthumb -mcpu=cortex-m7" } */
 
 int


[PATCH][simplify-rtx][2/2] Simplify - (y ? -x : x) -> (!y ? -x : x)

2015-07-16 Thread Kyrill Tkachov

Hi all,

In this second patch I add the transformation mentioned in the subject to simplify-rtx.c.
In combination with the first patch to combine, combine_simplify_rtx now picks it up in the
testcase and does the right thing by not emitting an extra negate after the conditional negate
operation.
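
The identity can be checked with a small C model. This is a sketch of the value-level equivalence only, with made-up function names; unsigned arithmetic is used so that negation is well defined for every input:

```c
#include <assert.h>

/* Original shape: negate the result of a conditional negate.  */
unsigned
neg_of_cond_neg (int y, unsigned x)
{
  unsigned t = y ? 0u - x : x;
  return 0u - t;
}

/* Simplified shape: reverse the condition and drop the outer negate.  */
unsigned
cond_neg_reversed (int y, unsigned x)
{
  return !y ? 0u - x : x;
}

/* Fallback when the condition cannot be reversed: keep the condition
   and swap the two arms instead.  */
unsigned
cond_neg_swapped (int y, unsigned x)
{
  return y ? x : 0u - x;
}
```

All three functions agree for every y and x, which is why the outer negate can be folded into the conditional.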

Bootstrapped and tested on aarch64, arm, x86_64.

Ok for trunk?

Thanks,
Kyrill


2015-07-16  Kyrylo Tkachov  

* simplify-rtx.c (simplify_unary_operation_1, NEG case):
(neg (x ? (neg y) : y)) -> !x ? (neg y) : y.

2015-07-16  Kyrylo Tkachov  

* gcc.target/aarch64/neg_abs_1.c: New test.
commit 5e8d7d5f431b30edfbc5f92004d5252a1ecfc19d
Author: Kyrylo Tkachov 
Date:   Wed Jul 15 09:19:17 2015 +0100

[simplify-rtx] Simplify - (y ? -x : x) -> (!y ? -x : x)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 91e4b9c..b20dd2d 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -957,6 +957,32 @@ simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op)
   if (GET_CODE (op) == NEG)
 	return XEXP (op, 0);
 
+  /* (neg (x ? (neg y) : y)) == !x ? (neg y) : y.
+	 If comparison is not reversible use
+	 x ? y : (neg y).  */
+  if (GET_CODE (op) == IF_THEN_ELSE)
+	{
+	  rtx cond = XEXP (op, 0);
+	  rtx true_rtx = XEXP (op, 1);
+	  rtx false_rtx = XEXP (op, 2);
+
+	  if ((GET_CODE (true_rtx) == NEG
+	   && rtx_equal_p (XEXP (true_rtx, 0), false_rtx))
+	   || (GET_CODE (false_rtx) == NEG
+		   && rtx_equal_p (XEXP (false_rtx, 0), true_rtx)))
+	{
+	  if (reversed_comparison_code (cond, NULL_RTX) != UNKNOWN)
+		temp = reversed_comparison (cond, mode);
+	  else
+		{
+		  temp = cond;
+		  std::swap (true_rtx, false_rtx);
+		}
+	  return simplify_gen_ternary (IF_THEN_ELSE, mode,
+	mode, temp, true_rtx, false_rtx);
+	}
+	}
+
   /* (neg (plus X 1)) can become (not X).  */
   if (GET_CODE (op) == PLUS
 	  && XEXP (op, 1) == const1_rtx)
diff --git a/gcc/testsuite/gcc.target/aarch64/neg_abs_1.c b/gcc/testsuite/gcc.target/aarch64/neg_abs_1.c
new file mode 100644
index 000..cb2a387
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/neg_abs_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O2" } */
+
+int
+f1 (int x)
+{
+  return x < 0 ? x : -x;
+}
+
+long long
+f2 (long long x)
+{
+  return x < 0 ? x : -x;
+}
+
+/* { dg-final { scan-assembler-not "\tneg\tw\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-not "\tneg\tx\[0-9\]*.*" } } */


[PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov

Hi all,

This is an attempt to solve the problem in the thread starting at
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
in a generic way after some pointers from Segher and Andrew.

The problem I got was that combine_simplify_rtx was trying to
do some special handling of unary operations applied to if_then_else
but ended up exiting early due to:

  enum rtx_code cond_code = simplify_comparison (NE, &cond, &cop1);

  if (cond_code == NE && COMPARISON_P (cond))
return x;

I tried removing that, but that led to regressions in SPEC2006.
The solution that worked for me led to two patches.

In this first patch we add a simplification step to the rtx before trying any substitutions.
In the second patch I add the simplify-rtx.c simplification to transform - (y ? -x : x)
into (!y ? -x : x), which fixes the testcase I mentioned in the first thread.

This first patch by itself already proved to be an improvement for aarch64, managing
to eliminate a large number of redundant zero_extend operations in SPEC2006.
Overall, I saw a 2.8% decrease in [su]xt[bhw] instructions generated for the whole of SPEC2006
and no regressions in code quality, i.e. no instructions that were combined before but are not
combined with this patch.

Bootstrapped and tested on arm, aarch64 and x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* combine.c (combine_simplify_rtx): Try to simplify if_then_else
rtxes before trying substitutions.
commit 685bc1a66a36292329f678bae555e9c43e434e5d
Author: Kyrylo Tkachov 
Date:   Thu Jul 9 16:54:23 2015 +0100

[combine] Try to simplify before substituting

diff --git a/gcc/combine.c b/gcc/combine.c
index 574f874..40d2231 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 {
   rtx cond, true_rtx, false_rtx;
 
+  /* If some simplification is possible from the start, try it now.  */
+  temp = simplify_rtx (x);
+
+  if (temp)
+	{
+	  x = temp;
+	  code = GET_CODE (x);
+	  mode = GET_MODE (x);
+	  op0_mode = VOIDmode;
+	}
+
   cond = if_then_else_cond (x, &true_rtx, &false_rtx);
   if (cond != 0
 	  /* If everything is a comparison, what we have is highly unlikely


One more patch to fix PR66626

2015-07-16 Thread Vladimir Makarov

  The following patch fixes PR66626

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66626

  The previous patch solved only one problem (the 2nd test case in the
PR).  The following patch solves all the test cases.


  The patch was tested and bootstrapped on x86/x86-64.

  Committed as rev. 225891.

2015-07-16  Vladimir Makarov  

PR rtl-optimization/66626
* ira.h (emit-rtl.h): Include.
(non_spilled_static_chain_regno_p): New.
* ira-color.c (setup_profitable_hard_regs): Clear profitable regs
unless it is non spilled static chain pseudo.
(assign_hard_reg): Spill memory profitable allocno unless it is
non spilled static chain pseudo.
(allocno_spill_priority_compare): Put non spilled static chain
pseudo at the end of sorted array.
(improve_allocation): Do nothing if we have static chain and
non-local goto.
(allocno_priority_compare_func): Put non spilled static chain
pseudo at the beginning of sorted array.
(move_spill_restore): Ignore non spilled static chain pseudo.
* ira-costs.c (find_costs_and_classes): Don't assign class NO_REGS
to non spilled static chain pseudo.
* lra-assigns.c (pseudo_compare_func): Put non spilled static chain
pseudo at the beginning of sorted array.
(spill_for): Spill non spilled static chain pseudo last.
* lra-constraints.c (lra_constraints): Remove static chain pseudo
check for equivalence.

2015-07-16  Vladimir Makarov  

PR rtl-optimization/66626
* gcc.target/i386/pr66626-2.c: New.
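
Several of the entries above amount to forcing one special element to a fixed end of a sorted array. A minimal sketch of that comparator pattern, using hypothetical names (struct pseudo, pinned_regno, spill_order_compare) rather than the IRA data structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocno: a register number and a priority.  */
struct pseudo { int regno; int priority; };

/* Hypothetical: the one pseudo that must be considered for spilling last.  */
int pinned_regno = -1;

/* qsort comparator: ascending priority, except that the pinned pseudo
   always sorts after everything else.  Ties fall back to regno so the
   order is fully determined.  */
int
spill_order_compare (const void *v1, const void *v2)
{
  const struct pseudo *p1 = v1;
  const struct pseudo *p2 = v2;
  int pin1 = p1->regno == pinned_regno;
  int pin2 = p2->regno == pinned_regno;

  if (pin1 != pin2)
    return pin1 ? 1 : -1;
  if (p1->priority != p2->priority)
    return p1->priority < p2->priority ? -1 : 1;
  return (p1->regno > p2->regno) - (p1->regno < p2->regno);
}
```

Checking both arguments before returning keeps the comparison consistent even if it is ever called with the pinned element on either side.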

Index: ira-color.c
===
--- ira-color.c	(revision 225618)
+++ ira-color.c	(working copy)
@@ -1058,7 +1058,10 @@ setup_profitable_hard_regs (void)
 	continue;
   data = ALLOCNO_COLOR_DATA (a);
   if (ALLOCNO_UPDATED_HARD_REG_COSTS (a) == NULL
-	  && ALLOCNO_CLASS_COST (a) > ALLOCNO_MEMORY_COST (a))
+	  && ALLOCNO_CLASS_COST (a) > ALLOCNO_MEMORY_COST (a)
+	  /* Do not empty profitable regs for static chain pointer
+	 pseudo when non-local goto is used.  */
+	  && ! non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a)))
 	CLEAR_HARD_REG_SET (data->profitable_hard_regs);
   else
 	{
@@ -1140,7 +1143,10 @@ setup_profitable_hard_regs (void)
 	  if (! TEST_HARD_REG_BIT (data->profitable_hard_regs,
    hard_regno))
 		continue;
-	  if (ALLOCNO_UPDATED_MEMORY_COST (a) < costs[j])
+	  if (ALLOCNO_UPDATED_MEMORY_COST (a) < costs[j]
+		  /* Do not remove HARD_REGNO for static chain pointer
+		 pseudo when non-local goto is used.  */
+		  && ! non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a)))
 		CLEAR_HARD_REG_BIT (data->profitable_hard_regs,
 hard_regno);
 	  else if (min_cost > costs[j])
@@ -1148,7 +1154,10 @@ setup_profitable_hard_regs (void)
 	}
 	}
   else if (ALLOCNO_UPDATED_MEMORY_COST (a)
-	   < ALLOCNO_UPDATED_CLASS_COST (a))
+	   < ALLOCNO_UPDATED_CLASS_COST (a)
+	   /* Do not empty profitable regs for static chain
+		  pointer pseudo when non-local goto is used.  */
+	   && ! non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a)))
 	CLEAR_HARD_REG_SET (data->profitable_hard_regs);
   if (ALLOCNO_UPDATED_CLASS_COST (a) > min_cost)
 	ALLOCNO_UPDATED_CLASS_COST (a) = min_cost;
@@ -1868,7 +1877,10 @@ assign_hard_reg (ira_allocno_t a, bool r
 	  ira_assert (hard_regno >= 0);
 	}
 }
-  if (min_full_cost > mem_cost)
+  if (min_full_cost > mem_cost
+  /* Do not spill static chain pointer pseudo when non-local goto
+	 is used.  */
+  && ! non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a)))
 {
   if (! retry_p && internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
 	fprintf (ira_dump_file, "(memory is more profitable %d vs %d) ",
@@ -2494,6 +2506,12 @@ allocno_spill_priority_compare (ira_allo
 {
   int pri1, pri2, diff;
 
+  /* Avoid spilling static chain pointer pseudo when non-local goto is
+ used.  */
+  if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a1)))
+return 1;
+  else if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a2)))
+return -1;
   if (ALLOCNO_BAD_SPILL_P (a1) && ! ALLOCNO_BAD_SPILL_P (a2))
 return 1;
   if (ALLOCNO_BAD_SPILL_P (a2) && ! ALLOCNO_BAD_SPILL_P (a1))
@@ -2746,6 +2764,11 @@ improve_allocation (void)
   ira_allocno_t a;
   bitmap_iterator bi;
 
+  /* Don't bother to optimize the code with static chain pointer and
+ non-local goto in order not to spill the chain pointer
+ pseudo.  */
+  if (cfun->static_chain_decl && crtl->has_nonlocal_goto)
+return;
   /* Clear counts used to process conflicting allocnos only once for
  each allocno.  */
   EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi)
@@ -2952,6 +2975,12 @@ allocno_priority_compare_func (const voi
   ira_allocno_t a2 = *(const ira_allocno_t *) v2p;
   int pri1, pri2;
 
+  /* Assign hard reg to static chain pointer pseudo first wh

[PATCH][AArch64][14/14] Reuse target_option_current_node when passing pragma string to target attribute

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch improves compilation times for code using the arm_neon.h intrinsics.
The problem there is that since we now wrap all the intrinsics in arm_neon.h inside a pragma,
the midend will apply the pragma string onto every single intrinsic as an attribute, calling
the target attribute parsing code thousands of times on the same string.  I've seen this cause
a slowdown of around 3-5% on large intrinsics programs.

This patch checks if the ARGS we're supposed to process is the same as the pragma already
processed by the pragma processing code in aarch64-c.c. If it is, then we know that the correct
target node is already set in target_option_current_node, so we can just reuse that, saving us
the trouble of parsing the string.

This gets compilation times for large intrinsic programs back to the previous levels.
We still get a compile-time hit on small programs, I presume due to grokdeclarator in the
frontend appearing high in the profile because of the pragma use. But for large programs
we should be good: the compilation time will be dominated by the other parts of the compiler.
In any case, for small programs, garbage collection is at the top of the profile either way.
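
The reuse trick boils down to a pointer-identity check against the last string that was parsed. A minimal sketch of that pattern follows; every name in it (struct option_node, parse_options, set_current_pragma, node_for_attribute, parse_calls) is hypothetical and only models the idea, not the GCC code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parsed-option node; it only remembers its source string.  */
struct option_node { const char *source; };

static struct option_node pool[16];        /* tiny fixed pool for the sketch */
static int pool_used;

static const char *current_pragma;         /* string object last parsed */
static struct option_node *current_node;   /* its parsed result */

int parse_calls;                           /* counts expensive parses */

/* Expensive step: pretend to parse STR into a fresh node.  */
static struct option_node *
parse_options (const char *str)
{
  assert (pool_used < 16);
  parse_calls++;
  pool[pool_used].source = str;
  return &pool[pool_used++];
}

/* Called once when the pragma is seen.  */
void
set_current_pragma (const char *str)
{
  current_pragma = str;
  current_node = parse_options (str);
}

/* Called for every function: reuse the cached node when ARGS is the very
   same string object as the current pragma, otherwise parse again.  */
struct option_node *
node_for_attribute (const char *args)
{
  if (args != NULL && args == current_pragma)
    return current_node;
  return parse_options (args);
}
```

Comparing pointers rather than string contents keeps the fast path at a single comparison, at the cost of missing equal strings that live in different objects.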

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_option_valid_attribute_p):
Exit early and use target_option_current_node if processing current
pragma.
commit 0bbab2ef7fb4be18780b5c87d338d2f9d9116fe4
Author: Kyrylo Tkachov 
Date:   Thu May 28 15:33:49 2015 +0100

[AArch64][14/N] Reuse target_option_current_node when passing pragma string to target attribute

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f0f3cdc..f8c5aa4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8431,6 +8431,18 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
   tree old_optimize;
   tree new_target, new_optimize;
   tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
+
+  /* If what we're processing is the current pragma string then the
+ target option node is already stored in target_option_current_node
+ by aarch64_pragma_target_parse in aarch64-c.c.  Use that to avoid
+ having to re-parse the string.  This is especially useful to keep
+ arm_neon.h compile times down since that header contains a lot
+ of intrinsics enclosed in pragmas.  */
+  if (!existing_target && args == current_target_pragma)
+{
+  DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_option_current_node;
+  return true;
+}
   tree func_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
 
   old_optimize = build_optimization_node (&global_options);


[PATCH][doc][13/14] Document AArch64 target attributes and pragmas

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch adds the documentation for the AArch64 target attributes and pragmas.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* doc/extend.texi (AArch64 Function Attributes): New node.
(AArch64 Pragmas): Likewise.
commit 533fdcf9675b1364d0bf8446601b2509d9987e3a
Author: Kyrylo Tkachov 
Date:   Fri May 22 12:06:10 2015 +0100

[doc][13/N] Document AArch64 target attributes and pragmas

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index bb858a8..e263e7e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2191,6 +2191,7 @@ GCC plugins may provide their own attributes.
 
 @menu
 * Common Function Attributes::
+* AArch64 Function Attributes::
 * ARC Function Attributes::
 * ARM Function Attributes::
 * AVR Function Attributes::
@@ -3322,6 +3323,144 @@ easier to pack regions.
 
 @c This is the end of the target-independent attribute table
 
+@node AArch64 Function Attributes
+@subsection AArch64 Function Attributes
+
+The following target-specific function attributes are available for
+the AArch64 target and for the most part mirror the behavior of similar
+command line options, but on a per-function basis:
+
+@table @code
+@item general-regs-only
+@cindex @code{general-regs-only} function attribute, AArch64
+Indicates that no floating point or AdvancedSIMD registers should be
+used when generating code for this function.  If the function explicitly
+uses floating point code, then the compiler will give an error.  This is
+the same behavior as that of the command line option
+@code{-mgeneral-regs-only}.
+
+@item fix-cortex-a53-835769
+@cindex @code{fix-cortex-a53-835769} function attribute, AArch64
+Indicates that the workaround for the Cortex-A53 erratum 835769 should be
+applied to this function.  To explicitly disable the workaround for this
+function specify the negated form: @code{no-fix-cortex-a53-835769}.
+This corresponds to the behavior of the command line options
+@code{-mfix-cortex-a53-835769} and @code{-mno-fix-cortex-a53-835769}.
+
+@item cmodel=
+@cindex @code{cmodel=} function attribute, AArch64
+Indicates that code should be generated for a particular code model for
+this function.  The behaviour and permissible arguments are the same as
+for the command line option @code{-mcmodel=}.
+
+@item strict-align
+@cindex @code{strict-align} function attribute, AArch64
+Indicates that the compiler should not assume that unaligned memory references
+are handled by the system.  The behavior is the same as for the command-line
+option @code{-mstrict-align}.
+
+@item omit-leaf-frame-pointer
+@cindex @code{omit-leaf-frame-pointer} function attribute, AArch64
+Indicates that the frame pointer should be omitted for a leaf function call.
+To keep the frame pointer, the inverse attribute
+@code{no-omit-leaf-frame-pointer} can be specified.  These attributes have
+the same behavior as the command-line options @code{-momit-leaf-frame-pointer}
+and @code{-mno-omit-leaf-frame-pointer}.
+
+@item tls-dialect=
+@cindex @code{tls-dialect=} function attribute, AArch64
+Specifies the TLS dialect to use for this function.  The behavior and
+permissible arguments are the same as for the command-line option
+@code{-mtls-dialect=}.
+
+@item arch=
+@cindex @code{arch=} function attribute, AArch64
+Specifies the architecture version and architectural extensions to use
+for this function.  The behavior and permissible arguments are the same as
+for the @code{-march=} command-line option.
+
+@item tune=
+@cindex @code{tune=} function attribute, AArch64
+Specifies the core for which to tune the performance of this function.
+The behavior and permissible arguments are the same as for the @code{-mtune=}
+command-line option.
+
+@item cpu=
+@cindex @code{cpu=} function attribute, AArch64
+Specifies the core for which to tune the performance of this function and also
+whose architectural features to use.  The behavior and valid arguments are the
+same as for the @code{-mcpu=} command-line option.
+
+@end table
+
+The above target attributes can be specified as follows:
+
+@smallexample
+__attribute__((target("")))
+int
+f (int a)
+@{
+  return a + 5;
+@}
+@end smallexample
+
+where @code{} is one of the attribute strings specified above.
+
+Additionally, the architectural extension string may be specified on its
+own.  This can be used to turn on and off particular architectural extensions
+without having to specify a particular architecture version or core.  Example:
+
+@smallexample
+__attribute__((target("+crc+nocrypto")))
+int
+foo (int a)
+@{
+  return a + 5;
+@}
+@end smallexample
+
+In this example @code{target("+crc+nocrypto")} will enable the @code{crc}
+extension and disable the @code{crypto} extension for the function @code{foo}
+without modifying an existing @code{-march=} or @code{-mcpu} option.
+
+Multiple target function attributes can be specified by separating them with
+a comma.  For example:
+@smallexample
+__attribute__((target("arch=armv8-a+crc+crypto,tune=co

[PATCH][AArch64][11/14] Re-layout SIMD builtin types on builtin expansion

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch fixes an ICE that I encountered while testing the series.
The testcase in the patch ICEs during builtin expansion because the testcase
is compiled with +nofp which means the builtin SIMD types are laid out
according to the nofp rules, but later when a function tagged with +simd
tries to use them assuming they are laid out for SIMD, the ICE occurs.

I've struggled for some time to find a good fix for that.
This is the best I could come up with. During expansion time we take
the decl of the thing being passed to the builtin function and re-lay it.
The majority (all?) of uses of these builtins are only within the intrinsics in arm_neon.h anyway.
This fixes the ICE and doesn't have a negative impact on compile time (not that I could measure, anyway).

This patch also initializes the crc intrinsics unconditionally to handle the case where a user may compile
a file with +nocrc and then have a function with +crc using an intrinsic.
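
Since the ChangeLog below has the SIMD builtins being initialized from more than one place, the initializer has to be safe to call repeatedly. A minimal sketch of such a run-once guard, with hypothetical names (init_builtins_once, builtin_registrations) standing in for the real functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times the real initialization work was performed.  */
int builtin_registrations;

/* Set on the first call; later calls return immediately.  */
static bool builtins_inited_p = false;

void
init_builtins_once (void)
{
  if (builtins_inited_p)
    return;

  builtins_inited_p = true;

  /* The expensive, non-repeatable registration would happen here.  */
  builtin_registrations++;
}
```

Callers can then invoke the initializer whenever SIMD becomes enabled, without tracking whether some earlier path has already done so.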

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_option_valid_attribute_p):
Initialize simd builtins if TARGET_SIMD.
* config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins):
Make sure that the builtins are initialized only once no matter how
many times the function is called.
(aarch64_init_builtins): Unconditionally initialize crc builtins.
(aarch64_relayout_simd_param): New function.
(aarch64_simd_expand_args): Use above during argument expansion.
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Initialize
simd builtins if TARGET_SIMD.
* config/aarch64/aarch64-protos.h (aarch64_init_simd_builtins): New
prototype.
(aarch64_relayout_simd_types): Likewise.

2015-07-16  Kyrylo Tkachov  

* gcc.target/aarch64/target-attr-crypto-ice-1.c: New test.
commit 07191e8bbcd3ecbd14d19f0a4296249ba6c2770f
Author: Kyrylo Tkachov 
Date:   Wed May 20 12:02:33 2015 +0100

[AArch64][11/N] Re-layout SIMD builtin types on builtin expansion

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 294bf9d..df63ea8 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -555,7 +555,7 @@ aarch64_simd_builtin_type (enum machine_mode mode,
   else
 return aarch64_lookup_simd_builtin_type (mode, qualifier_none);
 }
- 
+
 static void
 aarch64_init_simd_builtin_types (void)
 {
@@ -679,11 +679,18 @@ aarch64_init_simd_builtin_scalar_types (void)
 	 "__builtin_aarch64_simd_udi");
 }
 
-static void
+static bool simd_builtins_inited_p = false;
+
+void
 aarch64_init_simd_builtins (void)
 {
   unsigned int i, fcode = AARCH64_SIMD_PATTERN_START;
 
+  if (simd_builtins_inited_p)
+return;
+
+  simd_builtins_inited_p = true;
+
   aarch64_init_simd_builtin_types ();
 
   /* Strong-typing hasn't been implemented for all AdvSIMD builtin intrinsics.
@@ -846,8 +853,8 @@ aarch64_init_builtins (void)
 
   if (TARGET_SIMD)
 aarch64_init_simd_builtins ();
-  if (TARGET_CRC32)
-aarch64_init_crc32_builtins ();
+
+  aarch64_init_crc32_builtins ();
 }
 
 tree
@@ -867,6 +874,31 @@ typedef enum
   SIMD_ARG_STOP
 } builtin_simd_arg;
 
+/* Relayout the decl of a function arg.  Keep the RTL component the same,
+   as varasm.c ICEs at varasm.c:1324.  It doesn't like reinitializing the RTL
+   on PARM decls.  Something like this needs to be done when compiling a
+   file without SIMD and then tagging a function with +simd and using SIMD
+   intrinsics in there.  The types will have been laid out assuming no SIMD,
+   so we want to re-lay them out.  */
+
+static void
+aarch64_relayout_simd_param (tree arg)
+{
+  tree argdecl = arg;
+  if (TREE_CODE (argdecl) == SSA_NAME)
+argdecl = SSA_NAME_VAR (argdecl);
+
+  if (argdecl
+  && (TREE_CODE (argdecl) == PARM_DECL
+	  || TREE_CODE (argdecl) == VAR_DECL))
+{
+  rtx rtl = NULL_RTX;
+  rtl = DECL_RTL_IF_SET (argdecl);
+  relayout_decl (argdecl);
+  SET_DECL_RTL (argdecl, rtl);
+}
+}
+
 static rtx
 aarch64_simd_expand_args (rtx target, int icode, int have_retval,
 			  tree exp, builtin_simd_arg *args)
@@ -895,6 +927,7 @@ aarch64_simd_expand_args (rtx target, int icode, int have_retval,
 	{
 	  tree arg = CALL_EXPR_ARG (exp, opc - have_retval);
 	  enum machine_mode mode = insn_data[icode].operand[opc].mode;
+	  aarch64_relayout_simd_param (arg);
 	  op[opc] = expand_normal (arg);
 
 	  switch (thisarg)
diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index c3798a1..ecc9974 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -179,6 +179,19 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
 
   cpp_opts->warn_unused_macros = saved_warn_unused_macros;
 
+  /* Initialize SIMD builtins if we haven't already.
+ Set current_target_pragma to NULL for the duration so that
+ the builtin initialization code doesn't t

[PATCH][AArch64][12/14] Target attributes and target pragmas tests

2015-07-16 Thread Kyrill Tkachov

Hi all,

These are the tests for target attributes and pragmas.
I've tried to test the inlining rules, some of the possible errors and the
preprocessor macros changed by target pragmas.
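The push/pop behaviour that the first test exercises can be pictured as a small stack of feature sets. The following is only an illustrative model under that assumption; the struct, the function names and the feature bits are hypothetical and are not GCC internals:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only.  */
enum { FEAT_FP = 1 << 0, FEAT_SIMD = 1 << 1, FEAT_CRYPTO = 1 << 2 };

#define MAX_DEPTH 16

struct option_stack
{
  unsigned current;           /* Feature bits currently in effect.  */
  unsigned saved[MAX_DEPTH];  /* Values remembered by push_options.  */
  int depth;
};

/* Model of "#pragma GCC push_options": remember the current state.  */
void
push_options (struct option_stack *s)
{
  assert (s->depth < MAX_DEPTH);
  s->saved[s->depth++] = s->current;
}

/* Model of "#pragma GCC pop_options": return to the remembered state.  */
void
pop_options (struct option_stack *s)
{
  assert (s->depth > 0);
  s->current = s->saved[--s->depth];
}

/* Model of "#pragma GCC target (...)": replace the current state.  */
void
set_target (struct option_stack *s, unsigned features)
{
  s->current = features;
}

bool
feature_enabled (const struct option_stack *s, unsigned feature)
{
  return (s->current & feature) != 0;
}
```

Each pop restores exactly the state seen before the matching push, which is what the nested #ifdef/#ifndef checks in the test rely on.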

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* gcc.target/aarch64/pragma-cpp-predefs-1.c: New test.
* gcc.target/aarch64/target-attr-1.c: Likewise.
* gcc.target/aarch64/target-attr-2.c: Likewise.
* gcc.target/aarch64/target-attr-3.c: Likewise.
* gcc.target/aarch64/target-attr-4.c: Likewise.
* gcc.target/aarch64/target-attr-5.c: Likewise.
* gcc.target/aarch64/target-attr-6.c: Likewise.
* gcc.target/aarch64/target-attr-7.c: Likewise.
* gcc.target/aarch64/target-attr-8.c: Likewise.
* gcc.target/aarch64/target-attr-9.c: Likewise.
* gcc.target/aarch64/target-attr-10.c: Likewise.
* gcc.target/aarch64/target-attr-11.c: Likewise.
* gcc.target/aarch64/target-attr-12.c: Likewise.
* gcc.target/aarch64/target-attr-13.c: Likewise.
* gcc.target/aarch64/target-attr-14.c: Likewise.
commit 1218b6fe4dd6d3429793d62369a878047a9f9a35
Author: Kyrylo Tkachov 
Date:   Thu May 21 15:21:44 2015 +0100

[AArch64][12/N] Target attributes and target pragmas tests

diff --git a/gcc/testsuite/gcc.target/aarch64/pragma-cpp-predefs-1.c b/gcc/testsuite/gcc.target/aarch64/pragma-cpp-predefs-1.c
new file mode 100644
index 000..779220e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pragma-cpp-predefs-1.c
@@ -0,0 +1,255 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8-a+crypto" } */
+
+/* Test  that pragma option pushing and popping works.
+   Also that CPP predefines redefinitions on #pragma works.  */
+
+#pragma GCC push_options
+#pragma GCC target("arch=armv8-a+nofp+nosimd")
+#ifdef __ARM_FEATURE_FMA
+#error "__ARM_FEATURE_FMA defined but shouldn't!"
+#endif
+
+#ifdef __ARM_FP
+#error "__ARM_FP defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("arch=armv8-a+fp+nosimd")
+#ifndef __ARM_FP
+#error "__ARM_FP is not defined but should!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("arch=armv8-a+fp+simd")
+
+#ifndef __ARM_NEON
+#error "__ARM_NEON not defined but should!"
+#endif
+
+#ifdef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("arch=armv8-a+fp+simd+crypto")
+
+#ifndef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO not defined but should!"
+#endif
+
+#pragma GCC pop_options
+
+#ifndef __ARM_NEON
+#error "__ARM_NEON not defined but should!"
+#endif
+
+#ifdef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO defined but shouldn't!"
+#endif
+
+
+#pragma GCC pop_options
+
+#ifndef __ARM_FP
+#error "__ARM_FP is not defined but should!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+#pragma GCC pop_options
+
+#ifdef __ARM_FP
+#error "__ARM_FP is defined but shouldn't!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+/* And again, but using cpu=.  */
+
+#pragma GCC push_options
+#pragma GCC target("cpu=cortex-a53+nofp+nosimd")
+#ifdef __ARM_FEATURE_FMA
+#error "__ARM_FEATURE_FMA defined but shouldn't!"
+#endif
+
+#ifdef __ARM_FP
+#error "__ARM_FP defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("cpu=cortex-a53+fp+nosimd")
+#ifndef __ARM_FP
+#error "__ARM_FP is not defined but should!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("cpu=cortex-a53+fp+simd+nocrypto")
+
+#ifndef __ARM_NEON
+#error "__ARM_NEON not defined but should!"
+#endif
+
+#ifdef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("cpu=cortex-a53+fp+simd+crypto")
+
+#ifndef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO not defined but should!"
+#endif
+
+
+#pragma GCC pop_options
+
+#ifndef __ARM_NEON
+#error "__ARM_NEON not defined but should!"
+#endif
+
+#ifdef __ARM_FEATURE_CRYPTO
+#error "__ARM_FEATURE_CRYPTO defined but shouldn't!"
+#endif
+
+
+#pragma GCC pop_options
+
+#ifndef __ARM_FP
+#error "__ARM_FP is not defined but should!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+#pragma GCC pop_options
+
+#ifdef __ARM_FP
+#error "__ARM_FP is defined but shouldn't!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON defined but shouldn't!"
+#endif
+
+/* And again, but using just the isa extensions.  */
+
+#pragma GCC push_options
+#pragma GCC target("+nofp")
+#ifdef __ARM_FEATURE_FMA
+#error "__ARM_FEATURE_FMA defined but shouldn't!"
+#endif
+
+#ifdef __ARM_FP
+#error "__ARM_FP defined but shouldn't!"
+#endif
+
+#pragma GCC push_options
+#pragma GCC target("+fp+nosimd")
+#ifndef __ARM_FP
+#error "__ARM_FP is not defined but should!"
+#endif
+
+#ifdef __ARM_NEON
+#error "__ARM_NEON de

[PATCH][AArch64][10/14] Implement target pragmas

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch implements target pragmas for aarch64.
The pragmas accepted are the same as for target attributes (as required).
In addition pragmas will need to redefine the target-specific preprocessor
macros if appropriate.

A new file, aarch64-c.c, is added.  The code from TARGET_CPU_CPP_BUILTINS is
moved there and split up into the unconditional parts that are always defined
and the conditional parts that depend on certain architectural features.
The pragma processing code calls the conditional part to redefine
preprocessor macros on the fly.
The implementation is similar to the rs6000 one.
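To illustrate the define-or-undefine idea in isolation, here is a rough sketch that tracks only whether a macro is defined, using a tiny table in place of the preprocessor. The table type, the helper names and the ISA_* bits are assumptions made for this example; the macro names are the ones checked by the tests in this series:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only.  */
enum { ISA_FP = 1 << 0, ISA_SIMD = 1 << 1, ISA_CRYPTO = 1 << 2 };

#define MAX_MACROS 8

/* Stand-in for the preprocessor's set of defined macros.  */
struct macro_table
{
  const char *names[MAX_MACROS];
  int count;
};

static int
find_macro (const struct macro_table *t, const char *name)
{
  for (int i = 0; i < t->count; i++)
    if (strcmp (t->names[i], name) == 0)
      return i;
  return -1;
}

bool
macro_defined_p (const struct macro_table *t, const char *name)
{
  return find_macro (t, name) >= 0;
}

/* Define NAME if DEF_P, otherwise make sure it is not defined.  */
void
def_or_undef (struct macro_table *t, bool def_p, const char *name)
{
  int i = find_macro (t, name);
  if (def_p && i < 0 && t->count < MAX_MACROS)
    t->names[t->count++] = name;
  else if (!def_p && i >= 0)
    t->names[i] = t->names[--t->count];
}

/* Re-evaluate every feature-dependent macro against ISA_FLAGS.  */
void
update_conditional_macros (struct macro_table *t, unsigned isa_flags)
{
  def_or_undef (t, (isa_flags & ISA_FP) != 0, "__ARM_FP");
  def_or_undef (t, (isa_flags & ISA_SIMD) != 0, "__ARM_NEON");
  def_or_undef (t, (isa_flags & ISA_CRYPTO) != 0, "__ARM_FEATURE_CRYPTO");
}
```

Because every conditional macro is re-evaluated on each call, the same routine can serve both the initial setup and later pragma-driven changes.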

With target pragmas implemented, we can use them in the arm_neon.h and
arm_acle.h headers to specify the architectural features required for those
intrinsics, rather than #ifdef'ing them out when FP/SIMD is not available
from the command line.

We need to do this in order to handle cases where the user compiles a file
with -mgeneral-regs-only but has a function tagged with +simd and tries to
use the arm_neon.h intrinsics.
Tests and documentation come as a separate patch later on in the series.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config.gcc (aarch64*-*-*): Specify c_target_objs and cxx_target_objs.
* config/aarch64/aarch64.h (REGISTER_TARGET_PRAGMAS):
(TARGET_CPU_CPP_BUILTINS): Redefine to call aarch64_cpu_cpp_builtins.
* config/aarch64/aarch64.c (aarch64_override_options_internal): Remove
static keyword.
(aarch64_reset_previous_fndecl): New function.
* config/aarch64/aarch64-c.c: New file.
* config/aarch64/arm_acle.h: Add pragma +crc+nofp at the top.
Push and pop options at beginning and end.  Remove ifdef
__ARM_FEATURE_CRC32.
* config/aarch64/arm_neon.h: Remove #ifdef check on __ARM_NEON.
Add pragma arch=armv8-a+simd and +crypto where appropriate.
* config/aarch64/t-aarch64 (aarch64-c.o): New rule.

2015-07-16  Kyrylo Tkachov  

* gcc.target/aarch64/arm_neon-nosimd-error.c: Delete.
commit 62979865acc0a1c832882cbb8871e6860efce620
Author: Kyrylo Tkachov 
Date:   Thu May 14 15:36:07 2015 +0100

[AArch64][10/N] Implement target pragmas

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 900aa18..5da8442 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -302,6 +302,8 @@ m32c*-*-*)
 aarch64*-*-*)
 	cpu_type=aarch64
 	extra_headers="arm_neon.h arm_acle.h"
+c_target_objs="aarch64-c.o"
+cxx_target_objs="aarch64-c.o"
 	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
 	target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
 	target_has_targetm_common=yes
diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
new file mode 100644
index 000..c3798a1
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -0,0 +1,192 @@
+/* Target-specific code for C family languages.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "input.h"
+#include "tm_p.h"
+#include "flags.h"
+#include "c-family/c-common.h"
+#include "cpplib.h"
+#include "c-family/c-pragma.h"
+#include "langhooks.h"
+#include "target.h"
+
+
+#define builtin_define(TXT) cpp_define (pfile, TXT)
+#define builtin_assert(TXT) cpp_assert (pfile, TXT)
+
+
+static void
+aarch64_def_or_undef (bool def_p, const char *macro, cpp_reader *pfile)
+{
+  if (def_p)
+cpp_define (pfile, macro);
+  else
+cpp_undef (pfile, macro);
+}
+
+/* Define the macros that we always expect to have on AArch64.  */
+static void
+aarch64_define_unconditional_macros (cpp_reader *pfile)
+{
+  builtin_define ("__aarch64__");
+  builtin_define ("__ARM_64BIT_STATE");
+
+  builtin_define ("__ARM_ARCH_ISA_A64");
+  builtin_define_with_int_value ("__ARM_ALIGN_MAX_PWR", 28);
+  builtin_define_with_int_value ("__ARM_ALIGN_MAX_STACK_PWR", 16);
+
+  /* __ARM_ARCH_8A is not mandated by ACLE but we define it unconditionally
+ as interoperability with the same arm macro.  */
+  builtin_define ("__ARM_ARCH_8A");
+
+  builtin_define_with_int_value ("__ARM_ARCH_PROFILE", 'A');
+  builtin_define ("__ARM_FEATURE_CLZ");
+  builtin_define ("__ARM_FEATURE_IDIV");
+  builtin_define ("__ARM_FEATURE_UNALIGNED");
+  builtin_define ("__ARM_PCS_AAPCS64");
+  builtin_d

[PATCH][AArch64][8/14] Implement TARGET_OPTION_VALID_ATTRIBUTE_P

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch implements target attribute support via the
TARGET_OPTION_VALID_ATTRIBUTE_P hook.
The aarch64_handle_option function in common/config/aarch64/aarch64-common.c
is exported to the backend and beefed up a bit.

The target attributes supported by this patch reflect the command-line
options that we specified as Save earlier in the series.  Explicitly, the
target attributes supported are:
 - "general-regs-only"
 - "fix-cortex-a53-835769" and "no-fix-cortex-a53-835769"
 - "cmodel="
 - "strict-align"
 - "omit-leaf-frame-pointer" and "no-omit-leaf-frame-pointer"
 - "tls-dialect"
 - "arch="
 - "cpu="
 - "tune="

These correspond to equivalent command-line options when prefixed with a '-m'.
Additionally, this patch supports specifying architectural features by
themselves, as in the -march and -mcpu options.  So, for example, we can
write:
__attribute__((target("+simd+crypto")))
to enable 'simd' and 'crypto' on a per-function basis.
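As a rough picture of how such a feature string can be applied, the sketch below walks a "+name" / "+noname" string left to right and updates a flag word. It is a standalone illustration, not the patch's parser: the function name, the ISA_* bits and the dependency table (for example, that turning off fp also turns off simd and crypto) are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only.  */
enum
{
  ISA_FP = 1 << 0,
  ISA_SIMD = 1 << 1,
  ISA_CRYPTO = 1 << 2,
  ISA_CRC = 1 << 3
};

struct extension
{
  const char *name;
  unsigned flags_on;   /* Bits set by "+name".  */
  unsigned flags_off;  /* Bits cleared by "+noname".  */
};

/* Assumed dependency table; the real one lives in the backend.  */
static const struct extension all_extensions[] =
{
  { "fp",     ISA_FP,                         ISA_FP | ISA_SIMD | ISA_CRYPTO },
  { "simd",   ISA_FP | ISA_SIMD,              ISA_SIMD | ISA_CRYPTO },
  { "crypto", ISA_FP | ISA_SIMD | ISA_CRYPTO, ISA_CRYPTO },
  { "crc",    ISA_CRC,                        ISA_CRC },
  { NULL, 0, 0 }
};

/* Apply a string such as "+simd+nocrypto" to *ISA_FLAGS, left to right.
   Return 0 on success.  Return -1 and leave *ISA_FLAGS untouched on a
   malformed or unknown modifier.  */
int
apply_extensions (const char *str, unsigned *isa_flags)
{
  unsigned flags = *isa_flags;

  while (*str != '\0')
    {
      if (*str != '+')
        return -1;
      str++;

      /* The modifier runs up to the next '+' or the end of the string.  */
      size_t len = strcspn (str, "+");
      int adding = 1;
      if (len >= 2 && strncmp (str, "no", 2) == 0)
        {
          adding = 0;
          str += 2;
          len -= 2;
        }
      if (len == 0)
        return -1;

      const struct extension *opt;
      for (opt = all_extensions; opt->name != NULL; opt++)
        if (strlen (opt->name) == len && strncmp (opt->name, str, len) == 0)
          break;
      if (opt->name == NULL)
        return -1;

      if (adding)
        flags |= opt->flags_on;
      else
        flags &= ~opt->flags_off;

      str += len;
    }

  *isa_flags = flags;
  return 0;
}
```

Working on a local copy and committing only at the end means a bad modifier late in the string does not leave the flags half-updated.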

The documentation and tests for this come as a separate patch later, after
the target pragma support and the inlining rules are implemented.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* common/config/aarch64/aarch64-common.c (aarch64_handle_option):
Remove static.  Handle OPT_mgeneral_regs_only,
OPT_mfix_cortex_a53_835769, OPT_mstrict_align,
OPT_momit_leaf_frame_pointer.
* config/aarch64/aarch64.c: Include opts.h and diagnostic.h
(aarch64_attr_opt_type): New enum.
(aarch64_attribute_info): New struct.
(aarch64_handle_attr_arch): New function.
(aarch64_handle_attr_cpu): Likewise.
(aarch64_handle_attr_tune): Likewise.
(aarch64_handle_attr_isa_flags): Likewise.
(aarch64_attributes): New table.
(aarch64_process_one_target_attr): New function.
(num_occurences_in_str): Likewise.
(aarch64_process_target_attr): Likewise.
(aarch64_option_valid_attribute_p): Likewise.
(TARGET_OPTION_VALID_ATTRIBUTE_P): Define.
* config/aarch64/aarch64-protos.h: Include input.h
(aarch64_handle_option): Declare prototype.
commit 61d61f318cfa12a7cb05c10829de7b147e3238ce
Author: Kyrylo Tkachov 
Date:   Fri May 8 12:06:24 2015 +0100

[AArch64][8/N] Implement TARGET_OPTION_VALID_ATTRIBUTE_P

diff --git a/gcc/common/config/aarch64/aarch64-common.c b/gcc/common/config/aarch64/aarch64-common.c
index b3fd9dc..726c625 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -60,7 +60,7 @@ static const struct default_options aarch_option_optimization_table[] =
respective component of -mcpu.  This logic is implemented
in config/aarch64/aarch64.c:aarch64_override_options.  */
 
-static bool
+bool
 aarch64_handle_option (struct gcc_options *opts,
 		   struct gcc_options *opts_set ATTRIBUTE_UNUSED,
 		   const struct cl_decoded_option *decoded,
@@ -68,6 +68,7 @@ aarch64_handle_option (struct gcc_options *opts,
 {
   size_t code = decoded->opt_index;
   const char *arg = decoded->arg;
+  int val = decoded->value;
 
   switch (code)
 {
@@ -83,6 +84,22 @@ aarch64_handle_option (struct gcc_options *opts,
   opts->x_aarch64_tune_string = arg;
   return true;
 
+case OPT_mgeneral_regs_only:
+  opts->x_target_flags |= MASK_GENERAL_REGS_ONLY;
+  return true;
+
+case OPT_mfix_cortex_a53_835769:
+  opts->x_aarch64_fix_a53_err835769 = val;
+  return true;
+
+case OPT_mstrict_align:
+  opts->x_target_flags |= MASK_STRICT_ALIGN;
+  return true;
+
+case OPT_momit_leaf_frame_pointer:
+  opts->x_flag_omit_frame_pointer = val;
+  return true;
+
 default:
   return true;
 }
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index fc1cec7..3a5482d 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -22,6 +22,8 @@
 #ifndef GCC_AARCH64_PROTOS_H
 #define GCC_AARCH64_PROTOS_H
 
+#include "input.h"
+
 /*
   SYMBOL_CONTEXT_ADR
   The symbol is used in a load-address operation.
@@ -376,6 +378,8 @@ extern bool aarch64_madd_needs_nop (rtx_insn *);
 extern void aarch64_final_prescan_insn (rtx_insn *);
 extern bool
 aarch64_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
+bool aarch64_handle_option (struct gcc_options *, struct gcc_options *,
+			 const struct cl_decoded_option *, location_t);
 void aarch64_atomic_assign_expand_fenv (tree *, tree *, tree *);
 int aarch64_ccmp_mode_to_code (enum machine_mode mode);
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ff87631..0a6ed70 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -57,6 +57,8 @@
 #include "tm_p.h"
 #include "recog.h"
 #include "langhooks.h"
+#include "opts.h"
+#include "diagnostic.h"
 #include "diagnostic-core.h"
 #include "internal-fn.h"
 #include "gimple-fold.h"
@@ -7981,6 +7983,509 @@ aarch64_set_current_function (tree fndecl)
 }
 }
 
+/* E

[PATCH][AArch64][7/14] Implement TARGET_SET_CURRENT_FUNCTION

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch implements TARGET_SET_CURRENT_FUNCTION and defines SWITCHABLE_TARGET.
With this patch in the series, we should be far enough along to get LTO
option switching to work properly.

The implementation of TARGET_SET_CURRENT_FUNCTION is pretty much a direct
copy from the rs6000 backend, and i386 has a very similar structure as well.
I tried to simplify this for aarch64, but in the end this implementation was
the one that worked.

TARGET_SET_CURRENT_FUNCTION should take the target-specific options from
DECL_FUNCTION_SPECIFIC_TARGET and use them to set up the backend state.
Since it may be called many times for the same function, we keep track of the
previous function this got called on in order to avoid repeating work.
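The caching idea can be sketched outside GCC roughly as follows. This is a simplified model, not the hook in the patch: struct fn_decl, DEFAULT_OPTS and the counters are hypothetical, and option nodes are compared by pointer identity in the same spirit as the tree comparison in the real code.

```c
#include <stddef.h>

/* Hypothetical default option word, for illustration only.  */
#define DEFAULT_OPTS 0x1u

/* Hypothetical stand-in for a function declaration.  TARGET_OPTS is NULL
   when the function carries no target attribute.  */
struct fn_decl
{
  const char *name;
  const unsigned *target_opts;
};

static const struct fn_decl *previous_fn = NULL;
static unsigned active_opts = DEFAULT_OPTS;
static int restore_count = 0;

/* Switch the modelled backend state to FN's options, skipping the work
   when FN is the function seen last time or shares its option node.  */
void
set_current_function (const struct fn_decl *fn)
{
  if (fn == NULL || fn == previous_fn)
    return;

  const unsigned *old_opts = previous_fn ? previous_fn->target_opts : NULL;
  const unsigned *new_opts = fn->target_opts;
  previous_fn = fn;

  /* Same option node as last time: the state is already right.  */
  if (old_opts == new_opts)
    return;

  active_opts = new_opts ? *new_opts : DEFAULT_OPTS;
  restore_count++;
}

unsigned
current_opts (void)
{
  return active_opts;
}

int
restores_done (void)
{
  return restore_count;
}
```

In this model, two different functions that share one option node cost a single restore between them, and repeated calls on the same function cost none.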

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.h (SWITCHABLE_TARGET): Define.
* config/aarch64/aarch64.c: Include target-globals.h
(aarch64_previous_fndecl): New variable.
(aarch64_set_current_function): New function.
(TARGET_SET_CURRENT_FUNCTION): Define.
commit f3c5c9df8bb5a26a7df65e7b68d9eb6f60eccb40
Author: Kyrylo Tkachov 
Date:   Thu May 7 14:06:04 2015 +0100

[AArch64][7/N] Implement TARGET_SET_CURRENT_FUNCTION

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2891690..ff87631 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -73,6 +73,7 @@
 #include "tm-constrs.h"
 #include "sched-int.h"
 #include "cortex-a57-fma-steering.h"
+#include "target-globals.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -7927,6 +7928,58 @@ aarch64_option_print (FILE *file, int indent, struct cl_target_option *ptr)
   aarch64_print_extension (file, isa_flags);
 }
 
+static GTY(()) tree aarch64_previous_fndecl;
+
+/* Implement TARGET_SET_CURRENT_FUNCTION.  Unpack the codegen decisions
+   like tuning and ISA features from the DECL_FUNCTION_SPECIFIC_TARGET
+   of the function, if such exists.  This function may be called multiple
+   times on a single function so use aarch64_previous_fndecl to avoid
+   setting up identical state.  */
+
+static void
+aarch64_set_current_function (tree fndecl)
+{
+  tree old_tree = (aarch64_previous_fndecl
+		   ? DECL_FUNCTION_SPECIFIC_TARGET (aarch64_previous_fndecl)
+		   : NULL_TREE);
+
+  tree new_tree = (fndecl
+		   ? DECL_FUNCTION_SPECIFIC_TARGET (fndecl)
+		   : NULL_TREE);
+
+
+  if (fndecl && fndecl != aarch64_previous_fndecl)
+{
+  aarch64_previous_fndecl = fndecl;
+  if (old_tree == new_tree)
+	;
+
+  else if (new_tree && new_tree != target_option_default_node)
+	{
+	  cl_target_option_restore (&global_options,
+TREE_TARGET_OPTION (new_tree));
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else
+	TREE_TARGET_GLOBALS (new_tree)
+	  = save_target_globals_default_opts ();
+	}
+
+  else if (old_tree && old_tree != target_option_default_node)
+	{
+	  new_tree = target_option_current_node;
+	  cl_target_option_restore (&global_options,
+TREE_TARGET_OPTION (new_tree));
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else if (new_tree == target_option_default_node)
+	restore_target_globals (&default_target_globals);
+	  else
+	TREE_TARGET_GLOBALS (new_tree)
+	  = save_target_globals_default_opts ();
+	}
+}
+}
 
 /* Return true if SYMBOL_REF X binds locally.  */
 
@@ -12399,6 +12452,9 @@ aarch64_unspec_may_trap_p (const_rtx x, unsigned flags)
 #undef TARGET_OPTION_PRINT
 #define TARGET_OPTION_PRINT aarch64_option_print
 
+#undef TARGET_SET_CURRENT_FUNCTION
+#define TARGET_SET_CURRENT_FUNCTION aarch64_set_current_function
+
 #undef TARGET_PASS_BY_REFERENCE
 #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 6d792c4..2c1b6ce 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -920,6 +920,9 @@ do {	 \
 #define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
 		(FP_REGNUM_P (REGNO) && GET_MODE_SIZE (MODE) > 8)
 
+#undef SWITCHABLE_TARGET
+#define SWITCHABLE_TARGET 1
+
 /* Check TLS Descriptors mechanism is selected.  */
 #define TARGET_TLS_DESC (aarch64_tls_dialect == TLS_DESCRIPTORS)
 


[PATCH][AArch64][9/14] Implement TARGET_CAN_INLINE_P

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch implements the target-specific inlining rules.
The basic philosophy is that we want to definitely reject inlining if the
callee's architecture is not a subset, feature-wise, of the caller's.

Beyond that, we want to allow inlining if the callee is always_inline.
If it's not, we reject inlining if the TargetSave options don't match up
in a way that's described in the comments in the patch.

Generally, we try to allow as much inlining as possible for the benefit of
LTO.  However, if the architectural features of the callee are not a subset
of the features of the caller, then we must reject inlining.  For example,
inlining a function with 'simd' into a function without 'simd' is not
allowed.
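The subset rule boils down to one mask comparison, the same expression the patch applies to x_aarch64_isa_flags. A minimal standalone sketch, with hypothetical feature bits and a hypothetical function name:

```c
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only.  */
enum { ISA_FP = 1 << 0, ISA_SIMD = 1 << 1, ISA_CRYPTO = 1 << 2 };

/* True when every feature the callee needs is also enabled in the
   caller, i.e. the callee's flags are a subset of the caller's.  */
bool
callee_isa_is_subset (unsigned caller_flags, unsigned callee_flags)
{
  return (caller_flags & callee_flags) == callee_flags;
}
```

So a callee needing only fp can go into a caller with fp and simd, but not the other way round.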

Also, inlining a strict-align function into a non-strict-align function is
not allowed (the reverse direction is fine, as the code in the patch shows).
These two restrictions apply even when the callee is tagged with
always_inline because they can affect the correctness of the program.

Beyond that, we reject inlining only if the user has explicitly specified
attributes/options for both the caller and the callee and they don't match
up.
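For the tri-state options (yes, no, don't care) the check in the patch also covers the case where the caller leaves the option unspecified: the callee's explicit value then has to agree with the default. The sketch below restates that helper as a standalone function with a few named values; the enum and the function name are assumptions for this example, with 2 used for "don't care" as in the patch's call.

```c
#include <stdbool.h>

/* Hypothetical names for the three states of such an option.  */
enum tri_state { OPT_OFF = 0, OPT_ON = 1, OPT_DONT_CARE = 2 };

/* Return true when inlining should be rejected for one tri-state option
   whose default value is DEF.  */
bool
reject_inlining (int caller, int callee, int dont_care, int def)
{
  /* Both sides care: reject when they disagree.  */
  if (caller != dont_care && callee != dont_care && caller != callee)
    return true;

  /* Caller does not care: the callee must agree with the default.  */
  if (caller == dont_care && callee != dont_care && callee != def)
    return true;

  return false;
}
```

A callee that does not care is never a reason to reject, whatever the caller asked for.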

The tuning CPUs are an exception to that.  We want to allow inlining even
when the tuning CPUs don't match.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_reject_inlining): New function.
(aarch64_can_inline_p): Likewise.
(TARGET_CAN_INLINE_P): Define.
commit 55e576c7ca01679551404dac9f6302443c2a6bae
Author: Kyrylo Tkachov 
Date:   Thu May 14 12:00:07 2015 +0100

[AArch64][9/N] Implement TARGET_CAN_INLINE_P

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0a6ed70..34cd986 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8486,6 +8486,113 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
   return ret;
 }
 
+/* Helper for aarch64_can_inline_p.  In the case where CALLER and CALLEE are
+   tri-bool options (yes, no, don't care) and the default value is
+   DEF, determine whether to reject inlining.  */
+
+static bool
+aarch64_reject_inlining (int caller, int callee, int dont_care, int def)
+{
+  /* If both caller and callee care about the value then reject inlining
+ if they don't match up.  */
+  if (caller != dont_care && callee != dont_care && caller != callee)
+return true;
+
+  /* If caller doesn't care then make sure that the default agrees
+ with the callee.  */
+  if (caller == dont_care && callee != dont_care && callee != def)
+return true;
+
+  return false;
+}
+
+/* Implement TARGET_CAN_INLINE_P.  Decide whether it is valid
+   to inline CALLEE into CALLER based on target-specific info.
+   Make sure that the caller and callee have compatible architectural
+   features.  Then go through the other possible target attributes
+   and see if they can block inlining.  Try not to reject always_inline
+   incompatible architecturally.  */
+
+static bool
+aarch64_can_inline_p (tree caller, tree callee)
+{
+  bool ret = false;
+  tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
+  tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
+
+  /* If callee has no option attributes, then it is ok to inline.  */
+  if (!callee_tree)
+ret = true;
+  else
+{
+  struct cl_target_option *caller_opts
+	= TREE_TARGET_OPTION (caller_tree ? caller_tree
+	   : target_option_default_node);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+
+
+  /* Callee's ISA flags should be a subset of the caller's.  */
+  if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags)
+	  == callee_opts->x_aarch64_isa_flags)
+	ret = true;
+
+ /* Allow non-strict aligned functions inlining into strict
+aligned ones.  */
+  if ((TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)
+	   != TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags))
+	  && !(!TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags)
+	   && TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)))
+	ret = false;
+
+  bool always_inline = lookup_attribute ("always_inline",
+	   DECL_ATTRIBUTES (callee));
+
+  /* If the architectural features match up and the callee is always_inline
+	  then the other attributes don't matter.  */
+  if (always_inline)
+	return ret;
+
+  if (caller_opts->x_aarch64_cmodel_var
+	  != callee_opts->x_aarch64_cmodel_var)
+	ret = false;
+
+  if (caller_opts->x_aarch64_tls_dialect
+	  != callee_opts->x_aarch64_tls_dialect)
+	ret = false;
+
+
+  /* Honour explicit requests to workaround errata.  */
+  if (aarch64_reject_inlining (caller_opts->x_aarch64_fix_a53_err835769,
+callee_opts->x_aarch64_fix_a53_err835769,
+2, TARGET_FIX_ERR_A53_835769_DEFAULT))
+	ret = false;
+
+  /* If the user explicitly specified -momit-leaf-frame-pointer for the
+ caller and callee and they don't match up, reject inlining.  */
+  if (aarch6

[PATCH][AArch64][3/14] Refactor option override code

2015-07-16 Thread Kyrill Tkachov

Hi all,

This one is more meaty than the previous ones.  It buffs up the parsing
functions for the mcpu, march, mtune options, decouples them and makes them
return an enum describing the errors that may occur.  This will allow us to
use these functions in other contexts beyond aarch64_override_options.
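The benefit of returning a result code instead of reporting the error inside the parser is that each caller can word its own diagnostic. A small standalone sketch of that split follows; the CPU list, the function names and the message texts are all hypothetical and are not the strings GCC emits:

```c
#include <string.h>

enum parse_result { PARSE_OK, PARSE_MISSING_ARG, PARSE_INVALID_ARG };

/* Hypothetical list of known CPU names, for illustration only.  */
static const char *const known_cpus[] = { "cortex-a53", "cortex-a57", NULL };

/* Look NAME up and report the outcome as a code; print nothing here.  */
enum parse_result
parse_cpu (const char *name, int *index)
{
  if (name == NULL || *name == '\0')
    return PARSE_MISSING_ARG;

  for (int i = 0; known_cpus[i] != NULL; i++)
    if (strcmp (known_cpus[i], name) == 0)
      {
        *index = i;
        return PARSE_OK;
      }

  return PARSE_INVALID_ARG;
}

/* Wording a command-line caller might choose (made-up text).  */
const char *
option_diagnostic (enum parse_result res)
{
  switch (res)
    {
    case PARSE_MISSING_ARG: return "missing cpu name in -mcpu=";
    case PARSE_INVALID_ARG: return "unknown value for -mcpu";
    default: return "";
    }
}

/* Wording an attribute-handling caller might choose (made-up text).  */
const char *
attribute_diagnostic (enum parse_result res)
{
  switch (res)
    {
    case PARSE_MISSING_ARG: return "missing name in target (\"cpu=\") attribute";
    case PARSE_INVALID_ARG: return "unknown name in target (\"cpu=\") attribute";
    default: return "";
    }
}
```

The parser stays free of any knowledge about where the string came from, which is what lets it be reused outside the option-override path.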

aarch64_override_options itself gets an overhaul and is split up into code
that must run only once after the command line options have been processed,
and code that has to be run every time the backend-specific state changes
(after SWITCHABLE_TARGET is implemented).

The stuff that must be run every time the backend state changes is put into
aarch64_override_options_internal.

Also, this patch deletes the declarations of aarch64_{arch,cpu,tune}_string
from aarch64.opt as they are superfluous since the march, mtune and mcpu
option specification implicitly declares these variables.

This patch looks large, but a lot of it is moving code around...

Bootstrapped and tested as part of the series on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.opt (aarch64_arch_string): Delete.
(aarch64_cpu_string): Likewise.
(aarch64_tune_string): Likewise.
* config/aarch64/aarch64.c (aarch64_parse_opt_result): New enum.
(aarch64_parse_extension): Return aarch64_parse_opt_result.
Add extra argument to put result into.
(aarch64_parse_arch): Likewise.  Do not set selected_cpu.
(aarch64_parse_cpu): Add arguments to put results into. Return
aarch64_parse_opt_result.
(aarch64_parse_tune): Likewise.
(aarch64_override_options_after_change_1): New function.
(aarch64_override_options_internal): New function.
(aarch64_validate_mcpu): Likewise.
(aarch64_validate_march): Likewise.
(aarch64_validate_mtune): Likewise.
(aarch64_override_options): Update to reflect above changes.
Move some logic into aarch64_override_options_internal.
Initialize target_option_default_node and target_option_current_node.
(aarch64_override_options_after_change): Move logic into
aarch64_override_options_after_change_1 and call it with global_options.
(initialize_aarch64_code_model): Take a gcc_options pointer and use the
flag values from that.

2015-07-06  Kyrylo Tkachov  

* gcc.target/aarch64/cpu-diagnostics-3.c: Update expected error string.
commit cea498a32becf7cd947864ff908f25fe447c89d2
Author: Kyrylo Tkachov 
Date:   Wed May 6 16:27:00 2015 +0100

[AArch64][3/N] Refactor option override code

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 32b974a..5ea65e3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7101,12 +7101,27 @@ aarch64_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
   return retval;
 }
 
-static void initialize_aarch64_code_model (void);
+static void initialize_aarch64_code_model (struct gcc_options *);
 
-/* Parse the architecture extension string.  */
+/* Enum describing the various ways that the
+   aarch64_parse_{arch,tune,cpu,extension} functions can fail.
+   This way their callers can choose what kind of error to give.  */
 
-static void
-aarch64_parse_extension (char *str)
+enum aarch64_parse_opt_result
+{
+  AARCH64_PARSE_OK,			/* Parsing was successful.  */
+  AARCH64_PARSE_MISSING_ARG,		/* Missing argument.  */
+  AARCH64_PARSE_INVALID_FEATURE,	/* Invalid feature modifier.  */
+  AARCH64_PARSE_INVALID_ARG		/* Invalid arch, tune, cpu arg.  */
+};
+
+
+/* Parse the architecture extension string STR and update ISA_FLAGS
+   with the architecture features turned on or off.  Return a
+   aarch64_parse_opt_result describing the result.  */
+
+static enum aarch64_parse_opt_result
+aarch64_parse_extension (char *str, unsigned long *isa_flags)
 {
   /* The extension string is parsed left to right.  */
   const struct aarch64_option_extension *opt = NULL;
@@ -7137,11 +7152,8 @@ aarch64_parse_extension (char *str)
 	adding_ext = 1;
 
   if (len == 0)
-	{
-	  error ("missing feature modifier after %qs", adding_ext ? "+"
-	  : "+no");
-	  return;
-	}
+	return AARCH64_PARSE_MISSING_ARG;
+
 
   /* Scan over the extensions table trying to find an exact match.  */
   for (opt = all_extensions; opt->name != NULL; opt++)
@@ -7150,9 +7162,9 @@ aarch64_parse_extension (char *str)
 	{
 	  /* Add or remove the extension.  */
 	  if (adding_ext)
-		aarch64_isa_flags |= opt->flags_on;
+		*isa_flags |= opt->flags_on;
 	  else
-		aarch64_isa_flags &= ~(opt->flags_off);
+		*isa_flags &= ~(opt->flags_off);
 	  break;
 	}
 	}
@@ -7160,27 +7172,30 @@ aarch64_parse_extension (char *str)
   if (opt->name == NULL)
 	{
 	  /* Extension not found in list.  */
-	  error ("unknown feature modifier %qs", str);
-	  return;
+	  return AARCH64_PARSE_INVALID_FEATURE;
 	}
 
   str = ext;
 };
 
-  return;
+  return AARCH64_PARSE_OK;
 }
 
-/* P
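
The refactoring above can be illustrated outside of GCC. What follows is a minimal, self-contained sketch, not the code from the patch: the feature bits, the three-entry table and the names parse_extension and PARSE_* are made up for illustration. It shows the same idea, namely that the parser writes to a caller-supplied flags word and returns a result code instead of calling error () itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only.  */
#define FL_FP   (1UL << 0)
#define FL_SIMD (1UL << 1)
#define FL_CRC  (1UL << 2)

enum parse_result
{
  PARSE_OK,              /* Parsing was successful.  */
  PARSE_MISSING_ARG,     /* Nothing follows "+" or "+no".  */
  PARSE_INVALID_FEATURE  /* Name not found in the table.  */
};

struct extension
{
  const char *name;
  unsigned long flags_on;   /* Bits set by "+name".  */
  unsigned long flags_off;  /* Bits cleared by "+noname".  */
};

/* Hypothetical table; the real one lives in the AArch64 backend.  */
static const struct extension extensions[] =
{
  {"fp",   FL_FP,           FL_FP | FL_SIMD},
  {"simd", FL_FP | FL_SIMD, FL_SIMD},
  {"crc",  FL_CRC,          FL_CRC},
  {NULL, 0, 0}
};

/* Parse STR (for example "+crc+nofp") left to right and update
   *ISA_FLAGS.  No diagnostic is emitted here; the caller decides
   how to report a failure.  */
static enum parse_result
parse_extension (const char *str, unsigned long *isa_flags)
{
  while (str != NULL && *str != '\0')
    {
      const struct extension *opt;
      const char *ext;
      size_t len;
      int adding = 1;

      if (*str != '+')
        return PARSE_INVALID_FEATURE;
      str++;

      /* The current modifier ends at the next '+' or at the end.  */
      ext = strchr (str, '+');
      len = ext ? (size_t) (ext - str) : strlen (str);

      if (len >= 2 && strncmp (str, "no", 2) == 0)
        {
          adding = 0;
          len -= 2;
          str += 2;
        }

      if (len == 0)
        return PARSE_MISSING_ARG;

      /* Scan the table for an exact match.  */
      for (opt = extensions; opt->name != NULL; opt++)
        if (strlen (opt->name) == len && strncmp (opt->name, str, len) == 0)
          {
            if (adding)
              *isa_flags |= opt->flags_on;
            else
              *isa_flags &= ~opt->flags_off;
            break;
          }

      if (opt->name == NULL)
        return PARSE_INVALID_FEATURE;

      str = ext;
    }

  return PARSE_OK;
}
```

A caller handling a command-line option could turn PARSE_MISSING_ARG into a hard error, while a caller handling an attribute string could choose a different message or a warning; that freedom is the point of returning a code rather than reporting inside the parser.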

[PATCH][AArch64][6/14] Implement TARGET_OPTION_SAVE/TARGET_OPTION_RESTORE

2015-07-16 Thread Kyrill Tkachov

Hi all,

This is one of the main patches in the series.
The backend compilation state can be described by the options in aarch64.opt 
marked as Save.
This causes the options-save.c machinery to save and restore them when asked, 
and the TARGET_OPTION_SAVE and TARGET_OPTION_RESTORE hooks should handle all 
the extra stuff that's required to reinitialise the backend.

This patch marks the options that we want to support for SWITCHABLE_TARGET as 
Save and adds 3
extra variables: explicit_tune_core, explicit_arch and x_aarch64_isa_flags.
These 3 variables are used to store the explicit core to tune for (as specified 
by -mcpu or -mtune),
the explicitly specified architecture (as specified by -mcpu or -march) and the 
architecture
features (as specified by the extension string to -march,-mcpu or derived from 
them).

The aarch64_isa_flags definition is moved from aarch64.c into aarch64.opt and 
marked as a TargetVariable.
This means that the auto-generated machinery in options-save.c will 
automatically save and restore it for us.

The patch defines the TARGET_OPTION_RESTORE hook to extract the selected_tune 
and selected_arch from the
explicit_tune_core and explicit_arch variables and restore the backend 
compilation state using the
aarch64_override_options_internal machinery that we refactored earlier.
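
To make the save/restore idea concrete, here is a minimal sketch under stated assumptions. It is not the hook implementation from the patch: the enums, the two small tables, the flag values and the names saved_options and option_restore are all hypothetical. It only shows why saving two enum values plus the ISA flags is enough to rebuild the pointers the backend works with.

```c
#include <stddef.h>

/* Hypothetical identifiers; in GCC these come from the .def files.  */
enum core_ident { CORE_A53, CORE_A57, CORE_NONE };
enum arch_ident { ARCH_8A, ARCH_8_1A, ARCH_NONE };

struct processor
{
  const char *name;
  unsigned long flags;  /* Made-up values below.  */
};

/* Tables indexed by the enum values above.  */
static const struct processor all_cores[] =
{
  {"cortex-a53", 0x7}, {"cortex-a57", 0x7}, {NULL, 0}
};

static const struct processor all_architectures[] =
{
  {"armv8-a", 0x3}, {"armv8.1-a", 0xb}, {NULL, 0}
};

/* Everything that has to be stored per function or per LTO unit.  */
struct saved_options
{
  enum core_ident explicit_tune_core;
  enum arch_ident explicit_arch;
  unsigned long isa_flags;
};

/* Backend state that is derived, not stored.  */
static const struct processor *selected_tune;
static const struct processor *selected_arch;
static unsigned long current_isa_flags;

/* Rebuild the derived state from the saved enum values.  */
static void
option_restore (const struct saved_options *in)
{
  selected_tune = &all_cores[in->explicit_tune_core];
  selected_arch = &all_architectures[in->explicit_arch];
  current_isa_flags = in->isa_flags;
}
```

Storing small enum values instead of pointers keeps the saved record trivially copyable, which is what an auto-generated save/restore mechanism needs.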

A TARGET_OPTION_PRINT implementation is added to print out the explicit_arch 
and explicit_tune_core options,
as well as aarch64_isa_flags.

As preparation for SWITCHABLE_TARGET this patch also changes the output 
assembly format a bit.
Since we want to potentially handle multiple values of aarch64_isa_flags within 
a file in the future, we don't
want to just print out a global .arch or .cpu directive in the beginning of the 
assembly file.
Instead, we want to print out the .arch directive on a per-function basis. This 
is accomplished by
defining the ASM_DECLARE_FUNCTION_NAME hook and printing out selected_arch and 
aarch64_isa_flags there.
As an added bonus we can print out the tuning name in the comments and since we 
added a proper ident
field to the processor struct that we store in explicit_tune_core, we can print 
out the full tune name
in an assembly comment.

For example, compiling with -mcpu=cortex-a57.cortex-a53 we now get:

.file   "sha1_1.c"
.text
.align  2
.p2align 4,,15
.global foo
.arch armv8-a+fp+simd+crc
//.tune cortex-a57.cortex-a53
.type   foo, %function
foo:
add w0, w0, 5
ret
.size   foo, .-foo
.ident  "GCC: (unknown) 6.0.0 20150522 (experimental)"

instead of:
.cpu cortex-a57+fp+simd+crc
.file   "sha1_1.c"
.text
.align  2
.p2align 4,,15
.global foo
.type   foo, %function
foo:
add w0, w0, 5
ret
.size   foo, .-foo
.ident  "GCC: (fsf-trunk.670) 6.0.0 20150416 (experimental)"

Consequently, TARGET_ASM_FILE_START is deleted.
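
As a rough illustration of the per-function output described above, the following sketch formats the preamble for one function into a buffer. It is only an approximation of what an ASM_DECLARE_FUNCTION_NAME implementation might do: the function name declare_function_name, its parameters and the exact whitespace are assumptions, and the real hook writes to the assembly stream rather than to a string.

```c
#include <stdio.h>
#include <string.h>

/* Format the directives that precede the label of FN_NAME into BUF.
   ARCH_NAME and EXTENSIONS together form the .arch operand, and
   TUNE_NAME is emitted only as an assembly comment.  Returns the
   value of snprintf.  */
static int
declare_function_name (char *buf, size_t size, const char *fn_name,
                       const char *arch_name, const char *extensions,
                       const char *tune_name)
{
  return snprintf (buf, size,
                   "\t.arch %s%s\n"
                   "\t//.tune %s\n"
                   "\t.type\t%s, %%function\n"
                   "%s:\n",
                   arch_name, extensions, tune_name, fn_name, fn_name);
}
```

Because the directive is emitted once per function, two functions in the same file can carry different .arch lines, which a single directive at the start of the file cannot express.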


Bootstrapped and tested on aarch64.
Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.opt (explicit_tune_core): New TargetVariable.
(explicit_arch): Likewise.
(x_aarch64_isa_flags): Likewise.
(mgeneral-regs-only): Mark as Save.
(mfix-cortex-a53-835769): Likewise.
(mcmodel=): Likewise.
(mstrict-align): Likewise.
(momit-leaf-frame-pointer): Likewise.
(mtls-dialect): Likewise.
(master=): Likewise.
* config/aarch64/aarch64.h (ASM_DECLARE_FUNCTION_NAME): Define.
(aarch64_isa_flags): Remove extern declaration.
* config/aarch64/aarch64.c (aarch64_validate_mcpu): Return a bool
to indicate success or failure.
(aarch64_validate_march): Likewise.
(aarch64_validate_mtune): Likewise.
(aarch64_isa_flags): Delete.
(aarch64_override_options_internal): Access opts->x_aarch64_isa_flags
instead of aarch64_isa_flags.
(aarch64_get_tune_cpu): New function.
(aarch64_get_arch): Likewise.
(aarch64_override_options): Use above and set up explicit_tune_core
and explicit_arch.
(aarch64_print_extension): Move earlier in file.  Add isa_flags
argument and use that instead of the global aarch64_isa_flags.
(aarch64_option_restore): Likewise.
(aarch64_option_print): Likewise.
(aarch64_declare_function_name): Likewise.
(aarch64_start_file): Delete.
(TARGET_ASM_FILE_START): Do not define.
(TARGET_OPTION_RESTORE, TARGET_OPTION_PRINT): Define.
* config/aarch64/aarch64-protos.h (aarch64_declare_function_name):
Declare prototype.
commit 89c785e5fe0a57483f62fdd84a8b5b4e365c1b95
Author: Kyrylo Tkachov 
Date:   Thu May 7 12:07:51 2015 +0100

[AArch64][6/N] Implement TARGET_OPTION_SAVE/TARGET_OPTION_RESTORE

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e4f5b00..fc1cec7 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -255,6 +255,7 @@ bool aarch64_gimple_fold_builtin

[PATCH][AArch64][4/14] Create TARGET_FIX_ERR_A53_835769 and use that instead of aarch64_fix_a53_err835769

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch transforms the Cortex-A53 erratum 835769 workaround checks into a 
macro.
This way we don't have to override aarch64_fix_a53_err835769 in the default case
and this allows us to keep track of when the user doesn't specify this option,
which may come in handy later on when we decide the inlining rules.

This patch also makes TARGET_FIX_ERR_A53_835769_DEFAULT unconditionally defined 
to 0 or 1, so that we don't have to check it with #ifdefs.
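
The pattern can be tried in isolation. The sketch below uses shortened, hypothetical names (FIX_ERR_DEFAULT, fix_err, FIX_ERR) in place of the real macros and option variable, and it assumes that the value 2 means "option not given", as the hunk removed from aarch64_override_options_internal suggests.

```c
/* A configuration header may or may not have defined FIX_ERR_DEFAULT
   before this point.  Normalise it to 0 or 1 so that it can be used
   in plain #if directives and in C expressions.  */
#ifndef FIX_ERR_DEFAULT
#define FIX_ERR_DEFAULT 0
#else
#undef FIX_ERR_DEFAULT
#define FIX_ERR_DEFAULT 1
#endif

/* Hypothetical option variable: 2 means "not specified on the
   command line", 0 and 1 are explicit user choices.  */
static int fix_err = 2;

/* Resolve the tri-state at the point of use.  The variable itself
   is never overwritten, so "unspecified" stays detectable.  */
#define FIX_ERR ((fix_err == 2) ? FIX_ERR_DEFAULT : fix_err)

static int
workaround_enabled (void)
{
  return FIX_ERR;
}
```

Compared with overwriting the variable during option processing, resolving the default inside the macro preserves the information that the user said nothing, which later code (inlining rules, for instance) may want to know.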

Bootstrapped and tested as part of series on aarch64.
Checked that the workaround is applied as previously.

Ok for trunk?

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.h (TARGET_FIX_ERR_A53_835769_DEFAULT): Always
define to 0 or 1.
(TARGET_FIX_ERR_A53_835769): New macro.
* config/aarch64/aarch64.c (aarch64_override_options_internal): Remove
handling of opts->x_aarch64_fix_a53_err835769.
(aarch64_madd_needs_nop): Check for TARGET_FIX_ERR_A53_835769 rather
than aarch64_fix_a53_err835769.
* config/aarch64/aarch64-elf-raw.h: Update for above changes.
* config/aarch64/aarch64-linux.h: Likewise.
commit 12e50e9fdcb86b0a4c73b3b43d92c386e9504637
Author: Kyrylo Tkachov 
Date:   Thu May 21 09:49:12 2015 +0100

[AArch64][4/N] Create TARGET_FIX_ERR_A53_835769 and use that instead of aarch64_fix_a53_err835769

diff --git a/gcc/config/aarch64/aarch64-elf-raw.h b/gcc/config/aarch64/aarch64-elf-raw.h
index bd5e51c..66b4c8b 100644
--- a/gcc/config/aarch64/aarch64-elf-raw.h
+++ b/gcc/config/aarch64/aarch64-elf-raw.h
@@ -27,7 +27,7 @@
   " crtend%O%s crtn%O%s " \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}"
 
-#ifdef TARGET_FIX_ERR_A53_835769_DEFAULT
+#if TARGET_FIX_ERR_A53_835769_DEFAULT
 #define CA53_ERR_835769_SPEC \
   " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
 #else
diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 1600a32..b9d7805 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -52,7 +52,7 @@
   " %{mfix-cortex-a53-835769:--fix-cortex-a53-835769}"
 #endif
 
-#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
+#if TARGET_FIX_ERR_A53_843419_DEFAULT
 #define CA53_ERR_843419_SPEC \
   " %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}"
 #else
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5ea65e3..aff23d6 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7552,15 +7552,6 @@ aarch64_override_options_internal (struct gcc_options *opts)
   if (opts->x_flag_strict_volatile_bitfields < 0 && abi_version_at_least (2))
 opts->x_flag_strict_volatile_bitfields = 1;
 
-  if (opts->x_aarch64_fix_a53_err835769 == 2)
-{
-#ifdef TARGET_FIX_ERR_A53_835769_DEFAULT
-  opts->x_aarch64_fix_a53_err835769 = 1;
-#else
-  opts->x_aarch64_fix_a53_err835769 = 0;
-#endif
-}
-
   /* -mgeneral-regs-only sets a mask in target_flags, make sure that
  aarch64_isa_flags does not contain the FP/SIMD/Crypto feature flags
  in case some code tries reading aarch64_isa_flags directly to check if
@@ -9004,7 +8995,7 @@ aarch64_madd_needs_nop (rtx_insn* insn)
   rtx_insn *prev;
   rtx body;
 
-  if (!aarch64_fix_a53_err835769)
+  if (!TARGET_FIX_ERR_A53_835769)
 return false;
 
   if (recog_memoized (insn) < 0)
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 2a097af..d2d1ebf 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -233,6 +233,20 @@ extern unsigned long aarch64_isa_flags;
 /* CRC instructions that can be enabled through +crc arch extension.  */
 #define TARGET_CRC32 (AARCH64_ISA_CRC)
 
+/* Make sure this is always defined so we don't have to check for ifdefs
+   but rather use normal ifs.  */
+#ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
+#define TARGET_FIX_ERR_A53_835769_DEFAULT 0
+#else
+#undef TARGET_FIX_ERR_A53_835769_DEFAULT
+#define TARGET_FIX_ERR_A53_835769_DEFAULT 1
+#endif
+
+/* Apply the workaround for Cortex-A53 erratum 835769.  */
+#define TARGET_FIX_ERR_A53_835769	\
+  ((aarch64_fix_a53_err835769 == 2)	\
+  ? TARGET_FIX_ERR_A53_835769_DEFAULT : aarch64_fix_a53_err835769)
+
 /* Standard register usage.  */
 
 /* 31 64-bit general purpose registers R0-R30:


[PATCH][AArch64][5/14] Make flag_omit_leaf_frame_pointer initialize to 2. Define and use TARGET_OMIT_LEAF_FRAME_POINTER

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch wraps flag_omit_leaf_frame_pointer into a 
TARGET_OMIT_LEAF_FRAME_POINTER macro and initializes 
flag_omit_leaf_frame_pointer to 2 instead of 1, allowing us to detect from 
flag_omit_leaf_frame_pointer whether the user explicitly specified 
-momit-leaf-frame-pointer or -mno-omit-leaf-frame-pointer. No functional 
changes in this patch.
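
A small sketch of why this is not a functional change, using hypothetical names (omit_leaf_fp, OMIT_LEAF_FP, omit_leaf_fp_explicit_p) rather than the real flag and macro: the macro tests for "non-zero", so the new initial value 2 behaves like the old initial value 1, while a separate check can now tell the initial value apart from an explicit option.

```c
/* Hypothetical option variable:
     2  initial value, neither form of the option was given
     1  the positive form of the option was given
     0  the negative form of the option was given  */
static int omit_leaf_fp = 2;

/* Same truth value for 1 and 2, so existing users see no change.  */
#define OMIT_LEAF_FP (omit_leaf_fp != 0)

/* Only the initial value 2 means that the user said nothing.  */
static int
omit_leaf_fp_explicit_p (void)
{
  return omit_leaf_fp != 2;
}
```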

Bootstrapped and tested as part of series on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.opt (momit-leaf-frame-pointer): Initialize
flag_omit_leaf_frame_pointer to 2.
* config/aarch64/aarch64.h (TARGET_OMIT_LEAF_FRAME_POINTER): New macro.
* config/aarch64/aarch64.c (aarch64_frame_pointer_required): Use above.
(aarch64_can_eliminate): Likewise.
commit fa05c48a4ede4f29583b129fa213c42aa2da3a73
Author: Kyrylo Tkachov 
Date:   Thu May 21 09:57:23 2015 +0100

[AArch64][5/N] Make flag_omit_leaf_frame_pointer initialize to 2. Define and use TARGET_OMIT_LEAF_FRAME_POINTER

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index aff23d6..bb404ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2205,7 +2205,7 @@ aarch64_frame_pointer_required (void)
  flag_omit_leaf_frame_pointer turns off the frame pointer by
  default.  Turn it back on now if we've not got a leaf
  function.  */
-  if (flag_omit_leaf_frame_pointer
+  if (TARGET_OMIT_LEAF_FRAME_POINTER
   && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
 return true;
 
@@ -4981,7 +4981,7 @@ aarch64_can_eliminate (const int from, const int to)
 	 LR in the function, then we'll want a frame pointer after all, so
 	 prevent this elimination to ensure a frame pointer is used.  */
   if (to == STACK_POINTER_REGNUM
-	  && flag_omit_leaf_frame_pointer
+	  && TARGET_OMIT_LEAF_FRAME_POINTER
 	  && df_regs_ever_live_p (LR_REGNUM))
 	return false;
 }
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index d2d1ebf..e91541a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -247,6 +247,9 @@ extern unsigned long aarch64_isa_flags;
   ((aarch64_fix_a53_err835769 == 2)	\
   ? TARGET_FIX_ERR_A53_835769_DEFAULT : aarch64_fix_a53_err835769)
 
+/* Omit frame pointer in leaf functions.  */
+#define TARGET_OMIT_LEAF_FRAME_POINTER (flag_omit_leaf_frame_pointer != 0)
+
 /* Standard register usage.  */
 
 /* 31 64-bit general purpose registers R0-R30:
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index c9c0aff..e29d606 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -77,7 +77,7 @@ Target Report RejectNegative Mask(STRICT_ALIGN)
 Don't assume that unaligned accesses are handled by the system
 
 momit-leaf-frame-pointer
-Target Report Save Var(flag_omit_leaf_frame_pointer) Init(1)
+Target Report Save Var(flag_omit_leaf_frame_pointer) Init(2)
 Omit the frame pointer in leaf functions
 
 mtls-dialect=


[PATCH][AArch64][2/14] Refactor arches handling, add arch enum identifier

2015-07-16 Thread Kyrill Tkachov

Hi all,

In this second patch I want to get to the point where I can get an enum that I 
can use
to index all_architectures to get the current architecture being used, similar 
to what we
do in patch 1/N.

The closest thing to what I want in aarch64-arches.def is the 3rd field which 
specifies the
architecture revision. Unfortunately, it is used sometimes as an integer and 
sometimes as a string
when defining the __ARM_ARCH macro in TARGET_CPU_CPP_BUILTINS.

I've decided to create a new field that is to be used as part of an enum name 
to uniquely identify
each entry in aarch64-arches.def. The revision number (currently only '8') is 
left there since we
need it for the ACLE predefs, but we might consider moving that out in the 
future...

In any case, with this patch we can now get an enum that can be used to access 
the architecture
information from all_architectures and can be easily saved and restored for 
SWITCHABLE_TARGET functionality.
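
The mechanism can be sketched without the real .def file. In the example below an in-file list macro (ARCH_LIST, an assumption made so the example is self-contained) stands in for aarch64-arches.def, and the CORE field is left out for brevity. The identifier field is pasted into an enum name, while the revision field is used both as an integer and, via stringification, as a string. The flag values and the struct layout are hypothetical.

```c
#include <stddef.h>

/* Stand-in for the .def file: NAME, ARCH_IDENT, ARCH_REV, FLAGS.  */
#define ARCH_LIST(X)                 \
  X ("armv8-a",   8A,   8, 0x3UL)    \
  X ("armv8.1-a", 8_1A, 8, 0xbUL)

/* First expansion: one enum value per entry, in table order.  */
enum arch_ident
{
#define X(NAME, IDENT, REV, FLAGS) ARCH_##IDENT,
  ARCH_LIST (X)
#undef X
  ARCH_NONE
};

struct arch_info
{
  const char *name;
  enum arch_ident ident;
  const char *rev_string;  /* Revision as a string, from #REV.  */
  unsigned rev;            /* Revision as an integer.  */
  unsigned long flags;
};

/* Second expansion: the table itself, so that
   all_architectures[ARCH_x].ident == ARCH_x holds by construction.  */
static const struct arch_info all_architectures[] =
{
#define X(NAME, IDENT, REV, FLAGS) {NAME, ARCH_##IDENT, #REV, REV, FLAGS},
  ARCH_LIST (X)
#undef X
  {NULL, ARCH_NONE, NULL, 0, 0}
};
```

Both entries share revision 8, which is why the revision alone cannot serve as a unique key and a separate identifier field is needed.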

Bootstrapped with and without LTO and tested on aarch64 as part of the series.

Ok for trunk?

Thanks,
Kyrill

P.S. I think we should consider creating a separate struct definition for cores 
and architectures as
the information we want to store about each starts to diverge and it's 
sometimes confusing as to what
a 'struct processor*' pointer is referencing. But such a refactoring would 
interfere too much with what
I'm trying to do in this patch series and is not strictly required for it. 
Although, once the dust settles
on this series, I believe it will be easier to split them up.

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Define
__ARM_ARCH_8A directly rather than with cpp_define_formatted.
* config/aarch64/aarch64.c (struct processor): Add arch field.
(all_architectures): Handle above, move above all_cores.
(all_cores): Handle above.
(aarch64_parse_arch): Handle above changes.
* config/aarch64/aarch64-arches.def (armv8-a): Extend according to
above.  Update comments.
(armv8.1-a): Likewise.
* config/aarch64/aarch64-cores.def: Update according to above.
* config/aarch64/aarch64-opts.h (aarch64_arch): New enum.
* config/aarch64/driver-aarch64.c (struct aarch64_arch): Rename to
aarch64_arch_driver_info.
commit ea24da31afa938379e1e679de581360b05f4e0f5
Author: Kyrylo Tkachov 
Date:   Mon May 11 12:09:34 2015 +0100

[AArch64][2/N] Refactor arches handling, add arch enum identifier

diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index abbfce6..3b4fb73 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -19,12 +19,17 @@
 
 /* Before using #include to read this file, define a macro:
 
-  AARCH64_ARCH(NAME, CORE, ARCH, FLAGS)
+  AARCH64_ARCH(NAME, CORE, ARCH_IDENT, ARCH_REV, FLAGS)
 
The NAME is the name of the architecture, represented as a string
constant.  The CORE is the identifier for a core representative of
-   this architecture.  ARCH is the architecture revision.  FLAGS are
-   the flags implied by the architecture.  */
+   this architecture.  ARCH_IDENT is the architecture identifier.  It must be
+   unique and be syntactically valid to appear as part of an enum identifier.
+   ARCH_REV is an integer specifying the architecture major revision.
+   FLAGS are the flags implied by the architecture.
+   Due to the assumptions about the positions of these fields in config.gcc,
+   the NAME should be kept as the first argument and FLAGS as the last.  */
+
+AARCH64_ARCH("armv8-a",	  generic,	 8A,	8,  AARCH64_FL_FOR_ARCH8)
+AARCH64_ARCH("armv8.1-a", generic,	 8_1A,	8,  AARCH64_FL_FOR_ARCH8_1)
 
-AARCH64_ARCH("armv8-a",	  generic,	 8,  AARCH64_FL_FOR_ARCH8)
-AARCH64_ARCH("armv8.1-a", generic,	 8,  AARCH64_FL_FOR_ARCH8_1)
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index c4e22fe..0ab1ca8 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -21,13 +21,14 @@
 
Before using #include to read this file, define a macro:
 
-  AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, COSTS, IMP, PART)
+  AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART)
 
The CORE_NAME is the name of the core, represented as a string constant.
The CORE_IDENT is the name of the core, represented as an identifier.
The SCHEDULER_IDENT is the name of the core for which scheduling decisions
will be made, represented as an identifier.
-   ARCH is the architecture revision implemented by the chip.
+   ARCH_IDENT is the architecture implemented by the chip as specified in
+   aarch64-arches.def.
FLAGS are the bitwise-or of the traits that apply to that core.
This need not include flags implied by the architecture.
COSTS is the name of the rtx_costs routine to use.
@@ -39,14 +40,15 @@
 
 /* V8 Architect

[PATCH][AArch64][1/14] Add ident field to struct processor

2015-07-16 Thread Kyrill Tkachov

Hi all,

This first patch adds a field to the processor structure that uniquely 
identifies that processor.
Note that the current 'core' field is actually just the core for which to 
schedule the instructions.
With this patch we get the nice property that we can reference a processor 
struct by just indexing
the all_cores at the index specified by the value of the 'ident' enum.
It's not hard to implement either, since we already construct the required enum 
values in
aarch64-opts.h and aarch64-cores.def already specifies the correct values for 
each core!

Thus, to implement the 'back up and restore' functionality we need for 
SWITCHABLE_TARGET the only thing we'd need
to save and restore on the tuning side is an aarch64_processor enum value.
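
A minimal sketch of the indexing property, with a cut-down struct and a three-entry table that are assumptions for illustration (the real table is generated from aarch64-cores.def and has more fields). Note how the "generic" entry has its own ident but is scheduled as cortexa53, which is the distinction the new field introduces.

```c
#include <stddef.h>

/* Hypothetical subset of the processor enum.  */
enum processor_id { cortexa53, cortexa57, generic, none };

struct processor
{
  const char *name;
  enum processor_id ident;       /* Uniquely identifies this entry.  */
  enum processor_id sched_core;  /* Core whose scheduling is used.  */
};

/* Entries appear in enum order, so all_cores[id].ident == id.  */
static const struct processor all_cores[] =
{
  {"cortex-a53", cortexa53, cortexa53},
  {"cortex-a57", cortexa57, cortexa57},
  {"generic",    generic,   cortexa53},
  {NULL,         none,      none}
};

/* Recover the full description from a saved enum value.  */
static const struct processor *
core_from_ident (enum processor_id id)
{
  return &all_cores[id];
}
```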

Bootstrapped with and without LTO and tested on aarch64 as part of series.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  

* config/aarch64/aarch64.c (struct processor): Add ident field.
Rename core to sched_core.
(all_cores): Handle above changes.
(all_architectures): Likewise.
(aarch64_parse_arch): Likewise.
(aarch64_override_options): Likewise.
commit 458da15dab42b6bf0e668be78230989694d7973d
Author: Kyrylo Tkachov 
Date:   Tue May 12 09:36:28 2015 +0100

[AArch64][1/N] Add ident field to struct processor

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 020f63c..75e0d70 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -498,7 +498,8 @@ aarch64_tuning_override_functions[] =
 struct processor
 {
   const char *const name;
-  enum aarch64_processor core;
+  enum aarch64_processor ident;
+  enum aarch64_processor sched_core;
   const char *arch;
   unsigned architecture_version;
   const unsigned long flags;
@@ -509,21 +510,22 @@ struct processor
 static const struct processor all_cores[] =
 {
 #define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
-  {NAME, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings},
+  {NAME, IDENT, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings},
 #include "aarch64-cores.def"
 #undef AARCH64_CORE
-  {"generic", cortexa53, "8", 8, AARCH64_FL_FOR_ARCH8, &generic_tunings},
-  {NULL, aarch64_none, NULL, 0, 0, NULL}
+  {"generic", generic, cortexa53, "8", 8,
+   AARCH64_FL_FOR_ARCH8, &generic_tunings},
+  {NULL, aarch64_none, aarch64_none, NULL, 0, 0, NULL}
 };
 
 /* Architectures implementing AArch64.  */
 static const struct processor all_architectures[] =
 {
 #define AARCH64_ARCH(NAME, CORE, ARCH, FLAGS) \
-  {NAME, CORE, #ARCH, ARCH, FLAGS, NULL},
+  {NAME, CORE, CORE, #ARCH, ARCH, FLAGS, NULL},
 #include "aarch64-arches.def"
 #undef AARCH64_ARCH
-  {NULL, aarch64_none, NULL, 0, 0, NULL}
+  {NULL, aarch64_none, aarch64_none, NULL, 0, 0, NULL}
 };
 
 /* Target specification.  These are populated as commandline arguments
@@ -7199,7 +7201,7 @@ aarch64_parse_arch (void)
 	  aarch64_isa_flags = selected_arch->flags;
 
 	  if (!selected_cpu)
-	selected_cpu = &all_cores[selected_arch->core];
+	selected_cpu = &all_cores[selected_arch->ident];
 
 	  if (ext != NULL)
 	{
@@ -7524,7 +7526,7 @@ aarch64_override_options (void)
 selected_tune = selected_cpu;
 
   aarch64_tune_flags = selected_tune->flags;
-  aarch64_tune = selected_tune->core;
+  aarch64_tune = selected_tune->sched_core;
   /* Make a copy of the tuning parameters attached to the core, which
  we may later overwrite.  */
   aarch64_tune_params = *(selected_tune->tune);


[PATCH][AArch64][0/14] Implement SWITCHABLE_TARGET, target attribute and pragma support

2015-07-16 Thread Kyrill Tkachov

Hi all,

This series implements the hooks required to enable SWITCHABLE_TARGET for the 
aarch64 port.
This series allows the aarch64 compiler to sanely handle LTO compilation of 
files compiled
with different target-specific options.

The first 5 patches refactor the option handling machinery and remove some 
clunky global state.
Then there are 2 patches to implement the target-switching hooks and enable 
SWITCHABLE_TARGET.

These are followed by patches to implement some target attributes that may be 
of use to users,
allowing them to set per-function cpu tuning and other options.
This is followed by implementation of target pragmas, tests and documentation 
and a couple of
patches fixing bugs that I encountered along the way.

Look at patch 13 for the documentation of the target attributes and pragmas 
that are made available
through this series.

These have been in my tree for about a month and have been bootstrapped and 
tested on aarch64
multiple times. LTO bootstrap works fine as well.

Ok for trunk?

Thanks,
Kyrill

[AArch64][1/14] Add ident field to struct processor
[AArch64][2/14] Refactor arches handling, add arch enum identifier
[AArch64][3/14] Refactor option override code
[AArch64][4/14] Create TARGET_FIX_ERR_A53_835769 and use that instead of 
aarch64_fix_a53_err835769
[AArch64][5/14] Make flag_omit_leaf_frame_pointer initialize to 2. Define and 
use TARGET_OMIT_LEAF_FRAME_POINTER
[AArch64][6/14] Implement TARGET_OPTION_SAVE/TARGET_OPTION_RESTORE
[AArch64][7/14] Implement TARGET_SET_CURRENT_FUNCTION
[AArch64][8/14] Implement TARGET_OPTION_VALID_ATTRIBUTE_P
[AArch64][9/14] Implement TARGET_CAN_INLINE_P
[AArch64][10/14] Implement target pragmas
[AArch64][11/14] Re-layout SIMD builtin types on builtin expansion
[AArch64][12/14] Target attributes and target pragmas tests
[doc][13/14] Document AArch64 target attributes and pragmas
[AArch64][14/14] Reuse target_option_current_node when passing pragma string to 
target attribute



Re: [Bug fortran/52846] [F2008] Support submodules - part 2/3 - redux

2015-07-16 Thread Paul Richard Thomas
Dear Reinhold, dear all,

Please find attached a new version of the patch that fixes the
inconsistency with the standard, pointed out by Reinhold. It is weird
but I read the appropriate part of the standard several times and
simply did not pick up the critical information :-)

Note that the delimiter used for submodule file names is '@', whereas
the delimiter used for the internal identifiers is '.'.
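
The two delimiters can be illustrated with a small sketch. It assumes the simplest case of one module and one submodule, and the helper names (submodule_filename, submodule_identifier, declared_name) are hypothetical; only the '@' and '.' delimiters, the ".smod" extension and the strchr-based extraction come from the patch.

```c
#include <stdio.h>
#include <string.h>

#define SUBMODULE_EXTENSION ".smod"

/* File name on disk: parts joined with '@', plus the extension.  */
static int
submodule_filename (char *buf, size_t size,
                    const char *module, const char *submodule)
{
  return snprintf (buf, size, "%s@%s%s", module, submodule,
                   SUBMODULE_EXTENSION);
}

/* Internal composite identifier: parts joined with '.'.  */
static int
submodule_identifier (char *buf, size_t size,
                      const char *module, const char *submodule)
{
  return snprintf (buf, size, "%s.%s", module, submodule);
}

/* Pick the declared submodule name out of a composite identifier,
   i.e. the text after the first '.'.  Unlike the decl.c hunk, this
   sketch falls back to the whole string when no '.' is present.  */
static const char *
declared_name (const char *composite)
{
  const char *dot = strchr (composite, '.');
  return dot ? dot + 1 : composite;
}
```

Keeping '@' out of the internal identifier and '.' out of the file name stem means neither form can be mistaken for the other.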

I have added a procedure to cleanup submodules produced by the
testsuite and implemented them in submodule_[1-8].f90. Submodule_8.f90
tests the resolution of the spurious error found by Reinhold.

Bootstraps and regtests on FC_21/x86_64 - OK for trunk?

Paul

2015-07-16  Paul Thomas  

PR fortran/52846
* decl.c (gfc_match_end): Pick out declared submodule name from
the composite identifier.
* gfortran.h : Add 'submodule_name' to gfc_use_list structure.
* module.c (gfc_match_submodule): Define submodule_name and add
static 'submodule_name'.
(gfc_match_submodule): Build up submodule filenames, using '@'
as a delimiter. Store the output filename in 'submodule_name'.
Similarly, the submodule identifier is built using '.' as a
delimiter.
(gfc_dump_module): If current state is COMP_SUBMODULE, write
to file 'submodule_name', using SUBMODULE_EXTENSION.
(gfc_use_module): Similarly, use the 'submodule_name' field in
the gfc_use_list structure and SUBMODULE_EXTENSION to read the
implicitly used submodule files.

2015-07-16  Paul Thomas  

PR fortran/52846
* lib/fortran-modules.exp (proc cleanup-submodules): New proc.
* gfortran.dg/submodule_1.f90: Clean up submodules
* gfortran.dg/submodule_2.f90: Clean up submodules
* gfortran.dg/submodule_3.f90: Clean up submodules
* gfortran.dg/submodule_4.f90: Clean up submodules
* gfortran.dg/submodule_5.f90: Clean up submodules
* gfortran.dg/submodule_6.f90: Clean up submodules
* gfortran.dg/submodule_7.f90: Clean up submodules
* gfortran.dg/submodule_8.f90: New test






On 14 July 2015 at 13:10, Paul Richard Thomas
 wrote:
> Dear All,
>
> Reinhold Bader has pointed out that naming the submodule files after
> the submodule name and using .mod as the extension can potentially
> lead to clashes. Therefore, I have written a patch to change gfortran
> to follow the naming convention of another leading brand:
>
> submodule filename = module@ancestor@@submodule.smod
>
> The implementation is straightforward and the ChangeLog and the patch
> provide an adequate description.
>
> Bootstraps and regtests on x86_64 - OK for trunk?
>
> Paul
>
> 2015-07-14  Paul Thomas  
>
> PR fortran/52846
> * gfortran.h : Add 'submodule_name' to gfc_use_list structure.
> * module.c (gfc_match_submodule): Define submodule_name and add
> static 'submodule_name'.
> (gfc_match_submodule): Build up submodule filenames, using '@'
> as a delimiter. Store the output filename in 'submodule_name'.
> (gfc_dump_module): If current state is COMP_SUBMODULE, write
> to file 'submodule_name', using SUBMODULE_EXTENSION.
> (gfc_use_module): Similarly, use the 'submodule_name' field in
> the gfc_use_list structure and SUBMODULE_EXTENSION to read the
> implicitly used submodule files.



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 225410)
--- gcc/fortran/decl.c  (working copy)
*** gfc_match_end (gfc_statement *st)
*** 6451,6456 
--- 6451,6461 
if (block_name == NULL)
  goto syntax;

+   /* We have to pick out the declared submodule name from the composite
+  required by F2008:11.2.3 para 2, which ends in the declared name.  */
+   if (state == COMP_SUBMODULE)
+ block_name = strchr (block_name, '.') + 1;
+
if (strcmp (name, block_name) != 0 && strcmp (block_name, "ppr@") != 0)
  {
gfc_error ("Expected label %qs for %s statement at %C", block_name,
Index: gcc/fortran/gfortran.h
===
*** gcc/fortran/gfortran.h  (revision 225410)
--- gcc/fortran/gfortran.h  (working copy)
*** gfc_use_rename;
*** 1556,1561 
--- 1556,1562 
  typedef struct gfc_use_list
  {
const char *module_name;
+   const char *submodule_name;
bool intrinsic;
bool non_intrinsic;
bool only_flag;
Index: gcc/fortran/module.c
===
*** gcc/fortran/module.c(revision 225410)
--- gcc/fortran/module.c(working copy)
*** along with GCC; see the file COPYING3.
*** 82,87 
--- 82,88 
  #include 

  #define MODULE_EXTENSION ".mod"
+ #define SUBMODULE_EXTENSION ".smod"

  /* Don't put any single quote (') in MOD_VERSION, if you want it to be
 recognized.  */
*** static gzFile module_fp;
*** 191,196

Re: [gomp4.1] C++ iterators with #omp ordered depend(sink:)

2015-07-16 Thread Jakub Jelinek
On Thu, Jul 16, 2015 at 07:57:56AM -0700, Aldy Hernandez wrote:
> @@ -13828,6 +13828,14 @@ tsubst_omp_for_iterator (tree t, int i, tree declv, 
> tree initv,
>  
>init = TREE_VEC_ELT (OMP_FOR_INIT (t), i);
>gcc_assert (TREE_CODE (init) == MODIFY_EXPR);
> +
> +  if (orig_declv && OMP_FOR_ORIG_DECLS (t))
> +{
> +  tree o = TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (t), i);
> +  o = RECUR (o);
> +  TREE_VEC_ELT (orig_declv, i) = o;

Please fold the above two lines into:
TREE_VEC_ELT (orig_declv, i) = RECUR (o);
No need to retest.

Ok with that change, thanks.

Jakub


Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-16 Thread Richard Earnshaw
On 16/07/15 16:00, Michael Matz wrote:
> Hi,
> 
> On Thu, 16 Jul 2015, Richard Earnshaw wrote:
> 
>>>>> Now that we do have the problem, we can't fix it without an ARM port 
>>>>> ABI change, which is undesirable, so we may have to fix it with a MI 
>>>>> change.
>>>>
>>>> What's the ABI implication of fixing the inconsistency?
>>>
>>
>> I think that's the wrong question.  We wouldn't change the ABI to fix an 
>> internal problem in GCC.  So the real question is what's the performance 
>> impact of changing PROMOTE_MODE to be the same as the ABI requirements?
> 
> Perhaps, I really only wanted to get a feeling what type of changes 
> in code generation would result with the flip.  I wonder why this ABI 
> implication was no problem back when PROMOTE_MODE and 
> target.promote_function_mode were separated and the inconsistency 
> introduced.
> 
> 
> Ciao,
> Michael.
> 

Promote_function_mode requirements were new with the EABI.  However, I
think it's probably only recent changes in the mid-end that have exposed
the problem.

R.


Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE

2015-07-16 Thread Michael Matz
Hi,

On Thu, 16 Jul 2015, Richard Earnshaw wrote:

> >>> Now that we do have the problem, we can't fix it without an ARM port 
> >>> ABI change, which is undesirable, so we may have to fix it with a MI 
> >>> change.
> >>
> >> What's the ABI implication of fixing the inconsistency?
> > 
> 
> I think that's the wrong question.  We wouldn't change the ABI to fix an 
> internal problem in GCC.  So the real question is what's the performance 
> impact of changing PROMOTE_MODE to be the same as the ABI requirements?

Perhaps, I really only wanted to get a feeling what type of changes 
in code generation would result with the flip.  I wonder why this ABI 
implication was no problem back when PROMOTE_MODE and 
target.promote_function_mode were separated and the inconsistency 
introduced.


Ciao,
Michael.


Re: [gomp4.1] C++ iterators with #omp ordered depend(sink:)

2015-07-16 Thread Aldy Hernandez

On 07/16/2015 12:06 AM, Jakub Jelinek wrote:

Hi!

CCing Jason on a tsubst issue below.

On Wed, Jul 15, 2015 at 06:05:06PM -0700, Aldy Hernandez wrote:

As I said on IRC, this looks ugly, but I can see why you wouldn't want the
extra word on all loop variants.  I've implemented it as requested.


Thanks.


commit f55eced4ac6b045101a90914a8f27e99d26cfddf
Author: Aldy Hernandez 
Date:   Tue Jul 14 19:23:09 2015 -0700

* gimplify.c (gimplify_omp_for): Use OMP_FOR_ORIG_DECLS.
* tree.def (omp_for): Add new operand.
* tree.h (OMP_FOR_ORIG_DECLS): New macro.
 c-family/
* c-common.h (c_finish_omp_for): Add argument.
* c-omp.c (c_finish_omp_for): Set OMP_FOR_ORIG_DECLS.
 cp/
* cp-tree.h (finish_omp_for): Add new argument.
* parser.c (cp_parser_omp_for_loop): Pass new argument.
* pt.c (tsubst_omp_for_iterator): New argument orig_declv.
Set OMP_FOR_ORIG_DECLS from orig_declv if available.
(tsubst_expr): Pass new vector to tsubst_omp_for_iterator.
* semantics.c (finish_omp_for): Pass original DECLs to
c_finish_omp_for.
Set OMP_FOR_ORIG_DECLS.
 c/
* c-parser.c (c_parser_omp_for_loop): Pass new argument to
c_finish_omp_for.


Can you please add the
testsuite/
additions to ChangeLog too?


@@ -13828,6 +13828,16 @@ tsubst_omp_for_iterator (tree t, int i, tree declv, 
tree initv,

init = TREE_VEC_ELT (OMP_FOR_INIT (t), i);
gcc_assert (TREE_CODE (init) == MODIFY_EXPR);
+
+  if (orig_declv && OMP_FOR_ORIG_DECLS (t))
+{
+  tree o = TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (t), i);
+  tree spec = retrieve_local_specialization (o);
+  if (spec)
+   o = spec;


Why doesn't o = RECUR (o); work here?  Or tsubst_copy?
 From my experience tsubst_expr can result in (here undesirable) 
convert_from_reference
if the decl has type of e.g. template parameter and you instantiate with
some reference type.  But, that can only happen if the decl is dependent,
see below:


Well, apparently I was covering something that fixing the dependent 
issue fixed.





@@ -14468,6 +14479,8 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl,
if (OMP_FOR_INIT (t) != NULL_TREE)
  {
declv = make_tree_vec (TREE_VEC_LENGTH (OMP_FOR_INIT (t)));
+   if (TREE_CODE (t) == OMP_FOR)
+ orig_declv = make_tree_vec (TREE_VEC_LENGTH (OMP_FOR_INIT (t)));


I'd have expected to guard this also with
if (TREE_CODE (t) == OMP_FOR && OMP_FOR_ORIG_DECLS (t) != NULL_TREE)


@@ -7302,6 +7303,9 @@ finish_omp_for (location_t locus, enum tree_code code, 
tree declv, tree initv,
TREE_VEC_ELT (initv, i) = init;
  }

+  if (!orig_declv || !TREE_VEC_LENGTH (orig_declv))
+orig_declv = copy_node (declv);


When is TREE_VEC_LENGTH 0?  Furthermore, I'd do this only after the
if (dependent_omp_for_p ())


fixed.




+
if (dependent_omp_for_p (declv, initv, condv, incrv))
  {
tree stmt;
@@ -7325,6 +7329,8 @@ finish_omp_for (location_t locus, enum tree_code code, 
tree declv, tree initv,
OMP_FOR_BODY (stmt) = body;
OMP_FOR_PRE_BODY (stmt) = pre_body;
OMP_FOR_CLAUSES (stmt) = clauses;
+  if (code == OMP_FOR)
+   OMP_FOR_ORIG_DECLS (stmt) = orig_declv;


And leave this out.  If something is dependent, we don't really modify it
before instantiation, thus there is no need to make a copy.  During
instantiation another finish_omp_for will be called, then declv, initv,
condv, incrv won't be dependent and we can orig_declv = copy_node (declv).


fixed.
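
For readers following along, here is a minimal sketch of the case being
discussed.  It is not from the patch or its testsuite; fill_chain is a
made-up name, and a plain integer type stands in for a real C++ iterator.
Because the loop variable has the dependent type T, the loop stays
unmodified in the template and is only finalized when finish_omp_for runs
again at instantiation time.

```cpp
#include <cassert>

// Hypothetical helper, not part of the patch.  The loop variable has the
// dependent type T, so the OpenMP loop can only be finalized once the
// template is instantiated.
template <typename T>
long
fill_chain (long *a, T n)
{
  assert (n >= 1);
  a[0] = 1;
#pragma omp parallel for ordered(1)
  for (T i = 1; i < n; ++i)
    {
      // Wait until iteration i - 1 has posted its result.
#pragma omp ordered depend(sink: i - 1)
      a[i] = a[i - 1] + 1;
      // Signal that iteration i is complete.
#pragma omp ordered depend(source)
    }
  return a[n - 1];
}
```

Without -fopenmp the pragmas are ignored and the loop runs serially; the
cross-iteration dependence is meant to give the same result either way.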




diff --git a/gcc/testsuite/c-c++-common/gomp/sink-2.c 
b/gcc/testsuite/c-c++-common/gomp/sink-2.c
new file mode 100644
index 000..7a075d4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/sink-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+void bar (int *);
+
+void
+foo ()
+{
+  int i,j;
+#pragma omp parallel for ordered(1)
+  for (i=0; i < 100; ++i)
+{
+#pragma omp ordered depend(sink:i-1)
+bar(&i);


Can you please add #pragma omp ordered depend(source) below the bar call?


sure.

Retested on x86-64 Linux.

OK for branch?


commit 9d6a83abf8759b72b75f2690eab33f8b4d3fc199
Author: Aldy Hernandez 
Date:   Tue Jul 14 19:23:09 2015 -0700

* gimplify.c (gimplify_omp_for): Use OMP_FOR_ORIG_DECLS.
* tree.def (omp_for): Add new operand.
* tree.h (OMP_FOR_ORIG_DECLS): New macro.
c-family/
* c-common.h (c_finish_omp_for): Add argument.
* c-omp.c (c_finish_omp_for): Set OMP_FOR_ORIG_DECLS.
cp/
* cp-tree.h (finish_omp_for): Add new argument.
* parser.c (cp_parser_omp_for_loop): Pass new argument.
* pt.c (tsubst_omp_for_iterator): New argument orig_declv.
Set OMP_FOR_ORIG_DECLS from orig_declv if available.
(tsubst_expr): Pass new vector to tsubst_omp_for_iterator.
* semantics.c (finish_omp_for): Pass original DECLs to
c_finish_omp_for.
 

Re: [PATCH] Add 'switch' statement to match.pd language

2015-07-16 Thread Michael Matz
Hi,

On Thu, 16 Jul 2015, Richard Biener wrote:

> > Similar, if the condition is an atom you should be able to leave the 
> > parens away:
> > 
> >  (switch
> >   cond (minus @0 @1)
> >  )
> > 
> > (given a predicate 'cond' defined appropriately).
> 
> Yes.  Though technically the condition cannot be an atom because
> it has to be a c-expr and that has no notion of atom vs. no-atom.

"1" is a valid c-expr, and quite atomy :)  (Or "true")

> But the issue is to unambiguously parse the else clause, thus

Ah, yes, I remember, the c-expr vs expr case; the parser is too limited :)  
In that case I find the extra keyword without parens even better:

(switch
 when (bla) (foo)
 when (bar) (boo)
 (blob))

I.e. following 'when' it's a c-expr (if a single token, parens optional); 
when not following 'when' it's a result expr (atomic or not).  Think of it 
as an infix multi-part keyword (like smalltalk has multi-part method 
names), the keyword(s) being "switch(when)*".
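
A rough sketch of that reading rule, written as stand-alone C++ rather than
as a change to genmatch: split_items, parse_switch_body and switch_expr are
invented names, and conditions and results are kept as raw strings instead
of real c-expr/expr nodes.  An item is either a bare atom or a balanced
parenthesised group; after 'when' the next two items are the condition and
its result, and a trailing item not introduced by 'when' is the default.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Split the body of a (switch ...) into top-level items: bare atoms such
// as "@0" or "true", and balanced groups such as "(plus @0 @1)".
static std::vector<std::string>
split_items (const std::string &body)
{
  std::vector<std::string> items;
  std::size_t i = 0;
  while (i < body.size ())
    {
      unsigned char c = static_cast<unsigned char> (body[i]);
      if (std::isspace (c))
        {
          ++i;
          continue;
        }
      if (c == ')')
        throw std::runtime_error ("unexpected ')'");
      std::size_t start = i;
      if (c == '(')
        {
          // Consume up to the matching close paren.
          int depth = 0;
          do
            {
              if (body[i] == '(')
                ++depth;
              else if (body[i] == ')')
                --depth;
              ++i;
            }
          while (depth > 0 && i < body.size ());
          if (depth != 0)
            throw std::runtime_error ("unbalanced parentheses");
        }
      else
        {
          // Consume an atom up to whitespace or a paren.
          while (i < body.size ()
                 && !std::isspace (static_cast<unsigned char> (body[i]))
                 && body[i] != '(' && body[i] != ')')
            ++i;
        }
      items.push_back (body.substr (start, i - start));
    }
  return items;
}

struct switch_expr
{
  // (condition, result) pairs in source order.
  std::vector<std::pair<std::string, std::string> > clauses;
  // Trailing result without a 'when'; empty if absent.
  std::string default_result;
};

// Treat "switch (when COND RESULT)* [DEFAULT]" as one multi-part keyword.
static switch_expr
parse_switch_body (const std::string &body)
{
  std::vector<std::string> items = split_items (body);
  switch_expr sw;
  std::size_t i = 0;
  while (i < items.size ())
    {
      if (items[i] == "when")
        {
          if (i + 2 >= items.size ())
            throw std::runtime_error ("'when' needs a condition and a result");
          sw.clauses.push_back (std::make_pair (items[i + 1], items[i + 2]));
          i += 3;
        }
      else
        {
          if (i + 1 != items.size ())
            throw std::runtime_error ("default result must come last");
          sw.default_result = items[i];
          ++i;
        }
    }
  return sw;
}
```

Note that this sketch only handles single-token or fully parenthesised
conditions; it makes no attempt at function-call conditions.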

I'm undecided if I'd allow function calls as atoms as well (because they 
contain parens), like so:

(switch
 when integer_zero(@0) @1
 when integer_zero(@1) @0
 (plus @0 @1))

This would mean that there would be no single-token conditions without 
parens where one could leave out outer parens, as otherwise you have an 
ambiguity between:

(switch
 when true (@0)    // (@0) is the result
 ...)

and

(switch
 when token(@0) @1    // (@0) belongs to the when-expr
 ...)

One has to choose between one or the other, and I think the latter (i.e. 
function calls as lone when-expr) occurs more often.

(Limiting the number of parens is worthwhile IMHO, but you probably 
guessed that much already :))


Ciao,
Michael.


Re: [Fortran, Patch] Passing function pointer to co_reduce

2015-07-16 Thread Damian Rouson


> On Jul 15, 2015, at 8:58 AM, Mikael Morin  wrote:
> 
> The patch itself looks good to me.
> A ChangeLog entry should be provided with it.
> The test is missing the usual dejagnu pattern matching directives to
> check the generated code.
> Do you have commit rights?

Hi Mikael,

Thanks for the quick review.   I know Alessandro is not a reviewer. Assuming 
commit rights go with review rights, I’m guessing he doesn’t have commit 
rights.  I’ll let him confirm.  If he doesn’t have commit rights, could you 
commit it for us after he adds the DejaGnu pattern-matching directives?  
Alternatively, if it’s easy, please feel free to add the directives and commit. 
 

Your fast response is very helpful to us.  The co_reduce collective is the 
final feature that we hope to finish before taking a snapshot of OpenCoarrays 
and designating it as our 1.0.0 release. :)   Toward that end, could this fix 
also be applied to the GCC 5 branch?  Soon after posting a tar ball of 
OpenCoarrays 1.0.0 on the web, Alessandro will submit a Portfile to the 
Macports developers that will enable OS X users to install OpenCoarrays 1.0.0 
for use with GCC 5. It would be great for Macports users to have co_reduce when 
the next GCC 5 update releases.

Damian



[PATCH, MIPS] Scheduling for M51xx core family

2015-07-16 Thread Robert Suchanek
Hi,

Another patch with a pipeline description but for M51xx cores with
two new options introduced: -march={m5100,m5101}. The M5101 is essentially
the same as M5100 but mapped to -msoft-float.

Ok to apply?

Regards,
Robert

2015-07-16  Prachi Godbole  

gcc/

* config/mips/m5100.md: New file.
* config/mips/mips-cpus.def (m5100, m5101): Define.
* config/mips/mips-tables.opt: Regenerate.
* config/mips/mips.c (mips_rtx_cost_data): Add costs for m5100.
* config/mips/mips.h (MIPS_ISA_LEVEL_SPEC): Map -march=m5100 and
-march=m5101 to -mips32r5.
(MIPS_ARCH_FLOAT_SPEC): Map -m5101 to -msoft-float.
(MIPS_ISA_NAN2008_SPEC): Map -march=m51* to -mnan=2008 if
!-msoft-float.
* config/mips/mips.md: Include m5100.md.
(processor): Add m5100.
* doc/invoke.texi (-march=@var{arch}): Add m5100, m5101.
---
 gcc/config/mips/m5100.md| 220 
 gcc/config/mips/mips-cpus.def   |   2 +
 gcc/config/mips/mips-tables.opt |  40 
 gcc/config/mips/mips.c  |  13 +++
 gcc/config/mips/mips.h  |   7 +-
 gcc/config/mips/mips.md |   2 +
 gcc/doc/invoke.texi |   1 +
 7 files changed, 265 insertions(+), 20 deletions(-)
 create mode 100644 gcc/config/mips/m5100.md

diff --git a/gcc/config/mips/m5100.md b/gcc/config/mips/m5100.md
new file mode 100644
index 000..f860eb2
--- /dev/null
+++ b/gcc/config/mips/m5100.md
@@ -0,0 +1,220 @@
+;; DFA-based pipeline description for MIPS32 models M5100.
+;;
+;; Copyright (C) 2015 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "m51_alu_pipe, m51_mdu_pipe, m51_fpu_pipe")
+(define_cpu_unit "m51_mul" "m51_mdu_pipe")
+(define_cpu_unit "m51_alu" "m51_alu_pipe")
+(define_cpu_unit "m51_fpu" "m51_fpu_pipe")
+
+;; --
+;; ALU Instructions
+;; --
+
+;; ALU: Logicals
+(define_insn_reservation "m51_int_logical" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "logical,move,signext,slt"))
+  "m51_alu")
+
+;; Arithmetics
+(define_insn_reservation "m51_int" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "arith,const,shift,clz"))
+  "m51_alu")
+
+(define_insn_reservation "m51_int_nop" 0
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "nop"))
+  "nothing")
+
+;; Conditional move
+(define_insn_reservation "m51_int_cmove" 1
+  (and (eq_attr "cpu" "m5100")
+   (and (eq_attr "type" "condmove")
+   (eq_attr "mode" "SI,DI")))
+  "m51_alu")
+
+;; Call
+(define_insn_reservation "m51_int_call" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "call"))
+  "m51_alu")
+
+;; branch/jump
+(define_insn_reservation "m51_int_jump" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "branch,jump"))
+  "m51_alu")
+
+;; loads: lb, lbu, lh, lhu, ll, lw, lwl, lwr, lwpc, lwxs
+;; prefetch: prefetch, prefetchx
+(define_insn_reservation "m51_int_load" 3
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "load,prefetch,prefetchx"))
+  "m51_alu")
+
+;; stores
+(define_insn_reservation "m51_int_store" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "store"))
+  "m51_alu")
+
+;; --
+;; MDU Instructions
+;; --
+
+;; High performance fully pipelined multiplier
+;; MULT to HI/LO
+(define_insn_reservation "m51_int_mult" 2
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "imul,imadd"))
+  "m51_alu+m51_mul*2")
+
+;; MUL to GPR
+(define_insn_reservation "m51_int_mul3" 2
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "imul3"))
+  "(m51_alu*2)+(m51_mul*2)")
+
+;; mfhi, mflo
+(define_insn_reservation "m51_int_mfhilo" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "mfhi,mflo"))
+  "m51_mul")
+
+;; mthi, mtlo
+(define_insn_reservation "m51_int_mthilo" 1
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "mthi,mtlo"))
+  "m51_mul")
+
+;; div
+(define_insn_reservation "m51_int_div_si" 34
+  (and (eq_attr "cpu" "m5100")
+   (eq_attr "type" "idiv"))
+  "m51_alu+m51_mul*34")
+
+;; --
+;; Floating Point Instructions
+;; -

[PATCH, MIPS] I6400 scheduling

2015-07-16 Thread Robert Suchanek
Hi,

This patch adds a pipeline description for the I6400 processor with -mips32r6
and -mips64r6 defaulted to this description.

Regtested with mips-img-linux-gnu. mips-tables.opt will be regenerated before
committing depending on which patch from the series goes in first.

Ok to apply?

Regards,
Robert

2015-07-16  Prachi Godbole  

gcc/
* config/mips/i6400.md: New file.
* config/mips/mips-cpus.def (mips32r6): Change to PROCESSOR_I6400.
(mips64r6): Likewise.
(i6400): Define.
* config/mips/mips-tables.opt: Regenerate.
* config/mips/mips.c (mips_rtx_cost_data): Add I6400 processor.
(mips_issue_rate): Add support for i6400.
(mips_multipass_dfa_lookahead): Likewise.
* config/mips/mips.h (TUNE_I6400): Define.
* config/mips/mips.md: Include i6400.md.
(processor): Add i6400.
* doc/invoke.texi (-march=@var{arch}): Add i6400.
---
 gcc/config/mips/i6400.md| 142 
 gcc/config/mips/mips-cpus.def   |   7 +-
 gcc/config/mips/mips-tables.opt |   3 +
 gcc/config/mips/mips.c  |  16 -
 gcc/config/mips/mips.h  |   3 +-
 gcc/config/mips/mips.md |   2 +
 gcc/doc/invoke.texi |   1 +
 7 files changed, 170 insertions(+), 4 deletions(-)
 create mode 100644 gcc/config/mips/i6400.md

diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
new file mode 100644
index 000..101a20c
--- /dev/null
+++ b/gcc/config/mips/i6400.md
@@ -0,0 +1,142 @@
+;; DFA-based pipeline description for I6400.
+;;
+;; Copyright (C) 2007-2015 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "i6400_int_pipe, i6400_mdu_pipe, i6400_fpu_short_pipe,
+  i6400_fpu_long_pipe")
+
+(define_cpu_unit "i6400_gpmuldiv" "i6400_mdu_pipe")
+(define_cpu_unit "i6400_agen, i6400_alu1, i6400_lsu" "i6400_int_pipe")
+(define_cpu_unit "i6400_control, i6400_ctu, i6400_alu0" "i6400_int_pipe")
+
+;; Short FPU pipeline.
+(define_cpu_unit "i6400_fpu_short" "i6400_fpu_short_pipe")
+
+;; Long FPU pipeline.
+(define_cpu_unit "i6400_fpu_long, i6400_fpu_apu" "i6400_fpu_long_pipe")
+
+(define_reservation "i6400_control_ctu" "i6400_control, i6400_ctu")
+(define_reservation "i6400_control_alu0" "i6400_control, i6400_alu0")
+(define_reservation "i6400_agen_lsu" "i6400_agen, i6400_lsu")
+(define_reservation "i6400_agen_alu1" "i6400_agen, i6400_alu1")
+
+;;
+;; FPU pipe
+;;
+
+;; fabs, fneg
+(define_insn_reservation "i6400_fpu_fabs" 1
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fabs,fneg,fmove"))
+  "i6400_fpu_short, i6400_fpu_apu")
+
+;; fadd, fsub, fcvt
+(define_insn_reservation "i6400_fpu_fadd" 4
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fadd, fcvt"))
+  "i6400_fpu_long, i6400_fpu_apu")
+
+;; fmul
+(define_insn_reservation "i6400_fpu_fmul" 5
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fmul"))
+  "i6400_fpu_long, i6400_fpu_apu")
+
+;; div, sqrt (Double Precision)
+(define_insn_reservation "i6400_fpu_div_df" 30
+  (and (eq_attr "cpu" "i6400")
+   (and (eq_attr "mode" "DF")
+   (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")))
+  "i6400_fpu_long+i6400_fpu_apu*30")
+
+;; div, sqrt (Single Precision)
+(define_insn_reservation "i6400_fpu_div_sf" 22
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt"))
+  "i6400_fpu_long+i6400_fpu_apu*22")
+
+;;
+;; Integer pipe
+;;
+
+;; and, lui, shifts, seb, seh
+(define_insn_reservation "i6400_int_logical" 1
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "move_type" "logical,const,andi,sll0,signext"))
+  "i6400_control_alu0 | i6400_agen_alu1")
+
+;; addi, addiu, ori, xori, add, addu, sub, nor
+(define_insn_reservation "i6400_int_add" 1
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "alu_type" "add,sub,or,xor,nor"))
+  "i6400_control_alu0 | i6400_agen_alu1")
+
+;; shifts, clo, clz, cond move, arith
+(define_insn_reservation "i6400_int_arith" 1
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "shift,slt,move,clz,condmove,arith"))
+  "i6400_control_alu0 | i6400_agen_alu1")
+
+;; nop
+(define_insn_reservation "i6400_int_nop" 0
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "nop"))
+  "nothing")
+
+;; mult, multu, mul
+(define_insn_reservation "i6400_int_mult" 4
+  (and (eq_attr "cpu" "i6400")
+

[PATCH, MIPS] Add -march=interaptiv

2015-07-16 Thread Robert Suchanek
Hi,

As in the title, the attached patch adds -march=interaptiv defined to 24kf2_1,
mapped to -mips32r2 and -mdsp. 
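
As a sketch of how such a MIPS_CPU entry is typically consumed (this is not
the actual mips.c code): the .def file is an X-macro list that gets expanded
into a lookup table.  The three entries below are copied from the
mips-cpus.def hunk in this patch; the enum, the cpu_info struct, find_cpu
and the value given to PTF_AVOID_BRANCHLIKELY are placeholders made up for
the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Placeholder processor identifiers for this sketch only.
enum processor { PROCESSOR_24KF2_1, PROCESSOR_24KF1_1, PROCESSOR_P5600 };

// Placeholder value; only used so the p5600 entry compiles.
#define PTF_AVOID_BRANCHLIKELY 0x1

struct cpu_info
{
  const char *name;   // Name accepted by -march=.
  processor cpu;      // Scheduling model to tune for.
  int isa;            // ISA level number as written in the .def file.
  int tune_flags;     // PTF_* flags.
};

// Stand-in for the contents of mips-cpus.def.
#define CPU_LIST \
  MIPS_CPU ("1004kf1_1", PROCESSOR_24KF1_1, 33, 0) \
  MIPS_CPU ("interaptiv", PROCESSOR_24KF2_1, 33, 0) \
  MIPS_CPU ("p5600", PROCESSOR_P5600, 36, PTF_AVOID_BRANCHLIKELY)

// Expand every MIPS_CPU entry into one table row.
static const cpu_info cpu_table[] = {
#define MIPS_CPU(NAME, CPU, ISA, FLAGS) { NAME, CPU, ISA, FLAGS },
  CPU_LIST
#undef MIPS_CPU
};

// Return the table row for NAME, or a null pointer if it is unknown.
static const cpu_info *
find_cpu (const char *name)
{
  for (std::size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; ++i)
    if (std::strcmp (cpu_table[i].name, name) == 0)
      return &cpu_table[i];
  return NULL;
}
```

In this model "interaptiv" simply reuses the 24KF2_1 scheduling entry with
the same ISA number as the neighbouring 1004k entries, which matches what
the one-line mips-cpus.def change does.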

OK to apply?

Regards,
Robert

gcc/
* config/mips/mips-cpus.def (interaptiv): Define.
* config/mips/mips-tables.opt: Regenerate.
* config/mips/mips.h (MIPS_ISA_LEVEL_SPEC): Map -march=interaptiv to
-mips32r2.
(BASE_DRIVER_SELF_SPECS): Likewise but map to -mdsp.
* doc/invoke.texi (-march=@var{arch}): Add interaptiv.
---
 gcc/config/mips/mips-cpus.def   |  2 ++
 gcc/config/mips/mips-tables.opt | 39 +--
 gcc/config/mips/mips.h  |  6 --
 gcc/doc/invoke.texi |  1 +
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/gcc/config/mips/mips-cpus.def b/gcc/config/mips/mips-cpus.def
index fb4bae0..63a0d6e 100644
--- a/gcc/config/mips/mips-cpus.def
+++ b/gcc/config/mips/mips-cpus.def
@@ -147,6 +147,8 @@ MIPS_CPU ("1004kf2_1", PROCESSOR_24KF2_1, 33, 0)
 MIPS_CPU ("1004kf", PROCESSOR_24KF2_1, 33, 0)
 MIPS_CPU ("1004kf1_1", PROCESSOR_24KF1_1, 33, 0)
 
+MIPS_CPU ("interaptiv", PROCESSOR_24KF2_1, 33, 0)
+
 /* MIPS32 Release 5 processors.  */
 MIPS_CPU ("p5600", PROCESSOR_P5600, 36, PTF_AVOID_BRANCHLIKELY)
 
diff --git a/gcc/config/mips/mips-tables.opt b/gcc/config/mips/mips-tables.opt
index 59124a6..8c6c4b1 100644
--- a/gcc/config/mips/mips-tables.opt
+++ b/gcc/config/mips/mips-tables.opt
@@ -631,56 +631,59 @@ EnumValue
 Enum(mips_arch_opt_value) String(r1004kf1_1) Value(84)
 
 EnumValue
-Enum(mips_arch_opt_value) String(p5600) Value(85) Canonical
+Enum(mips_arch_opt_value) String(interaptiv) Value(85) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(5kc) Value(86) Canonical
+Enum(mips_arch_opt_value) String(p5600) Value(86) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(r5kc) Value(86)
+Enum(mips_arch_opt_value) String(5kc) Value(87) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(5kf) Value(87) Canonical
+Enum(mips_arch_opt_value) String(r5kc) Value(87)
 
 EnumValue
-Enum(mips_arch_opt_value) String(r5kf) Value(87)
+Enum(mips_arch_opt_value) String(5kf) Value(88) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(20kc) Value(88) Canonical
+Enum(mips_arch_opt_value) String(r5kf) Value(88)
 
 EnumValue
-Enum(mips_arch_opt_value) String(r20kc) Value(88)
+Enum(mips_arch_opt_value) String(20kc) Value(89) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(sb1) Value(89) Canonical
+Enum(mips_arch_opt_value) String(r20kc) Value(89)
 
 EnumValue
-Enum(mips_arch_opt_value) String(sb1a) Value(90) Canonical
+Enum(mips_arch_opt_value) String(sb1) Value(90) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(sr71000) Value(91) Canonical
+Enum(mips_arch_opt_value) String(sb1a) Value(91) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(sr71k) Value(91)
+Enum(mips_arch_opt_value) String(sr71000) Value(92) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(xlr) Value(92) Canonical
+Enum(mips_arch_opt_value) String(sr71k) Value(92)
 
 EnumValue
-Enum(mips_arch_opt_value) String(loongson3a) Value(93) Canonical
+Enum(mips_arch_opt_value) String(xlr) Value(93) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(octeon) Value(94) Canonical
+Enum(mips_arch_opt_value) String(loongson3a) Value(94) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(octeon+) Value(95) Canonical
+Enum(mips_arch_opt_value) String(octeon) Value(95) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(octeon2) Value(96) Canonical
+Enum(mips_arch_opt_value) String(octeon+) Value(96) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(octeon3) Value(97) Canonical
+Enum(mips_arch_opt_value) String(octeon2) Value(97) Canonical
 
 EnumValue
-Enum(mips_arch_opt_value) String(xlp) Value(98) Canonical
+Enum(mips_arch_opt_value) String(octeon3) Value(98) Canonical
+
+EnumValue
+Enum(mips_arch_opt_value) String(xlp) Value(99) Canonical
 
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 37f5b54..505e111 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -722,7 +722,8 @@ struct mips_cpu_info {
|march=r1|march=r12000|march=r14000|march=r16000:-mips4} \
  %{march=mips32|march=4kc|march=4km|march=4kp|march=4ksc:-mips32} \
  %{march=mips32r2|march=m4k|march=4ke*|march=4ksd|march=24k* \
-   |march=34k*|march=74k*|march=m14k*|march=1004k*: -mips32r2} \
+   |march=34k*|march=74k*|march=m14k*|march=1004k* \
+   |march=interaptiv: -mips32r2} \
  %{march=mips32r3: -mips32r3} \
  %{march=mips32r5|march=p5600: -mips32r5} \
  %{march=mips32r6: -mips32r6} \
@@ -825,7 +826,8 @@ struct mips_cpu_info {
 #define BASE_DRIVER_SELF_SPECS \
   MIPS_ISA_NAN2008_SPEC,   \
   "%{!mno-dsp: \
- %{march=24ke*|march=34kc*|march=34kf*|march=34kx*|march=1004k*: -mdsp} \
+ %{march=24ke*|march=34kc*|march=34kf*|march=34kx*|march=1004k* \
+   |march=interaptiv: -mdsp} \
  %{march=74k*|march=m14ke*: %{!mno-

[fr30] Fix indirect_jump pattern

2015-07-16 Thread Richard Sandiford
The pattern was accepting a nonimmediate_operand, using the C condition
to weed out certain types of memory, but was then using an "r" constraint
to force a register.  This patch makes the predicate match the constraint
and removes the C condition.
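
A toy model of the mismatch, in plain C++ rather than RTL (the operand
kinds and function names are invented for the illustration and say nothing
about how recog or reload are implemented): the old predicate plus
condition let some memory operands through, while the "r" constraint only
allows a register, so the two disagreed about what the insn accepts.

```cpp
#include <cassert>

// Invented operand kinds; real RTL operands carry far more structure.
enum operand_kind
{
  OP_REG,            // A register.
  OP_MEM_REG_ADDR,   // Memory addressed by a plain register.
  OP_MEM_PLUS_ADDR,  // Memory addressed by a PLUS expression.
  OP_CONST           // An immediate.
};

// Old predicate: nonimmediate_operand takes registers and memory.
static bool
old_predicate (operand_kind op)
{
  return op != OP_CONST;
}

// Old C condition: reject memory whose address is a PLUS.
static bool
old_condition (operand_kind op)
{
  return op != OP_MEM_PLUS_ADDR;
}

// The "r" constraint: only a register is allowed.
static bool
constraint_r (operand_kind op)
{
  return op == OP_REG;
}

// New predicate: pmode_register_operand, modelled as "register only".
static bool
new_predicate (operand_kind op)
{
  return op == OP_REG;
}

// True if an operand is matched by the pattern but not by the constraint.
static bool
old_pattern_mismatch (operand_kind op)
{
  return old_predicate (op) && old_condition (op) && !constraint_r (op);
}

static bool
new_pattern_mismatch (operand_kind op)
{
  return new_predicate (op) && !constraint_r (op);
}
```

In this model the old pattern has exactly one mismatching kind (memory
with a plain register address) and the new pattern has none.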

Tested by building fr30-elf and using:

int
foo (int i)
{
  __typeof(&&a) foo[] = { &&a, &&a, &&b, &&c };

 restart:
  goto *foo[i];

 a:
  return 1;

 b:
  i += 1;
  goto restart;

 c:
  return 2;
}

to trigger an indirect jump (checked via -dp).  OK to install?

Thanks,
Richard


gcc/
* config/fr30/fr30.md (indirect_jump): Use pmode_register_operand
instead of nonimmediate_operand.  Remove C condition.

Index: gcc/config/fr30/fr30.md
===
--- gcc/config/fr30/fr30.md 2015-06-22 14:02:15.165532334 +0100
+++ gcc/config/fr30/fr30.md 2015-07-13 19:31:50.552692732 +0100
@@ -1146,8 +1146,8 @@ (define_insn "jump"
 
 ;; Indirect jump through a register
 (define_insn "indirect_jump"
-  [(set (pc) (match_operand:SI 0 "nonimmediate_operand" "r"))]
-  "GET_CODE (operands[0]) != MEM || GET_CODE (XEXP (operands[0], 0)) != PLUS"
+  [(set (pc) (match_operand 0 "pmode_register_operand" "r"))]
+  ""
   "jmp%#\\t@%0"
   [(set_attr "delay_type" "delayed")]
 )



Re: [PATCH 6/6] Migrate ipa-pure-const to function_summary.

2015-07-16 Thread Martin Liška
On 07/09/2015 10:47 PM, Martin Liška wrote:
> On 07/09/2015 07:44 PM, Jeff Law wrote:
>> On 07/09/2015 03:13 AM, mliska wrote:
>>> gcc/ChangeLog:
>>>
>>> 2015-07-03  Martin Liska  
>>>
>>> * ipa-pure-const.c (struct funct_state_d): New.
>>> (funct_state_d::default_p): Likewise.
>>> (has_function_state): Remove.
>>> (get_function_state): Likewise.
>>> (set_function_state): Likewise.
>>> (add_new_function): Rename and port to ::insert.
>>> (duplicate_node_data): Rename and port to ::duplicate.
>>> (funct_state_summary_t::duplicate): New function.
>>> (register_hooks): Remove hook registration.
>>> (pure_const_generate_summary): Use new data structure.
>>> (pure_const_write_summary): Likewise.
>>> (pure_const_read_summary): Likewise.
>>> (propagate_pure_const): Likewise.
>>> (propagate_nothrow): Likewise.
>>> (execute): Remove hook usage.
>>> (pass_ipa_pure_const::pass_ipa_pure_const): Likewise.
>>> ---
>>> @@ -84,6 +85,18 @@ const char *pure_const_names[3] = {"const", "pure", 
>>> "neither"};
>>>  decl.  */
>>>   struct funct_state_d
>>>   {
>>> +  funct_state_d (): pure_const_state (IPA_NEITHER),
>>> +state_previously_known (IPA_NEITHER), looping_previously_known (true),
>>> +looping (true), can_throw (true), can_free (true) {}
>>> +
>>> +  funct_state_d (const funct_state_d &s): pure_const_state 
>>> (s.pure_const_state),
>>> +state_previously_known (s.state_previously_known),
>>> +looping_previously_known (s.looping_previously_known),
>>> +looping (s.looping), can_throw (s.can_throw), can_free (s.can_free) {}
>>> +
>>> +  /* Return true if the value is default.  */
>>> +  bool default_p ();
>>> +
>>> /* See above.  */
>>> enum pure_const_state_e pure_const_state;
>>> /* What user set here; we can be always sure about this.  */
>> Doesn't this need to be a "class" rather then a "struct"?
>>
>>
>> OK with that change.
>>
>> jeff
> 
> Yeah.
> 
> 'class' will be more appropriate. As I'm going to be AFK for Friday and 
> upcoming weekend,
> I will install these patches on Monday.
> 
> Thanks,
> Martin
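
For context on the struct/class remark: in C++ the two keywords differ only
in default member (and base) access, so the request is about convention,
not behaviour.  A minimal sketch with a cut-down field set; the type name
funct_state_sketch and this body of default_p are illustrative guesses, not
the code in the patch.

```cpp
#include <cassert>

// Lattice values, in the order given by pure_const_names.
enum pure_const_state_e { IPA_CONST, IPA_PURE, IPA_NEITHER };

// Declared with 'class' because it has constructors and a member
// function; 'public:' restores the access that 'struct' gave by default.
class funct_state_sketch
{
public:
  funct_state_sketch ()
    : pure_const_state (IPA_NEITHER), looping (true), can_throw (true) {}

  /* Return true if every field still has its default value.  */
  bool default_p () const
  {
    return pure_const_state == IPA_NEITHER && looping && can_throw;
  }

  enum pure_const_state_e pure_const_state;
  bool looping;
  bool can_throw;
};
```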

Hello.

v2 of the patch.

Thanks,
Martin
From a17b6f69d944894ffda2f8db93577014dde070b9 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 9 Jul 2015 11:13:55 +0200
Subject: [PATCH 4/4] Port ipa-pure-const to function_summary.

gcc/ChangeLog:

2015-07-03  Martin Liska  

	* ipa-pure-const.c (struct funct_state_d): New.
	(funct_state_d::default_p): Likewise.
	(has_function_state): Remove.
	(get_function_state): Likewise.
	(set_function_state): Likewise.
	(add_new_function): Rename and port to ::insert.
	(duplicate_node_data): Rename and port to ::duplicate.
	(funct_state_summary_t::duplicate): New function.
	(register_hooks): Remove hook registration.
	(pure_const_generate_summary): Use new data structure.
	(pure_const_write_summary): Likewise.
	(pure_const_read_summary): Likewise.
	(propagate_pure_const): Likewise.
	(propagate_nothrow): Likewise.
	(execute): Remove hook usage.
	(pass_ipa_pure_const::pass_ipa_pure_const): Likewise.
---
 gcc/ipa-pure-const.c | 168 +--
 1 file changed, 55 insertions(+), 113 deletions(-)

diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index f0373e6..90f1c9f 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "opts.h"
 #include "varasm.h"
+#include "symbol-summary.h"
 
 /* Lattice values for const and pure functions.  Everything starts out
being const, then may drop to pure and then neither depending on
@@ -82,8 +83,18 @@ const char *pure_const_names[3] = {"const", "pure", "neither"};
 
 /* Holder for the const_state.  There is one of these per function
decl.  */
-struct funct_state_d
+class funct_state_d
 {
+public:
+  funct_state_d (): pure_const_state (IPA_NEITHER),
+state_previously_known (IPA_NEITHER), looping_previously_known (true),
+looping (true), can_throw (true), can_free (true) {}
+
+  funct_state_d (const funct_state_d &s): pure_const_state (s.pure_const_state),
+state_previously_known (s.state_previously_known),
+looping_previously_known (s.looping_previously_known),
+looping (s.looping), can_throw (s.can_throw), can_free (s.can_free) {}
+
   /* See above.  */
   enum pure_const_state_e pure_const_state;
   /* What user set here; we can be always sure about this.  */
@@ -105,20 +116,25 @@ struct funct_state_d
   bool can_free;
 };
 
-/* State used when we know nothing about function.  */
-static struct funct_state_d varying_state
-   = { IPA_NEITHER, IPA_NEITHER, true, true, true, true };
-
-
 typedef struct funct_state_d * funct_state;
 
 /* The storage of the funct_state is abstracted because there is the
possibility that it may be desirable to move this to the cgraph
local info.  */
 
-/* Array, indexed by cgraph node uid, of function states.  */
+class funct_state_summary

Re: [PATCH 5/6] Port IPA reference to function_summary infrastructure.

2015-07-16 Thread Martin Liška
On 07/10/2015 03:30 PM, Martin Jambor wrote:
> Hi,
> 
> I've spotted a likely typo:
> 
> On Thu, Jul 09, 2015 at 11:13:53AM +0200, Martin Liska wrote:
>> gcc/ChangeLog:
>>
>> 2015-07-03  Martin Liska  
>>
>>  * ipa-reference.c (ipa_ref_opt_summary_t): New class.
>>  (get_reference_optimization_summary): Use it.
>>  (set_reference_optimization_summary): Likewise.
>>  (ipa_init): Remove hook holders usage.
>>  (ipa_reference_c_finalize): Likewise.
>>  (ipa_ref_opt_summary_t::duplicate): New function.
>>  (ipa_ref_opt_summary_t::remove): Likewise.
>>  (propagate): Allocate the summary if does not exist.
>>  (ipa_reference_read_optimization_summary): Likewise.
>>  (struct ipa_reference_vars_info_d): Add new method.
>>  (struct ipa_reference_optimization_summary_d): Likewise.
>>  (get_reference_vars_info): Use new underlying container.
>>  (set_reference_vars_info): Remove.
>>  (init_function_info): Set up the container.
>> ---
>>  gcc/ipa-reference.c | 203 
>> ++--
>>  1 file changed, 102 insertions(+), 101 deletions(-)
>>
>> diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
>> index 465a74b..2afd9ad 100644
>> --- a/gcc/ipa-reference.c
>> +++ b/gcc/ipa-reference.c
> 
> ...
> 
>> @@ -837,12 +839,14 @@ propagate (void)
>>  }
>>  }
>>  
>> +  if (ipa_ref_opt_sum_summaries == NULL)
>> +ipa_ref_opt_sum_summaries = new ipa_ref_opt_summary_t (symtab);
>> +
>>/* Cleanup. */
>>FOR_EACH_DEFINED_FUNCTION (node)
>>  {
>>ipa_reference_vars_info_t node_info;
>>ipa_reference_global_vars_info_t node_g;
>> -  ipa_reference_optimization_summary_t opt;
>>  
>>node_info = get_reference_vars_info (node);
>>if (!node->alias && opt_for_fn (node->decl, flag_ipa_reference)
>> @@ -851,8 +855,8 @@ propagate (void)
>>  {
>>node_g = &node_info->global;
>>  
>> -  opt = XCNEW (struct ipa_reference_optimization_summary_d);
>> -  set_reference_optimization_summary (node, opt);
>> +  ipa_reference_optimization_summary_d *opt =
>> +ipa_ref_opt_sum_summaries->get (node);
>>  
>>/* Create the complimentary sets.  */
>>  
>> @@ -880,14 +884,20 @@ propagate (void)
>>node_g->statics_written);
>>  }
>>  }
>> -  free (node_info);
>> }
>>  
>>ipa_free_postorder_info ();
>>free (order);
>>  
>>bitmap_obstack_release (&local_info_obstack);
>> -  ipa_reference_vars_vector.release ();
>> +
>> +  if (ipa_ref_var_info_summaries == NULL)
> 
> I assume you meant != NULL here.
> 
>> +{
>> +  delete ipa_ref_var_info_summaries;
>> +  ipa_ref_var_info_summaries = NULL;
>> +}
>> +
>> +  ipa_ref_var_info_summaries = NULL;
>>if (dump_file)
>>  splay_tree_delete (reference_vars_to_consider);
>>reference_vars_to_consider = NULL;
> 
> Thanks,
> 
> Martin
> 
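
To spell out why the "== NULL" guard matters, here is a small stand-alone
sketch (summaries_sketch and both cleanup functions are invented names, not
GCC code).  Deleting a null pointer is a no-op, so with the inverted test
the delete never frees anything, and the unconditional assignment that
follows in the posted hunk then drops the only pointer to the object.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the summary container; counts live instances.
struct summaries_sketch
{
  static int live;
  summaries_sketch () { ++live; }
  ~summaries_sketch () { --live; }
};

int summaries_sketch::live = 0;

static summaries_sketch *summaries = NULL;

// Mirrors the posted hunk: the guarded body only runs when there is
// nothing to delete, then the pointer is cleared unconditionally.
static void
cleanup_as_posted ()
{
  if (summaries == NULL)
    {
      delete summaries;
      summaries = NULL;
    }
  summaries = NULL;
}

// With the test flipped to != NULL the object is actually destroyed.
static void
cleanup_fixed ()
{
  if (summaries != NULL)
    {
      delete summaries;
      summaries = NULL;
    }
}
```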

Hello

I'm sending v2 of the patch.

Thanks,
Martin
From 06877d19c6cf617730e188bd998926b0f9852cd3 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 9 Jul 2015 11:13:53 +0200
Subject: [PATCH 3/4] Port IPA reference to function_summary infrastructure.

gcc/ChangeLog:

2015-07-03  Martin Liska  

	* ipa-reference.c (ipa_ref_opt_summary_t): New class.
	(get_reference_optimization_summary): Use it.
	(set_reference_optimization_summary): Likewise.
	(ipa_init): Remove hook holders usage.
	(ipa_reference_c_finalize): Likewise.
	(ipa_ref_opt_summary_t::duplicate): New function.
	(ipa_ref_opt_summary_t::remove): Likewise.
	(propagate): Allocate the summary if does not exist.
	(ipa_reference_read_optimization_summary): Likewise.
	(struct ipa_reference_vars_info_d): Add new method.
	(struct ipa_reference_optimization_summary_d): Likewise.
	(get_reference_vars_info): Use new underlying container.
	(set_reference_vars_info): Remove.
	(init_function_info): Set up the container.
	(is_proper_for_analysis): Fix coding style.
	(write_node_summary_p): Likewise.
	(stream_out_bitmap): Likewise.
---
 gcc/ipa-reference.c | 214 
 1 file changed, 100 insertions(+), 114 deletions(-)

diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index c00fca3..f35c66c 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -57,12 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "flags.h"
 #include "diagnostic.h"
 #include "data-streamer.h"
-
-static void remove_node_data (struct cgraph_node *node,
-			  void *data ATTRIBUTE_UNUSED);
-static void duplicate_node_data (struct cgraph_node *src,
- struct cgraph_node *dst,
- void *data ATTRIBUTE_UNUSED);
+#include "symbol-summary.h"
 
 /* The static variables defined within the compilation unit that are
loaded or stored directly by function that owns this structure.  */
@@ -92,9 +87,10 @@ struct ipa_reference_optimization_summary_d
   bitmap statics_not_written;
 };
 
-typedef struct ipa_reference_local_vars_info_d *ipa_

Re: [PATCH] Fix PR ipa/66896

2015-07-16 Thread Richard Biener
On Thu, Jul 16, 2015 at 3:39 PM, Martin Liška  wrote:
> Hello.
>
> The following patch fixes $subject, which can be spotted on gcc-5-branch, while
> trunk looks fine (even though it can potentially suffer from the same issues).
>
> The patch survives regression tests on both trunk and gcc-5-branch on
> x86_64-linux-pc.
>
> Ready for both branches?

Ok.

Thanks,
Richard.

> Thanks,
> Martin


Re: [PATCH 4/6] Port ipa-cp to use cgraph_edge summary.

2015-07-16 Thread Martin Liška
On 07/10/2015 04:18 PM, Martin Jambor wrote:
> Hi,
> 
> I know the patch has been approved by Jeff, but please do not commit
> it before considering the following:
> 
> On Thu, Jul 09, 2015 at 11:13:53AM +0200, Martin Liska wrote:
>> gcc/ChangeLog:
>>
>> 2015-07-03  Martin Liska  
>>
>>  * ipa-cp.c (struct edge_clone_summary): New structure.
>>  (class edge_clone_summary_t): Likewise.
>>  (edge_clone_summary_t::initialize): New method.
>>  (edge_clone_summary_t::duplicate): Likewise.
>>  (get_next_cgraph_edge_clone): Remove.
>>  (get_info_about_necessary_edges): Refactor using the new
>>  data structure.
>>  (gather_edges_for_value): Likewise.
>>  (perhaps_add_new_callers): Likewise.
>>  (ipcp_driver): Allocate and deallocate newly added
>>  instance.
>> ---
>>  gcc/ipa-cp.c | 198 ++-
>>  1 file changed, 113 insertions(+), 85 deletions(-)
>>
>> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> index 16b9cde..8a50b63 100644
>> --- a/gcc/ipa-cp.c
>> +++ b/gcc/ipa-cp.c
>> @@ -2888,54 +2888,79 @@ ipcp_discover_new_direct_edges (struct cgraph_node *node,
>>  inline_update_overall_summary (node);
>>  }
>>  
>> -/* Vector of pointers which for linked lists of clones of an original crgaph
>> -   edge. */
>> +/* Edge clone summary.  */
>>  
>> -static vec next_edge_clone;
>> -static vec prev_edge_clone;
>> -
>> -static inline void
>> -grow_edge_clone_vectors (void)
>> +struct edge_clone_summary
> 
> It's got constructors and destructors so it should be a class, really.
> 
>>  {
>> -  if (next_edge_clone.length ()
>> -  <=  (unsigned) symtab->edges_max_uid)
>> -next_edge_clone.safe_grow_cleared (symtab->edges_max_uid + 1);
>> -  if (prev_edge_clone.length ()
>> -  <=  (unsigned) symtab->edges_max_uid)
>> -prev_edge_clone.safe_grow_cleared (symtab->edges_max_uid + 1);
>> -}
>> +  /* Default constructor.  */
>> +  edge_clone_summary (): edge_set (NULL), edge (NULL) {}
>>  
>> -/* Edge duplication hook to grow the appropriate linked list in
>> -   next_edge_clone. */
>> +  /* Default destructor.  */
>> +  ~edge_clone_summary ()
>> +  {
>> +gcc_assert (edge_set != NULL);
>>  
>> -static void
>> -ipcp_edge_duplication_hook (struct cgraph_edge *src, struct cgraph_edge *dst,
>> -void *)
>> +if (edge != NULL)
>> +  {
>> +gcc_checking_assert (edge_set->contains (edge));
>> +edge_set->remove (edge);
>> +  }
>> +
>> +/* Release memory for an empty set.  */
>> +if (edge_set->elements () == 0)
>> +  delete edge_set;
>> +  }
>> +
>> +  hash_set  *edge_set;
>> +  cgraph_edge *edge;
> 
> If the hash set is supposed to replace the linked list of edge clones,
> then a removal mechanism seems to be missing.  The whole point of
> prev_edge_clone vector was to allow removal of edges from the linked
> list, because as speculative edges are thrown away, clones can be too
> and then we must remove the pointer from the list, or hash set.
> 
> Have you tried -O3 LTOing Firefox with these changes?
> 
> But I must say that I'm not convinced that converting the linked list
> into a hash_set is a good idea at all.  Apart from the self-removal
> operation, the lists are always traversed linearly and in full, so
> except for using a C++-style iterator, I really do not see any point.
> 
> Moreover, you seem to create a hash table for each and every edge,
> even when it has no clones, just to be able to enter the edge itself
> into it, and so not skip it when you iterate over all clones.  That
> really seems like unjustifiable overhead.  And the deletion in
> duplication hook is also very unappealing.  So the bottom line is that
> while I like turning the two vectors into a summary, I do not like the
> hash set at all.  If you absolutely think it is a good idea, please make
> that change in a separate patch so that we can better argue about its
> merits.
> 
> On the other hand, since the summaries are hash-based themselves, it
> would be great if they had a predicate to find out whether there is
> any summary for a given edge at all and have get_next_cgraph_edge_clone
> return false if there was none.  That would actually save memory.
> 
> Thanks,
> 
> Martin
> 
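
The suggestion in the last quoted paragraph (a predicate that tells whether an
edge has any summary at all, so that asking for the next clone allocates
nothing for edges that were never cloned) might look roughly like the sketch
below.  It is not GCC code: edge, clone_links, clone_summary, exists and
next_clone are hypothetical stand-ins, and std::unordered_map stands in for
GCC's own hash-based summary container.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

/* Hypothetical stand-in for cgraph_edge; only its identity matters here.  */
struct edge { int uid; };

/* Per-edge links of the clone list (the role played by the
   next_edge_clone and prev_edge_clone vectors).  */
struct clone_links
{
  edge *next = nullptr;
  edge *prev = nullptr;
};

/* Hash-based summary that stores entries only for edges that were
   actually linked into a clone list.  */
class clone_summary
{
public:
  /* Predicate: is there any summary for E?  Never allocates.  */
  bool exists (const edge *e) const { return m_map.count (e) != 0; }

  /* Record DST as a clone of SRC, inserted right after SRC.  */
  void link_clone (edge *src, edge *dst)
  {
    assert (src != dst);
    clone_links &s = m_map[src];
    clone_links &d = m_map[dst];
    d.prev = src;
    d.next = s.next;
    if (s.next)
      m_map[s.next].prev = dst;
    s.next = dst;
  }

  /* Unlink E from its list and drop its summary, if it has one.  */
  void remove (const edge *e)
  {
    auto it = m_map.find (e);
    if (it == m_map.end ())
      return;
    clone_links links = it->second;
    if (links.prev)
      m_map.at (links.prev).next = links.next;
    if (links.next)
      m_map.at (links.next).prev = links.prev;
    m_map.erase (it);
  }

  /* Return the next clone of E, or nullptr when E has no summary.  */
  edge *next_clone (const edge *e) const
  {
    auto it = m_map.find (e);
    return it == m_map.end () ? nullptr : it->second.next;
  }

  std::size_t size () const { return m_map.size (); }

private:
  std::unordered_map<const edge *, clone_links> m_map;
};
```

In this sketch an edge without clones costs no memory, and removal stays
possible because each entry keeps a back link, which was the point of
prev_edge_clone in the original code.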

After a discussion with Martin, we decided to preserve the current
implementation and not to port IPA CP to the newly introduced summary.

Martin


Re: [PATCH 2/6] Introduce new edge_summary class and replace ipa_edge_args_sum.

2015-07-16 Thread Martin Liška
On 07/10/2015 03:31 PM, Martin Jambor wrote:
> Hi,
> 
> thanks for working on this and sorry for a tad late review:
> 
> On Thu, Jul 09, 2015 at 11:13:52AM +0200, Martin Liska wrote:
>> gcc/ChangeLog:
>>
>> 2015-07-03  Martin Liska  
>>
>>  * cgraph.c (symbol_table::create_edge): Introduce summary_uid
>>  for cgraph_edge.
>>  * cgraph.h (struct GTY): Likewise.
> 
> struct GTY does not look right :-)
> 
>>  * ipa-inline-analysis.c (estimate_function_body_sizes): Use
>>  new data structure.
>>  * ipa-profile.c (ipa_profile): Likewise.
>>  * ipa-prop.c (ipa_print_node_jump_functions):
> 
>   Likewise.
> 
>>  (ipa_propagate_indirect_call_infos): Likewise.
>>  (ipa_free_edge_args_substructures): Likewise.
>>  (ipa_free_all_edge_args): Likewise.
>>  (ipa_edge_args_t::remove): Likewise.
>>  (ipa_edge_removal_hook): Likewise.
>>  (ipa_edge_args_t::duplicate): Likewise.
>>  (ipa_register_cgraph_hooks): Likewise.
>>  (ipa_unregister_cgraph_hooks): Likewise.
>>  * ipa-prop.h (ipa_check_create_edge_args): Likewise.
>>  (ipa_edge_args_info_available_for_edge_p): Likewise.
> 
> Definition of ipa_edge_args_t is missing here.
> 
>>  * symbol-summary.h (gt_ggc_mx): Indent properly.
>>  (gt_pch_nx): Likewise.
>>  (edge_summary): New class.
>> ---
>>  gcc/cgraph.c  |   2 +
>>  gcc/cgraph.h  |   5 +-
>>  gcc/ipa-inline-analysis.c |   2 +-
>>  gcc/ipa-profile.c |   2 +-
>>  gcc/ipa-prop.c|  71 +++-
>>  gcc/ipa-prop.h|  44 ++
>>  gcc/symbol-summary.h  | 208 +-
>>  7 files changed, 252 insertions(+), 82 deletions(-)
>>
> 
> ...
> 
>> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
>> index e6725aa..f0af9b2 100644
>> --- a/gcc/ipa-prop.h
>> +++ b/gcc/ipa-prop.h
>> @@ -493,13 +493,36 @@ public:
>>  extern ipa_node_params_t *ipa_node_params_sum;
>>  /* Vector of IPA-CP transformation data for each clone.  */
>>  extern GTY(()) vec 
>> *ipcp_transformations;
>> -/* Vector where the parameter infos are actually stored. */
>> -extern GTY(()) vec *ipa_edge_args_vector;
>> +
>> +/* Function summary for ipa_node_params.  */
>> +class GTY((user)) ipa_edge_args_t: public edge_summary 
>> +{
>> +public:
>> +  ipa_edge_args_t (symbol_table *symtab):
>> +edge_summary  (symtab, true) { }
>> +
>> +  static ipa_edge_args_t *create_ggc (symbol_table *symtab)
>> +  {
> 
> Please move the body of this function to where the bodies of the rest
> of the member functions are.
> 
>> +ipa_edge_args_t *summary = new (ggc_cleared_alloc  ())
>> +  ipa_edge_args_t (symtab);
>> +return summary;
>> +  }
>> +
>> +  /* Hook that is called by summary when a node is duplicated.  */
>> +  virtual void duplicate (cgraph_edge *edge,
>> +  cgraph_edge *edge2,
>> +  ipa_edge_args *data,
>> +  ipa_edge_args *data2);
>> +
>> +  virtual void remove (cgraph_edge *edge, ipa_edge_args *data);
>> +};
>> +
>> +extern GTY(()) edge_summary  *ipa_edge_args_sum;
>>  
>>  /* Return the associated parameter/argument info corresponding to the given
>> node/edge.  */
>>  #define IPA_NODE_REF(NODE) (ipa_node_params_sum->get (NODE))
>> -#define IPA_EDGE_REF(EDGE) (&(*ipa_edge_args_vector)[(EDGE)->uid])
>> +#define IPA_EDGE_REF(EDGE) (ipa_edge_args_sum->get (EDGE))
>>  /* This macro checks validity of index returned by
>> ipa_get_param_decl_index function.  */
>>  #define IS_VALID_JUMP_FUNC_INDEX(I) ((I) != -1)
>> @@ -532,19 +555,8 @@ ipa_check_create_node_params (void)
>>  static inline void
>>  ipa_check_create_edge_args (void)
>>  {
>> -  if (vec_safe_length (ipa_edge_args_vector)
>> -  <= (unsigned) symtab->edges_max_uid)
>> -vec_safe_grow_cleared (ipa_edge_args_vector, symtab->edges_max_uid + 1);
>> -}
>> -
>> -/* Returns true if the array of edge infos is large enough to accommodate an
>> -   info for EDGE.  The main purpose of this function is that debug dumping
>> -   function can check info availability without causing reallocations.  */
>> -
>> -static inline bool
>> -ipa_edge_args_info_available_for_edge_p (struct cgraph_edge *edge)
>> -{
>> -  return ((unsigned) edge->uid < vec_safe_length (ipa_edge_args_vector));
>> +  if (ipa_edge_args_sum == NULL)
>> +ipa_edge_args_sum = ipa_edge_args_t::create_ggc (symtab);
>>  }
>>  
>>  static inline ipcp_transformation_summary *
>> diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
>> index eefbfd9..5799443 100644
>> --- a/gcc/symbol-summary.h
>> +++ b/gcc/symbol-summary.h
>> @@ -108,7 +108,7 @@ public:
>>/* Allocates new data that are stored within map.  */
>>T* allocate_new ()
>>{
>> -return m_ggc ? new (ggc_alloc  ()) T() : new T () ;
>> +return m_ggc ? new (ggc_alloc  ()) T () : new T () ;
>>}
>>  
>>/* Release an item that is stored within map.  */
>>

[committed] Rework read_md_rtx interface

2015-07-16 Thread Richard Sandiford
This patch gets read_md_rtx to return a structure that describes the rtx rather
than separate fields.  This includes the location of the start of the rtx as
a file_location.  A follow-on patch will use this to remove define_split and
define_peephole2 from the insn_code numbering.

This removes the last cases where diagnostics were reported against a line
rather than a file_location.
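
As a rough illustration of the interface shape described above (a reader that
returns a success flag and fills in one structure carrying the rtx together
with the location where it starts), consider the sketch below.  Only
read_md_rtx, md_rtx_info and file_location are names taken from the patch; the
field names, directive_info, read_next_directive and the in-memory reader are
assumptions made for the example, and a plain string stands in for the rtx.

```cpp
#include <cstddef>
#include <string>
#include <vector>

/* Position in an .md file (same idea as file_location).  */
struct file_location
{
  const char *filename;
  int lineno;
};

/* Everything a caller needs to know about one directive, bundled
   together instead of being returned through separate out-parameters.  */
struct directive_info
{
  std::string def;      /* Stand-in for the rtx itself.  */
  file_location loc;    /* Where the directive starts.  */
  int index;            /* Running number of the directive.  */
};

/* Toy reader over an in-memory list of directives.  */
struct reader
{
  std::vector<directive_info> pending;
  std::size_t pos = 0;
};

/* Fill *INFO with the next directive and return true, or return false
   once the input is exhausted.  */
bool
read_next_directive (reader &r, directive_info *info)
{
  if (r.pos >= r.pending.size ())
    return false;
  *info = r.pending[r.pos++];
  return true;
}

/* Typical driver loop: each step sees the whole structure, so it can
   attribute whatever it finds to info.loc.  */
int
count_directives_in (reader &r, const std::string &filename)
{
  int n = 0;
  directive_info info;
  while (read_next_directive (r, &info))
    if (filename == info.loc.filename)
      n++;
  return n;
}
```

Bundling the location with the definition means a handler that takes the
structure never has to consult global "current line" state to report a
problem.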

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* read-md.h (message_with_line, error_with_line): Delete.
* read-md.c (message_with_line, error_with_line): Delete.
* gensupport.h: Include read-md.h.
(md_rtx_info): New structure.
(read_md_rtx): Use it.  Return a bool success value.
* gensupport.c (read_md_rtx): Likewise.
* genattr-common.c (gen_attr): Take an md_rtx_info rather than an rtx.
(main): Update after interface changes.
* genattr.c (gen_attr): Take an md_rtx_info rather than an rtx.
(main): Update after interface changes.
* genattrtab.c (insn_code_number): Delete.
(optimize_attrs): Add a max_insn_code parameter and use it instead
of insn_code_number.
(gen_attr): Take an md_rtx_info rather than an rtx and lineno.
Use *_at rather than *_with_line functions.
(gen_insn): Likewise.
(gen_delay): Likewise.
(gen_insn_reserv): Likewise.
(gen_bypass): Take an md_rtx_info rather than an rtx.
(main): Update after interface changes.  Use a local max_insn_code
variable instead of insn_code_number.
* genautomata.c (gen_cpu_unit): Take an md_rtx_info rather than
an rtx.  Use fatal_at rather than fatal.
(gen_query_cpu_unit, gen_bypass, gen_excl_set)
(gen_presence_absence_set, gen_presence_set, gen_final_presence_set)
(gen_absence_set, gen_final_absence_set, gen_automaton)
(gen_automata_option, gen_reserv, gen_insn_reserv): Likewise.
(main): Update after interface changes.
* gencodes.c (gen_insn): Take an md_rtx_info rather than an rtx
and code number.
(main): Update after interface changes.
* genconditions.c (main): Use new read_md_rtx interface.
* genconfig.c (gen_insn): Take an md_rtx_info rather than an rtx.
(gen_expand, gen_split, gen_peephole, gen_peephole2): Likewise.
(main): Update after interface changes.
* genemit.c (insn_code_number, insn_index_number): Delete.
(gen_insn): Take an md_rtx_info rather than an rtx and lineno.
Use fatal_at rather than fatal.
(gen_expand): Take an md_rtx_info rather than an rtx.  Use fatal_at
rather than fatal.
(gen_split): Likewise.
(main): Update after interface changes.
* genextract.c (line_no): Delete.
(gen_insn): Take an md_rtx_info rather than an rtx and lineno.
Update call to walk_rtx.
(VEC_safe_set_locstr): Add an md_rtx_info argument.  Use message_at
rather than message_with_line.
(walk_rtx): Add an md_rtx_info argument.  Update call to
VEC_safe_set_locstr.
(main): Update after interface changes.
* genflags.c (gen_insn): Take an md_rtx_info rather than an rtx
and lineno.  Use error_at rather than separate message_with_line
calls and have_error assignments.
(main): Update after interface changes.
* genmddump.c (main): Use new read_md_rtx interface.
* genopinit.c (insn): Take an md_rtx_info rather than an rtx.
(main): Update after interface changes.
* genoutput.c (next_code_number): Delete.
(gen_insn): Take an md_rtx_info rather than an rtx and lineno.
(gen_peephole, gen_expand, gen_split): Likewise.
(note_constraint): Likewise.  Use *_at rather than *_with_line
functions.
(main): Update after interface changes.
* genpeep.c (gen_peephole): Take an md_rtx_info rather than an
rtx and lineno.
(main): Update after interface changes.
* genpreds.c (process_define_predicate): Take an md_rtx_info rather
than an rtx and lineno.
(process_define_constraint): Likewise.
(process_define_register_constraint): Likewise.
(main): Update after interface changes.
* genrecog.c (next_insn_code, pattern_lineno): Delete.
(validate_pattern): Replace top-level rtx with an md_rtx_info.
Use *_at rather than *_with_line functions.
(match_pattern_2): Likewise.
(match_pattern_1, match_pattern): Add an md_rtx_info parameter.
(get_peephole2_pattern): Take an md_rtx_info rather than an rtvec.
Use *_at rather than *_with_line functions.
* gentarget-def.c (add_insn): New function.
(mai

[committed] Avoid some calls to fatal() in genattrtab.c

2015-07-16 Thread Richard Sandiford
Another patch to make more use of fatal_at (which passes a file location)
rather than fatal (which gives no indication where the error was).
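
The change amounts to threading a location through helpers that previously had
no way to say where a problem came from.  A minimal sketch of that pattern
follows, using a made-up expression type rather than GCC's rtx.  The names
expr, resolve, report_at and diagnostics are hypothetical, and errors are
collected in a vector instead of terminating the program so that the behaviour
can be checked.

```cpp
#include <string>
#include <vector>

struct file_location
{
  std::string filename;
  int lineno;
};

/* Made-up expression: a value plus optional nested alternatives.
   The value "*" means "use the default".  */
struct expr
{
  std::string value;
  std::vector<expr> arms;
};

/* Collected diagnostics (a real generator would print and exit).  */
std::vector<std::string> diagnostics;

/* Report MSG against LOC in the usual "file:line: message" form.  */
void
report_at (const file_location &loc, const std::string &msg)
{
  diagnostics.push_back (loc.filename + ":" + std::to_string (loc.lineno)
			 + ": " + msg);
}

/* Replace every "*" in E by *DEFAULT_VAL.  LOC is passed down unchanged,
   so an error found deep in the recursion is still attributed to the
   directive that contained E.  Return false if anything was reported.  */
bool
resolve (const file_location &loc, expr &e, const std::string *default_val)
{
  if (e.value == "*")
    {
      if (!default_val)
	{
	  report_at (loc, "\"*\" used in invalid context");
	  return false;
	}
      e.value = *default_val;
    }

  bool ok = true;
  for (expr &arm : e.arms)
    if (!resolve (loc, arm, default_val))
      ok = false;
  return ok;
}
```

The only structural cost is the extra leading parameter on every recursive
call, which mirrors the shape of the make_canonical hunks in the patch.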

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* genattrtab.c (make_canonical): Add a file_location parameter.
Use fatal_at rather than fatal.
(get_attr_value): Likewise.  Update call to make_canonical.
(fill_attr, make_length_attrs, optimize_attrs, gen_attr)
(make_internal_attr): Update calls accordingly.

Index: gcc/genattrtab.c
===
--- gcc/genattrtab.c2015-07-12 14:18:26.677112366 +0100
+++ gcc/genattrtab.c2015-07-12 14:18:26.69746 +0100
@@ -1171,10 +1171,10 @@ check_defs (void)
 /* Given a valid expression for an attribute value, remove any IF_THEN_ELSE
expressions by converting them into a COND.  This removes cases from this
program.  Also, replace an attribute value of "*" with the default attribute
-   value.  */
+   value.  LOC is the location to use for error reporting.  */
 
 static rtx
-make_canonical (struct attr_desc *attr, rtx exp)
+make_canonical (file_location loc, struct attr_desc *attr, rtx exp)
 {
   int i;
   rtx newexp;
@@ -1189,7 +1189,7 @@ make_canonical (struct attr_desc *attr,
   if (! strcmp (XSTR (exp, 0), "*"))
{
  if (attr->default_val == 0)
-   fatal ("(attr_value \"*\") used in invalid context");
+   fatal_at (loc, "(attr_value \"*\") used in invalid context");
  exp = attr->default_val->value;
}
   else
@@ -1225,14 +1225,14 @@ make_canonical (struct attr_desc *attr,
 
/* First, check for degenerate COND.  */
if (XVECLEN (exp, 0) == 0)
- return make_canonical (attr, XEXP (exp, 1));
-   defval = XEXP (exp, 1) = make_canonical (attr, XEXP (exp, 1));
+ return make_canonical (loc, attr, XEXP (exp, 1));
+   defval = XEXP (exp, 1) = make_canonical (loc, attr, XEXP (exp, 1));
 
for (i = 0; i < XVECLEN (exp, 0); i += 2)
  {
XVECEXP (exp, 0, i) = copy_boolean (XVECEXP (exp, 0, i));
XVECEXP (exp, 0, i + 1)
- = make_canonical (attr, XVECEXP (exp, 0, i + 1));
+ = make_canonical (loc, attr, XVECEXP (exp, 0, i + 1));
if (! rtx_equal_p (XVECEXP (exp, 0, i + 1), defval))
  allsame = 0;
  }
@@ -1275,19 +1275,21 @@ copy_boolean (rtx exp)
`insn_code' is the code of an insn whose attribute has the specified
value (-2 if not processing an insn).  We ensure that all insns for
a given value have the same number of alternatives if the value checks
-   alternatives.  */
+   alternatives.  LOC is the location to use for error reporting.  */
 
 static struct attr_value *
-get_attr_value (rtx value, struct attr_desc *attr, int insn_code)
+get_attr_value (file_location loc, rtx value, struct attr_desc *attr,
+   int insn_code)
 {
   struct attr_value *av;
   uint64_t num_alt = 0;
 
-  value = make_canonical (attr, value);
+  value = make_canonical (loc, attr, value);
   if (compares_alternatives_p (value))
 {
   if (insn_code < 0 || insn_alternatives == NULL)
-   fatal ("(eq_attr \"alternatives\" ...) used in non-insn context");
+   fatal_at (loc, "(eq_attr \"alternatives\" ...) used in non-insn"
+ " context");
   else
num_alt = insn_alternatives[insn_code];
 }
@@ -1439,7 +1441,7 @@ fill_attr (struct attr_desc *attr)
   if (value == NULL)
av = attr->default_val;
   else
-   av = get_attr_value (value, attr, id->insn_code);
+   av = get_attr_value (id->loc, value, attr, id->insn_code);
 
   ie = oballoc (struct insn_ent);
   ie->def = id;
@@ -1552,7 +1554,7 @@ make_length_attrs (void)
 return;
 
   if (! length_attr->is_numeric)
-fatal ("length attribute must be numeric");
+fatal_at (length_attr->loc, "length attribute must be numeric");
 
   length_attr->is_const = 0;
   length_attr->is_special = 1;
@@ -1568,7 +1570,8 @@ make_length_attrs (void)
   for (av = length_attr->first_value; av; av = av->next)
for (ie = av->first_insn; ie; ie = ie->next)
  {
-   new_av = get_attr_value (substitute_address (av->value,
+   new_av = get_attr_value (ie->def->loc,
+substitute_address (av->value,
 no_address_fn[i],
 address_fn[i]),
 new_attr, ie->def->insn_code);
@@ -3041,7 +3044,8 @@ optimize_attrs (int max_insn_code)
{
  newexp = attr_copy_rtx (newexp);
  remove_insn_ent (av, ie);
-   

[committed] Use file_location in genpreds.c and compute_test_codes

2015-07-16 Thread Richard Sandiford
Make genpreds.c use a file_location rather than a plain lineno.  This means
that it will cope with changes in filename, e.g. when reporting contradictions
between two define_constraints.
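
To see why a bare line number is not enough, consider two conflicting
definitions that live in different .md files.  The sketch below is
illustrative only: constraint_def, constraint_table, format_at and the message
wording are assumptions and do not reproduce genpreds.c.

```cpp
#include <map>
#include <string>
#include <vector>

struct file_location
{
  std::string filename;
  int lineno;
};

/* One constraint definition and the place where it was written.  */
struct constraint_def
{
  std::string name;
  file_location loc;
};

/* Format MSG against LOC as "file:line: message".  */
std::string
format_at (const file_location &loc, const std::string &msg)
{
  return loc.filename + ":" + std::to_string (loc.lineno) + ": " + msg;
}

/* Registry of constraints keyed by name.  */
class constraint_table
{
public:
  /* Register DEF.  On a clash, append one message per location to
     MESSAGES and return false.  */
  bool add (const constraint_def &def, std::vector<std::string> &messages)
  {
    auto it = m_defs.find (def.name);
    if (it != m_defs.end ())
      {
	messages.push_back (format_at (def.loc, "constraint '" + def.name
						 + "' redefined"));
	messages.push_back (format_at (it->second.loc,
				       "previous definition is here"));
	return false;
      }
    m_defs.emplace (def.name, def);
    return true;
  }

private:
  std::map<std::string, constraint_def> m_defs;
};
```

With only a line number stored per definition, the second message would have
to be printed against whichever file happens to be current, which is wrong
whenever the earlier definition came from another file.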

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* gensupport.h (compute_test_codes): Take a file_location rather
than a line number.
* gensupport.c (compute_test_codes): Likewise.  Use *_at functions
rather than *_with_line functions.
(process_define_predicate): Update call to compute_test_codes.
* genpreds.c (validate_exp): Take a file_location rather than a
line number.  Use *_at functions rather than *_with_line functions.
(process_define_predicate): Update call to validate_exp.
(constraint_data): Replace lineno field with a file_location.
(add_constraint): Take a file_location rather than a line number.
Use *_at functions rather than *_with_line functions.  Fix error
message for address constraints.  Update after changes to
validate_exp, constraint_data and compute_test_codes.
(process_define_constraint): Update accordingly.
(process_define_register_constraint): Likewise.

Index: gcc/gensupport.h
===
--- gcc/gensupport.h2015-07-12 14:18:07.747909172 +0100
+++ gcc/gensupport.h2015-07-12 20:53:11.543868618 +0100
@@ -110,7 +110,7 @@ struct pattern_stats
 };
 
 extern void get_pattern_stats (struct pattern_stats *ranges, rtvec vec);
-extern void compute_test_codes (rtx, int, char *);
+extern void compute_test_codes (rtx, file_location, char *);
 extern const char *get_emit_function (rtx);
 extern bool needs_barrier_p (rtx);
 
Index: gcc/gensupport.c
===
--- gcc/gensupport.c2015-07-12 14:18:26.640114005 +0100
+++ gcc/gensupport.c2015-07-12 20:53:11.543868618 +0100
@@ -208,11 +208,11 @@ #define TRISTATE_NOT(a)   \
 static char did_you_mean_codes[NUM_RTX_CODE];
 
 /* Recursively calculate the set of rtx codes accepted by the
-   predicate expression EXP, writing the result to CODES.  LINENO is
-   the line number on which the directive containing EXP appeared.  */
+   predicate expression EXP, writing the result to CODES.  LOC is
+   the .md file location of the directive containing EXP.  */
 
 void
-compute_test_codes (rtx exp, int lineno, char *codes)
+compute_test_codes (rtx exp, file_location loc, char *codes)
 {
   char op0_codes[NUM_RTX_CODE];
   char op1_codes[NUM_RTX_CODE];
@@ -222,29 +222,29 @@ compute_test_codes (rtx exp, int lineno,
   switch (GET_CODE (exp))
 {
 case AND:
-  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
+  compute_test_codes (XEXP (exp, 0), loc, op0_codes);
+  compute_test_codes (XEXP (exp, 1), loc, op1_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_AND (op0_codes[i], op1_codes[i]);
   break;
 
 case IOR:
-  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
+  compute_test_codes (XEXP (exp, 0), loc, op0_codes);
+  compute_test_codes (XEXP (exp, 1), loc, op1_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_OR (op0_codes[i], op1_codes[i]);
   break;
 case NOT:
-  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
+  compute_test_codes (XEXP (exp, 0), loc, op0_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_NOT (op0_codes[i]);
   break;
 
 case IF_THEN_ELSE:
   /* a ? b : c  accepts the same codes as (a & b) | (!a & c).  */
-  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
-  compute_test_codes (XEXP (exp, 2), lineno, op2_codes);
+  compute_test_codes (XEXP (exp, 0), loc, op0_codes);
+  compute_test_codes (XEXP (exp, 1), loc, op1_codes);
+  compute_test_codes (XEXP (exp, 2), loc, op2_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_OR (TRISTATE_AND (op0_codes[i], op1_codes[i]),
TRISTATE_AND (TRISTATE_NOT (op0_codes[i]),
@@ -268,7 +268,7 @@ compute_test_codes (rtx exp, int lineno,
 
if (*next_code == '\0')
  {
-   error_with_line (lineno, "empty match_code expression");
+   error_at (loc, "empty match_code expression");
break;
  }
 
@@ -287,17 +287,16 @@ compute_test_codes (rtx exp, int lineno,
}
if (!found_it)
  {
-   error_with_line (lineno,
- 

[PATCH] libgcc: fix build with older make

2015-07-16 Thread Jan Beulich
GNU make up to and including 3.80 (documented as the minimum permitted
version) doesn't support "else if...".

2015-07-16  Jan Beulich  

* config/t-softfp: Split up "else ifneq".

--- a/libgcc/config/t-softfp
+++ b/libgcc/config/t-softfp
@@ -103,7 +103,8 @@ ifeq ($(enable_shared),yes)
fi
 endif
echo '#endif' >> $@
-else ifneq ($(softfp_wrap_start),)
+else
+ifneq ($(softfp_wrap_start),)
 softfp_file_list := $(addsuffix .c,$(softfp_func_list))
 
 $(softfp_file_list):
@@ -114,6 +115,7 @@ else
 softfp_file_list := \
   $(addsuffix .c,$(addprefix $(srcdir)/soft-fp/,$(softfp_func_list)))
 endif
+endif
 
 # Disable missing prototype and type limit warnings.  The prototypes
 # for the functions in the soft-fp files have not been brought across





[committed] Use file_location to record positions in genoutput.c

2015-07-16 Thread Richard Sandiford
Use file_location to replace separate filename and line number
in genoutput.c.

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* genoutput.c (data): Use a file_location to record the source
position.
(nothing): Delete.
(idata, idata_end): Remove initialization.
(constraint_data): Replace lineno with a file_location.
(output_insn_data): Update after changes to data.
(gen_insn, gen_peephole, gen_expand, gen_split): Likewise.
(scan_operands): Likewise, using *_at rather than *_with_line
functions.
(process_template): Likewise.
(validate_insn_alternatives): Likewise.
(validate_insn_operands): Likewise.
(validate_optab_operands): Likewise.
(init_insn_for_nothing): Initialize idata and idata_end.
(note_constraint): Update after changes to constraint_data,
using *_at rather than *_with_line functions.
(mdep_constraint_len): Take a file_location rather than a
line number.  Use *_at rather than *_with_line functions.

Index: gcc/genoutput.c
===
--- gcc/genoutput.c 2015-07-12 14:18:26.631114404 +0100
+++ gcc/genoutput.c 2015-07-12 20:53:11.543868618 +0100
@@ -154,9 +154,8 @@ struct data
   struct data *next;
   const char *name;
   const char *template_code;
+  file_location loc;
   int code_number;
-  const char *filename;
-  int lineno;
   int n_generator_args;/* Number of arguments passed to 
generator */
   int n_operands;  /* Number of operands this insn recognizes */
   int n_dups;  /* Number times match_dup appears in pattern */
@@ -166,15 +165,12 @@ struct data
   struct operand_data operand[MAX_MAX_OPERANDS];
 };
 
-/* A dummy insn, for CODE_FOR_nothing.  */
-static struct data nothing;
-
 /* This variable points to the first link in the insn chain.  */
-static struct data *idata = &nothing;
+static struct data *idata;
 
 /* This variable points to the end of the insn chain.  This is where
everything relevant from the machien description is appended to.  */
-static struct data **idata_end = &nothing.next;
+static struct data **idata_end;
 
 
 static void output_prologue (void);
@@ -196,9 +192,9 @@ static void gen_split (rtx, int);
 struct constraint_data
 {
   struct constraint_data *next_this_letter;
-  int lineno;
+  file_location loc;
   unsigned int namelen;
-  const char name[1];
+  char name[1];
 };
 
 /* All machine-independent constraint characters (except digits) that
@@ -208,7 +204,7 @@ static const char indep_constraints[] =
 static struct constraint_data *
 constraints_by_letter_table[1 << CHAR_BIT];
 
-static int mdep_constraint_len (const char *, int, int);
+static int mdep_constraint_len (const char *, file_location, int);
 static void note_constraint (rtx, int);
 
 static void
@@ -306,7 +302,7 @@ output_insn_data (void)
 
   for (d = idata; d; d = d->next)
 {
-  printf ("  /* %s:%d */\n", d->filename, d->lineno);
+  printf ("  /* %s:%d */\n", d->loc.filename, d->loc.lineno);
   printf ("  {\n");
 
   if (d->name)
@@ -449,11 +445,11 @@ scan_operands (struct data *d, rtx part,
   opno = XINT (part, 0);
   if (opno >= MAX_MAX_OPERANDS)
{
- error_with_line (d->lineno, "maximum number of operands exceeded");
+ error_at (d->loc, "maximum number of operands exceeded");
  return;
}
   if (d->operand[opno].seen)
-   error_with_line (d->lineno, "repeated operand number %d\n", opno);
+   error_at (d->loc, "repeated operand number %d\n", opno);
 
   d->operand[opno].seen = 1;
   d->operand[opno].mode = GET_MODE (part);
@@ -470,11 +466,11 @@ scan_operands (struct data *d, rtx part,
   opno = XINT (part, 0);
   if (opno >= MAX_MAX_OPERANDS)
{
- error_with_line (d->lineno, "maximum number of operands exceeded");
+ error_at (d->loc, "maximum number of operands exceeded");
  return;
}
   if (d->operand[opno].seen)
-   error_with_line (d->lineno, "repeated operand number %d\n", opno);
+   error_at (d->loc, "repeated operand number %d\n", opno);
 
   d->operand[opno].seen = 1;
   d->operand[opno].mode = GET_MODE (part);
@@ -492,11 +488,11 @@ scan_operands (struct data *d, rtx part,
   opno = XINT (part, 0);
   if (opno >= MAX_MAX_OPERANDS)
{
- error_with_line (d->lineno, "maximum number of operands exceeded");
+ error_at (d->loc, "maximum number of operands exceeded");
  return;
}
   if (d->operand[opno].seen)
-   error_with_line (d->lineno, "repeated operand number %d\n", opno);
+   error_at (d->loc, "repeated operand number %d\n", opno)

[committed] Use file_location to record positions in genattrtab.c

2015-07-16 Thread Richard Sandiford
This includes replacing calls to fatal() (without a file location)
with calls to a new function fatal_at, which should improve error
reporting.

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* read-md.h (fatal_at): Declare.
* read-md.c (fatal_at): New function.
* genattrtab.c (insn_def, attr_desc, delay_desc): Use a file_location
to record the source position.
(check_attr_test): Take a file_location instead of a line number.
Use fatal_at instead of fatal.
(check_attr_value): Update after above changes, using "at"
rather than "with_line" reporting functions.
(convert_set_attr_alternative): Likewise.
(gen_attr): Likewise.
(check_defs): Likewise.  Don't assign to read_md_filename.
(gen_insn): Update initialization after above changes.
(gen_delay): Likewise.
(write_insn_cases): Print the filename for a define_peephole.
(gen_insn_reserv): Take a line number as argument and update
the call to check_attr_test.
(main): Pass a line number to gen_insn_reserv.

Index: gcc/read-md.h
===
--- gcc/read-md.h   2015-07-06 21:58:18.110670200 +0100
+++ gcc/read-md.h   2015-07-06 21:58:36.117085875 +0100
@@ -136,6 +136,7 @@ extern void print_c_condition (const cha
 extern void fprint_c_condition (FILE *, const char *);
 extern void message_at (file_location, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void error_at (file_location, const char *, ...) ATTRIBUTE_PRINTF_2;
+extern void fatal_at (file_location, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void message_with_line (int, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void error_with_line (int, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void fatal_with_file_and_line (const char *, ...)
Index: gcc/read-md.c
===
--- gcc/read-md.c   2015-07-06 21:58:18.110670200 +0100
+++ gcc/read-md.c   2015-07-06 21:58:36.117085875 +0100
@@ -277,6 +277,19 @@ error_at (file_location loc, const char
   have_error = 1;
 }
 
+/* Like message_at, but treat the condition as a fatal error.  */
+
+void
+fatal_at (file_location loc, const char *msg, ...)
+{
+  va_list ap;
+
+  va_start (ap, msg);
+  message_at_1 (loc, msg, ap);
+  va_end (ap);
+  exit (1);
+}
+
 /* A printf-like function for reporting an error against line LINENO
in the current MD file.  */
 
Index: gcc/genattrtab.c
===
--- gcc/genattrtab.c2015-07-06 21:58:18.109586912 +0100
+++ gcc/genattrtab.c2015-07-06 21:58:36.116002587 +0100
@@ -139,8 +139,7 @@ struct insn_def
   rtx def; /* The DEFINE_...  */
   int insn_code;   /* Instruction number.  */
   int insn_index;  /* Expression number in file, for errors.  */
-  const char *filename;/* Filename.  */
-  int lineno;  /* Line number.  */
+  file_location loc;   /* Where in the .md files it occurs.  */
   int num_alternatives;/* Number of alternatives.  */
   int vec_idx; /* Index of attribute vector in `def'.  */
 };
@@ -177,7 +176,7 @@ struct attr_desc
   struct attr_desc *next;  /* Next attribute.  */
   struct attr_value *first_value; /* First value of this attribute.  */
   struct attr_value *default_val; /* Default value for this attribute.  */
-  int lineno : 24; /* Line number.  */
+  file_location loc;   /* Where in the .md files it occurs.  */
   unsigned is_numeric  : 1;/* Values of this attribute are numeric.  */
   unsigned is_const: 1;/* Attribute value constant for each run.  */
   unsigned is_special  : 1;/* Don't call `write_attr_set'.  */
@@ -189,8 +188,8 @@ struct delay_desc
 {
   rtx def; /* DEFINE_DELAY expression.  */
   struct delay_desc *next; /* Next DEFINE_DELAY.  */
+  file_location loc;   /* Where in the .md files it occurs.  */
   int num; /* Number of DEFINE_DELAY, starting at 1.  */
-  int lineno;  /* Line number.  */
 };
 
 struct attr_value_list
@@ -746,7 +745,7 @@ attr_copy_rtx (rtx orig)
Return the new expression, if any.  */
 
 static rtx
-check_attr_test (rtx exp, int is_const, int lineno)
+check_attr_test (rtx exp, int is_const, file_location loc)
 {
   struct attr_desc *attr;
   struct attr_value *av;
@@ -761,7 +760,7 @@ check_attr_test (rtx exp, int is_const,
return check_attr_test (attr_rtx (NOT,
  attr_eq (XSTR (exp, 0),
   &XSTR (exp, 1)[1])),
-  

[committed] Use a structure to record file positions in .md files

2015-07-16 Thread Richard Sandiford
Add a file_location type for recording the position in an .md file.  Also
add error_at and message_at functions for reporting diagnostics against
a particular position.

read_skip_construct is local to read-md.c so I removed its prototype rather
than update it.

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard
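
As a rough illustration of the idea described above (not the code from the
patch itself), the following standalone sketch pairs a file_location-like
structure with a printf-style reporter.  The names format_at and
error_at_sketch are hypothetical, the "FILE:LINE: MESSAGE" layout is an
assumption, and unlike the structure in the patch the default constructor
here initializes both fields.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

/* Same shape as the file_location structure in the patch, except that the
   default constructor initializes both fields.  */
struct file_location
{
  file_location () : filename (""), lineno (0) {}
  file_location (const char *filename_in, int lineno_in)
    : filename (filename_in), lineno (lineno_in) {}

  const char *filename;
  int lineno;
};

/* Stand-in for the have_error flag that read-md.c sets.  */
bool have_error = false;

/* Hypothetical helper: build "FILE:LINE: MESSAGE" from a printf-style
   format and an argument list.  */
std::string
format_at (file_location loc, const char *msg, va_list ap)
{
  assert (loc.filename != nullptr);
  char text[512];
  std::vsnprintf (text, sizeof text, msg, ap);
  return std::string (loc.filename) + ":" + std::to_string (loc.lineno)
         + ": " + text;
}

/* Sketch of an error_at-style reporter.  It returns the text as well as
   printing it so that the result is easy to check.  */
std::string
error_at_sketch (file_location loc, const char *msg, ...)
{
  va_list ap;
  va_start (ap, msg);
  std::string text = format_at (loc, msg, ap);
  va_end (ap);
  std::fprintf (stderr, "%s\n", text.c_str ());
  have_error = true;
  return text;
}
```

Passing the location by value keeps call sites short, for example
error_at_sketch (file_location ("foo.md", 42), "bad value %d", 7), and
bundling the filename with the line number means a diagnostic can no longer
be reported against the wrong file.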


gcc/
* read-md.h (file_location): New structure.
(directive_handler_t): Take a file_location rather than a line number.
(message_at, error_at): Declare.
(read_skip_construct): Delete.
* read-md.c (message_with_line_1): Replace with...
(message_at_1): ...this new function.
(message_at, error_at): New functions.
(message_with_line, error_with_line): Update to use message_at_1.
(handle_enum): Take a file_location rather than a line number
and use error_at for error reporting.
(handle_include): Likewise.
(read_skip_construct): Likewise.  Make static.
(handle_file): Update after above changes.  Pass a file_location
rather than a line number to handle_directive.
* gensupport.c (queue_elem): Replace separate filename and lineno
with a file_location.
(queue_pattern): Replace filename and lineno arguments with a
file_location.  Update after change to queue_elem.
(process_define_predicate): Replace lineno argument with a
file_location and use error_at for error reporting.  Update
after above changes.
(process_rtx): Likewise.
(subst_pattern_match): Likewise.
(get_alternatives_number): Likewise.
(alter_predicate_for_insn): Likewise.
(rtx_handle_directive): Likewise.
(is_predicable): Update after above changes, using error_at rather
than error_with_line.
(has_subst_attribute): Likewise.
(identify_predicable_attribute): Likewise.
(alter_attrs_for_subst_insn): Likewise.
(process_one_cond_exec): Likewise.
(process_substs_on_one_elem): Likewise.
(process_define_subst): Likewise.
(check_define_attr_duplicates): Likewise.
(read_md_rtx): Update after change to queue_elem.

Index: gcc/read-md.h
===
--- gcc/read-md.h   2015-07-16 14:21:19.492932398 +0100
+++ gcc/read-md.h   2015-07-16 14:21:19.484932490 +0100
@@ -22,6 +22,18 @@ #define GCC_READ_MD_H
 
 #include "obstack.h"
 
+/* Records a position in the file.  */
+struct file_location {
+  file_location () {}
+  file_location (const char *, int);
+
+  const char *filename;
+  int lineno;
+};
+
+inline file_location::file_location (const char *filename_in, int lineno_in)
+  : filename (filename_in), lineno (lineno_in) {}
+
 /* Holds one symbol or number in the .md file.  */
 struct md_name {
   /* The name as it appeared in the .md file.  Names are syntactically
@@ -79,10 +91,10 @@ struct enum_type {
 };
 
 /* A callback that handles a single .md-file directive, up to but not
-   including the closing ')'.  It takes two arguments: the line number on
-   which the directive started, and the name of the directive.  The next
+   including the closing ')'.  It takes two arguments: the file position
+   at which the directive started, and the name of the directive.  The next
unread character is the optional space after the directive name.  */
-typedef void (*directive_handler_t) (int, const char *);
+typedef void (*directive_handler_t) (file_location, const char *);
 
 extern const char *in_fname;
 extern FILE *read_md_file;
@@ -122,6 +134,8 @@ extern void fprint_md_ptr_loc (FILE *, c
 extern const char *join_c_conditions (const char *, const char *);
 extern void print_c_condition (const char *);
 extern void fprint_c_condition (FILE *, const char *);
+extern void message_at (file_location, const char *, ...) ATTRIBUTE_PRINTF_2;
+extern void error_at (file_location, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void message_with_line (int, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void error_with_line (int, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void fatal_with_file_and_line (const char *, ...)
@@ -131,7 +145,6 @@ extern int read_skip_spaces (void);
 extern void read_name (struct md_name *);
 extern char *read_quoted_string (void);
 extern char *read_string (int);
-extern void read_skip_construct (int, int);
 extern int n_comma_elts (const char *);
 extern const char *scan_comma_elt (const char **);
 extern void upcase_string (char *);
Index: gcc/read-md.c
===
--- gcc/read-md.c   2015-07-16 14:21:19.492932398 +0100
+++ gcc/read-md.c   2015-07-16 14:21:19.484932490 +0100
@@ -245,13 +245,38 @@ print_c_condition (const char *cond)
 

[PATCH] Fix PR ipa/66896

2015-07-16 Thread Martin Liška
Hello.

The following patch fixes $subject, which can be seen on gcc-5-branch; trunk
looks fine, although it can potentially suffer from the same issue.

The patch survives regression tests on both trunk and gcc-5-branch on
x86_64-linux-pc.

Ready for both branches?
Thanks,
Martin
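
For readers skimming the diff that follows: as far as can be told from the
hunk, the problem is that combine_with was called even when no destination
context had been created.  The sketch below shows the corrected shape with
simplified, hypothetical types (context, edge_args, get_ith_context,
update_context); std::vector::resize stands in for vec_safe_grow_cleared.
It is an illustration under those assumptions, not the ipa-prop.c code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for a polymorphic call context.  A zero offset is
   treated as "useless" purely for the sake of the example.  */
struct context
{
  explicit context (int offset_in = 0) : offset (offset_in) {}
  bool useless_p () const { return offset == 0; }
  void combine_with (const context &other) { offset += other.offset; }

  int offset;
};

/* Per-edge arguments; the context vector stays empty until it is needed.  */
struct edge_args
{
  std::vector<context> contexts;
};

/* Return the Ith context, or null if the vector has not been grown yet.  */
context *
get_ith_context (edge_args &args, std::size_t i)
{
  return i < args.contexts.size () ? &args.contexts[i] : nullptr;
}

/* Combine CTX into the Ith context of ARGS, creating the vector of COUNT
   entries only when CTX is useful and no destination exists yet.  */
void
update_context (edge_args &args, std::size_t i, std::size_t count,
                const context &ctx)
{
  assert (i < count);
  context *dst = get_ith_context (args, i);
  if (!ctx.useless_p ())
    {
      if (!dst)
        {
          args.contexts.resize (count);
          dst = get_ith_context (args, i);
        }
      /* DST is known to be non-null here.  */
      dst->combine_with (ctx);
    }
}
```

With this shape a useless context leaves the vector untouched, and the
dereference only happens on the path that has guaranteed a destination.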
From d6322e2c665cb45e1d44d9549ac5149ec10a667a Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 16 Jul 2015 14:19:32 +0200
Subject: [PATCH] Fix PR ipa/66896.

gcc/testsuite/ChangeLog:

2015-07-16  Martin Liska  

	* g++.dg/ipa/pr66896.C: New test.

gcc/ChangeLog:

2015-07-16  Martin Liska  

	PR ipa/66896.
	* ipa-prop.c (update_jump_functions_after_inlining): Create properly
	dst_ctx if it does not exist.
---
 gcc/ipa-prop.c | 12 
 gcc/testsuite/g++.dg/ipa/pr66896.C | 22 ++
 2 files changed, 30 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr66896.C

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 34e4826..3415856 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -2377,11 +2377,15 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs,
 	  ctx.offset_by (dst->value.ancestor.offset);
 	  if (!ctx.useless_p ())
 		{
-		  vec_safe_grow_cleared (args->polymorphic_call_contexts,
-	 count);
-		  dst_ctx = ipa_get_ith_polymorhic_call_context (args, i);
+		  if (!dst_ctx)
+		{
+		  vec_safe_grow_cleared (args->polymorphic_call_contexts,
+	 count);
+		  dst_ctx = ipa_get_ith_polymorhic_call_context (args, i);
+		}
+
+		  dst_ctx->combine_with (ctx);
 		}
-	  dst_ctx->combine_with (ctx);
 	}
 
 	  if (src->agg.items
diff --git a/gcc/testsuite/g++.dg/ipa/pr66896.C b/gcc/testsuite/g++.dg/ipa/pr66896.C
new file mode 100644
index 000..236537a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr66896.C
@@ -0,0 +1,22 @@
+// PR ipa/66896
+// { dg-do compile }
+
+void f2 (void *);
+void f3 ();
+
+struct A
+{
+  int *a;
+  A ();
+  ~A () { a3 (); }
+  int a1 (int * p) { if (!p) f3 (); f2 (p); }
+  void a3 () { if (*a) a1 (a); }
+};
+
+struct B : A {~B () { a3 ();}};
+
+struct F {};
+
+struct G : F {B g;};
+
+void foo () {G g;}
-- 
2.4.5



[committed] Remove write-only next_index_number from genoutput.c

2015-07-16 Thread Richard Sandiford
genoutput.c:next_index_number is only used to set a structure field that
itself is never used.  I think this dates from before the recent fad for
including filenames and line numbers in error messages.

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* genoutput.c (next_index_number): Delete.
(data): Remove index_number.
(gen_insn, gen_peephole, gen_expand, gen_split): Update accordingly.
(main): Remove manipulation of next_index_number.

Index: gcc/genoutput.c
===
--- gcc/genoutput.c 2015-07-16 14:17:16.423685516 +0100
+++ gcc/genoutput.c 2015-07-16 14:17:16.419685729 +0100
@@ -109,11 +109,6 @@ static const char *strip_whitespace(con
 
 static int next_code_number;
 
-/* This counts all definitions in the md file,
-   for the sake of error messages.  */
-
-static int next_index_number;
-
 /* This counts all operands used in the md file.  The first is null.  */
 
 static int next_operand_number = 1;
@@ -160,7 +155,6 @@ struct data
   const char *name;
   const char *template_code;
   int code_number;
-  int index_number;
   const char *filename;
   int lineno;
   int n_generator_args;/* Number of arguments passed to generator */
@@ -885,7 +879,6 @@ gen_insn (rtx insn, int lineno)
   int i;
 
   d->code_number = next_code_number;
-  d->index_number = next_index_number;
   d->filename = read_md_filename;
   d->lineno = lineno;
   if (XSTR (insn, 0)[0])
@@ -928,7 +921,6 @@ gen_peephole (rtx peep, int lineno)
   int i;
 
   d->code_number = next_code_number;
-  d->index_number = next_index_number;
   d->filename = read_md_filename;
   d->lineno = lineno;
   d->name = 0;
@@ -968,7 +960,6 @@ gen_expand (rtx insn, int lineno)
   int i;
 
   d->code_number = next_code_number;
-  d->index_number = next_index_number;
   d->filename = read_md_filename;
   d->lineno = lineno;
   if (XSTR (insn, 0)[0])
@@ -1014,7 +1005,6 @@ gen_split (rtx split, int lineno)
   int i;
 
   d->code_number = next_code_number;
-  d->index_number = next_index_number;
   d->filename = read_md_filename;
   d->lineno = lineno;
   d->name = 0;
@@ -1067,7 +1057,6 @@ main (int argc, char **argv)
 return (FATAL_EXIT_CODE);
 
   output_prologue ();
-  next_index_number = 0;
 
   /* Read the machine description.  */
 
@@ -1108,7 +1097,6 @@ main (int argc, char **argv)
default:
  break;
}
-  next_index_number++;
 }
 
   printf ("\n\n");



[committed] Remove unnecessary null checks from genattrtab.c

2015-07-16 Thread Richard Sandiford
check_attr_value and make_canonical have some checks for null attr arguments,
but all callers would segfault if the argument really were null.

This is preparing for some improvements to the source location tracking
in gen*.

Series bootstrapped & regression-tested on x86_64-linux-gnu.  I built gcc
before and after the series for one target per CPU directory and checked
that the output was the same (except for some filename fixes later in
the series.)  Applied.

Thanks,
Richard


gcc/
* genattrtab.c (check_attr_value): Remove handling of null attrs.
(make_canonical): Likewise.

Index: gcc/genattrtab.c
===
--- gcc/genattrtab.c   2015-07-05 22:19:53.407810340 +0100
+++ gcc/genattrtab.c   2015-07-06 21:42:39.666899334 +0100
@@ -899,7 +899,7 @@ check_attr_test (rtx exp, int is_const,
 
 /* Given an expression, ensure that it is validly formed and that all named
attribute values are valid for the given attribute.  Issue a fatal error
-   if not.  If no attribute is specified, assume a numeric attribute.
+   if not.
 
Return a perhaps modified replacement expression for the value.  */
 
@@ -913,7 +913,7 @@ check_attr_value (rtx exp, struct attr_d
   switch (GET_CODE (exp))
 {
 case CONST_INT:
-  if (attr && ! attr->is_numeric)
+  if (!attr->is_numeric)
{
  error_with_line (attr->lineno,
   "CONST_INT not valid for non-numeric attribute %s",
@@ -934,15 +934,15 @@ check_attr_value (rtx exp, struct attr_d
   if (! strcmp (XSTR (exp, 0), "*"))
break;
 
-  if (attr == 0 || attr->is_numeric)
+  if (attr->is_numeric)
{
  p = XSTR (exp, 0);
  for (; *p; p++)
if (! ISDIGIT (*p))
  {
-   error_with_line (attr ? attr->lineno : 0,
+   error_with_line (attr->lineno,
 "non-numeric value for numeric attribute %s",
-attr ? attr->name : "internal");
+attr->name);
break;
  }
  break;
@@ -956,13 +956,12 @@ check_attr_value (rtx exp, struct attr_d
   if (av == NULL)
error_with_line (attr->lineno,
 "unknown value `%s' for `%s' attribute",
-XSTR (exp, 0), attr ? attr->name : "internal");
+XSTR (exp, 0), attr->name);
   break;
 
 case IF_THEN_ELSE:
-  XEXP (exp, 0) = check_attr_test (XEXP (exp, 0),
-  attr ? attr->is_const : 0,
-  attr ? attr->lineno : 0);
+  XEXP (exp, 0) = check_attr_test (XEXP (exp, 0), attr->is_const,
+  attr->lineno);
   XEXP (exp, 1) = check_attr_value (XEXP (exp, 1), attr);
   XEXP (exp, 2) = check_attr_value (XEXP (exp, 2), attr);
   break;
@@ -972,7 +971,7 @@ check_attr_value (rtx exp, struct attr_d
 case MULT:
 case DIV:
 case MOD:
-  if (attr && !attr->is_numeric)
+  if (!attr->is_numeric)
{
  error_with_line (attr->lineno,
   "invalid operation `%s' for non-numeric"
@@ -1007,8 +1006,8 @@ check_attr_value (rtx exp, struct attr_d
   for (i = 0; i < XVECLEN (exp, 0); i += 2)
{
  XVECEXP (exp, 0, i) = check_attr_test (XVECEXP (exp, 0, i),
-attr ? attr->is_const : 0,
-attr ? attr->lineno : 0);
+attr->is_const,
+attr->lineno);
  XVECEXP (exp, 0, i + 1)
= check_attr_value (XVECEXP (exp, 0, i + 1), attr);
}
@@ -1020,15 +1019,13 @@ check_attr_value (rtx exp, struct attr_d
   {
struct attr_desc *attr2 = find_attr (&XSTR (exp, 0), 0);
if (attr2 == NULL)
- error_with_line (attr ? attr->lineno : 0,
-  "unknown attribute `%s' in ATTR",
+ error_with_line (attr->lineno, "unknown attribute `%s' in ATTR",
   XSTR (exp, 0));
-   else if (attr && attr->is_const && ! attr2->is_const)
+   else if (attr->is_const && ! attr2->is_const)
  error_with_line (attr->lineno,
   "non-constant attribute `%s' referenced from `%s'",
   XSTR (exp, 0), attr->name);
-   else if (attr
-&& attr->is_numeric != attr2->is_numeric)
+   else if (attr->is_numeric != attr2->is_numeric)
  error_with_line (attr->lineno,
   "numeric attribute mismatch calling `%s' from `%s'",
   XSTR (exp, 0), attr->name);
@@ -1042,7 +1039,7 @@ check_attr_value (rtx exp, struct attr_d
   return attr_rtx (SYMBOL_REF, XSTR (exp, 0));
 
 default:
-  error_with_line 

[SPARC] Use adjust_address instead of adjust_address_nv in sparc.md

2015-07-16 Thread Eric Botcazou
3 patterns related to setjmp/longjmp/non-local gotos call adjust_address_nv 
and this can break (generate unrecognizable insns) when things get serious, 
for example if you configure with --enable-sjlj-exceptions.

Tested on SPARC/Solaris, applied on the mainline.


2015-07-16  Eric Botcazou  

* config/sparc/sparc.md (save_stack_nonlocal): Use adjust_address
instead of adjust_address_nv.
(restore_stack_nonlocal): Likewise.
(nonlocal_goto): Likewise.

-- 
Eric Botcazou
Index: config/sparc/sparc.md
===
--- config/sparc/sparc.md	(revision 225727)
+++ config/sparc/sparc.md	(working copy)
@@ -6704,8 +6704,8 @@ (define_expand "save_stack_nonlocal"
(set (match_dup 2) (match_dup 3))]
   ""
 {
-  operands[0] = adjust_address_nv (operands[0], Pmode, 0);
-  operands[2] = adjust_address_nv (operands[0], Pmode, GET_MODE_SIZE (Pmode));
+  operands[0] = adjust_address (operands[0], Pmode, 0);
+  operands[2] = adjust_address (operands[0], Pmode, GET_MODE_SIZE (Pmode));
   operands[3] = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
 })
 
@@ -6714,7 +6714,7 @@ (define_expand "restore_stack_nonlocal"
 	(match_operand 1 "memory_operand" ""))]
   ""
 {
-  operands[1] = adjust_address_nv (operands[1], Pmode, 0);
+  operands[1] = adjust_address (operands[1], Pmode, 0);
 })
 
 (define_expand "nonlocal_goto"
@@ -6726,9 +6726,9 @@ (define_expand "nonlocal_goto"
 {
   rtx i7 = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
   rtx r_label = copy_to_reg (operands[1]);
-  rtx r_sp = adjust_address_nv (operands[2], Pmode, 0);
+  rtx r_sp = adjust_address (operands[2], Pmode, 0);
   rtx r_fp = operands[3];
-  rtx r_i7 = adjust_address_nv (operands[2], Pmode, GET_MODE_SIZE (Pmode));
+  rtx r_i7 = adjust_address (operands[2], Pmode, GET_MODE_SIZE (Pmode));
 
   /* We need to flush all the register windows so that their contents will
  be re-synchronized by the restore insn of the target function.  */


Re: Constify host-side offload data

2015-07-16 Thread Nathan Sidwell

On 07/16/15 07:41, Ilya Verbin wrote:

On Wed, Jul 15, 2015 at 20:56:50 -0400, Nathan Sidwell wrote:

Index: gcc/config/nvptx/mkoffload.c
===
-  fprintf (out, "extern void *__OFFLOAD_TABLE__[];\n\n");
+  fprintf (out, "extern const void *conat __OFFLOAD_TABLE__[];\n\n");


Here is a typo.


Thanks, caught that myself too.  testing shows the patch ok for x86-linux/ptx

nathan


Re: [PATCH][4/n] Remove GENERIC stmt combining from SCCVN

2015-07-16 Thread Andrew MacLeod

On 07/16/2015 03:27 AM, Richard Biener wrote:

On Wed, 15 Jul 2015, Andrew MacLeod wrote:


Admittedly, I suspect neither situation is very common, but it does seem like a
hidden gotcha waiting to happen.

I guess we either want to checking-assert that we never hit that
special marker or handle it appropriately.  Or even better avoid
it in the first place (not sure why we have it - I suppose to allow
modifying immediate uses of the current stmt from inside
FOR_EACH_IMM_USE_STMT).

For me single_imm_use_1 crashed on the NULL USE_STMT at

 if (!is_gimple_debug (USE_STMT (ptr)))

so I presume all was fine until debug stmts were introduced
(well, fine as in not crashing, not as in giving correct answers).


Yes, it was probably still wrong; we just erred by reporting that a real
single-use statement wasn't one.


The marker is unique in that the STMT field is NULL, which can't happen
otherwise.


I'll think about how to efficiently get this right

Andrew
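
To make the marker problem discussed above concrete, here is a small
standalone sketch of one of the options mentioned (handling the marker
rather than asserting on it).  The types stmt and use_node and the function
single_nondebug_use_p are hypothetical simplifications, not GCC's
immediate-use data structures, and this is not necessarily the fix that
will be adopted.

```cpp
#include <cassert>

/* Hypothetical, simplified stand-ins for a statement and for one node of
   a circular immediate-use list.  */
struct stmt
{
  bool is_debug;
};

struct use_node
{
  use_node *next;
  stmt *user;   /* Null only for an iterator marker.  */
};

/* Return true if exactly one non-debug statement appears on the circular
   list anchored at HEAD.  Marker nodes are skipped instead of being
   dereferenced.  */
bool
single_nondebug_use_p (const use_node *head)
{
  assert (head != nullptr);
  int count = 0;
  for (const use_node *p = head->next; p != head; p = p->next)
    {
      if (p->user == nullptr)
        continue;
      if (!p->user->is_debug && ++count > 1)
        return false;
    }
  return count == 1;
}
```

Without the null check, the test on the user's is_debug field would
dereference the marker, which corresponds to the crash described for
single_imm_use_1.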




Re: Constify host-side offload data

2015-07-16 Thread Ilya Verbin
On Wed, Jul 15, 2015 at 20:56:50 -0400, Nathan Sidwell wrote:
> Index: gcc/config/nvptx/mkoffload.c
> ===
> -  fprintf (out, "extern void *__OFFLOAD_TABLE__[];\n\n");
> +  fprintf (out, "extern const void *conat __OFFLOAD_TABLE__[];\n\n");

Here is a typo.

  -- Ilya


[PATCH] Fix PR66894

2015-07-16 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-07-16  Richard Biener  

PR tree-optimization/66894
* tree-vrp.c (register_edge_assert_for_2): Fix bad assumption
about deriving NE_EXPR from truncated values.

* gcc.dg/torture/pr66894.c: New testcase.
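
The bad assumption named in the ChangeLog can be checked with plain
arithmetic: a wide value that differs from a constant may still equal it
after truncation, even when the constant fits in the narrow type.  The
sketch below uses unsigned types so that the conversion is well defined;
the function names are hypothetical and the widths (16 and 8 bits) are
just an example.

```cpp
#include <cassert>

/* Truncate a 16-bit value to 8 bits, in the spirit of "char d = a" in the
   testcase.  Unsigned types keep the conversion well defined.  */
constexpr unsigned char
truncate_to_8 (unsigned short wide)
{
  return static_cast<unsigned char> (wide);
}

/* Given WIDE != VAL, return true if the inequality still holds after both
   sides are truncated to 8 bits.  */
bool
inequality_survives_truncation (unsigned short wide, unsigned short val)
{
  assert (wide != val);
  return truncate_to_8 (wide) != truncate_to_8 (val);
}
```

For instance, 768 != 0 holds and 0 fits in 8 bits, yet truncate_to_8 (768)
is 0, so inequality_survives_truncation (768, 0) is false.  That is why
checking only that the constant fits in the truncated type is not enough
to record an inequality.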

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 225860)
+++ gcc/tree-vrp.c  (working copy)
@@ -5382,13 +5382,11 @@ register_edge_assert_for_2 (tree name, e
}
  else if (CONVERT_EXPR_CODE_P (code))
{
- /* For truncating conversions require that the constant
-fits in the truncated type if we are going to record
+ /* For truncating conversions we cannot record
 an inequality.  */
  if (comp_code == NE_EXPR
  && (TYPE_PRECISION (TREE_TYPE (name2))
- < TYPE_PRECISION (TREE_TYPE (name)))
- && ! int_fits_type_p (val, TREE_TYPE (name2)))
+ < TYPE_PRECISION (TREE_TYPE (name))))
continue;
  cst = fold_convert (TREE_TYPE (name2), val);
}
Index: gcc/testsuite/gcc.dg/torture/pr66894.c
===
--- gcc/testsuite/gcc.dg/torture/pr66894.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr66894.c  (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+
+short a, b;
+
+int
+main ()
+{
+  for (; a != 1; a += 3)
+{
+  int c = 0;
+  for (; c < 2; c++)
+   if (a)
+ {
+   char d = a;
+   b = d ? 1 / d : 0; 
+ }
+   else
+ break;
+}
+  return 0;
+}


Re: [RFC, PR66873] Use graphite for parloops

2015-07-16 Thread Tom de Vries

On 16/07/15 12:23, Richard Biener wrote:

On Thu, Jul 16, 2015 at 12:19 PM, Thomas Schwinge wrote:

Hi Tom!

On Thu, 16 Jul 2015 10:46:00 +0200, Richard Biener wrote:

On Wed, Jul 15, 2015 at 10:26 PM, Tom de Vries  wrote:

I tried to parallelize this fortran test-case (based on autopar/outer-1.c),
[...]



So I wondered, why not always use the graphite dependency analysis in
parloops. (Of course you could use -floop-parallelize-all, but that also
changes the heuristic). So I wrote a patch for parloops to use graphite
dependency analysis by default (so without -floop-parallelize-all), but
while testing found out that all the reduction test-cases started failing
because the modifications graphite makes to the code mess up the parloops
reduction analysis.

Then I came up with this patch, which:
- first runs a parloops pass, restricted to reduction loops only,
- then runs graphite dependency analysis
- followed by a normal parloops pass run.

This way, we get to both:
- compile the reduction testcases as before, and
- profit from the better graphite dependency analysis otherwise.



graphite dependence analysis is too slow to be enabled unconditionally.
(read: hours in some simple cases - see bugzilla)


Haha, "cool"!  ;-)

Maybe it is still reasonable to use graphite to analyze the code inside
OpenACC kernels regions -- maybe such code can reasonably be expected to
not have the properties that make its analysis lengthy?  So, Tom, could
you please identify and check such PRs, to get an understanding of what
these properties are?


Like the one in PR62113 or 53852 or 59121.


PR62113 and PR59121 do not reproduce for me on trunk.

PR53852 does reproduce for me (to the point that I had to reset my laptop).

Thanks,
- Tom


Re: [PATCH 1/3] tree-ssa-tail-merge: add IPA ICF infrastructure.

2015-07-16 Thread Martin Liška
On 07/09/2015 06:24 PM, Jeff Law wrote:
> On 07/09/2015 07:56 AM, mliska wrote:
>> gcc/ChangeLog:
>>
>> 2015-07-09  Martin Liska  
>>
>> * dbgcnt.def: Add new debug counter.
>> * ipa-icf-gimple.c (func_checker::compare_ssa_name): Add flag
>> for strict mode.
>> (func_checker::compare_memory_operand): Likewise.
>> (func_checker::compare_cst_or_decl): Handle if we are in
>> tail_merge_mode.
>> (func_checker::compare_operand): Pass strict flag properly.
>> (func_checker::stmt_local_def): New function.
>> (func_checker::compare_phi_node): Move from sem_function class.
>> (func_checker::compare_bb_tail_merge): New function.
>> (func_checker::compare_bb): Improve STMT iteration.
>> (func_checker::compare_gimple_call): Pass strict flag.
>> (func_checker::compare_gimple_assign): Likewise.
>> (func_checker::compare_gimple_label): Remove unused flag.
>> (ssa_names_set): New class.
>> (ssa_names_set::build): New function.
>> * ipa-icf-gimple.h (func_checker::gsi_next_nonlocal): New
>> function.
>> (ssa_names_set::contains): New function.
>> (ssa_names_set::add): Likewise.
>> * ipa-icf.c (sem_function::equals_private): Use transformed
>> function.
>> (sem_function::compare_phi_node): Move to func_checker class.
>> * ipa-icf.h: Add new declarations.
>> * tree-ssa-tail-merge.c (check_edges_correspondence): New
>> function.
>> (find_duplicate): Add usage of IPA ICF gimple infrastructure.
>> (find_clusters_1): Pass new sem_function argument.
>> (find_clusters): Likewise.
>> (tail_merge_optimize): Call IPA ICF comparison machinery.
> So a general question.  We're passing in STRICT to several routines, which is 
> fine.  But then we're also checking M_TAIL_MERGE_MODE.  What's the difference 
> between the two?  Can they be unified?

Hello.

I would say that STRICT is a fairly generic mechanism that was introduced some
time ago.  It is used, e.g., for checking THIS arguments of methods, and it
makes checking more sensitive in situations that are somehow special.

The newly added state is orthogonal to the previous one.

> 
> 
>>
>> -/* Verifies that trees T1 and T2 are equivalent from perspective of ICF.  */
>> +/* Verifies that trees T1 and T2 are equivalent from perspective of ICF.
>> +   If STRICT flag is true, versions must match strictly.  */
>>
>>   bool
>> -func_checker::compare_ssa_name (tree t1, tree t2)
>> +func_checker::compare_ssa_name (tree t1, tree t2, bool strict)
> This (and other) functions would seem to be used more than just ICF at this 
> point.  A pass over the comments to update them as appropriate would be 
> appreciated.
> 
>> @@ -626,6 +648,136 @@ func_checker::parse_labels (sem_bb *bb)
>>   }
>>   }
>>
>> +/* Return true if gimple STMT is just a local difinition in a
>> +   basic block.  Used SSA names are contained in SSA_NAMES_SET.  */
> s/difinition/definition/

Thanks.

> 
> I didn't find this comment particularly useful in understanding what this 
> function does.  AFAICT the function looks as the uses of the LHS of STMT and 
> verifies they're all in the same block as STMT, right?
> 
> It also verifies that the none of the operands within STMT are part of 
> SSA_NAMES_SET.
> 
> What role do those properties play in the meaning of "local definition"?

I tried to explain the purpose of this function in more detail.

> 
> 
> 
> 
>> @@ -1037,4 +1205,60 @@ func_checker::compare_gimple_asm (const gasm *g1, 
>> const gasm *g2)
>> return true;
>>   }
>>
>> +void
>> +ssa_names_set::build (basic_block bb)
> Needs a function comment.  What are the "important" names we're collecting 
> here?
> 
> Is a single forward and backward pass really sufficient to find all the 
> important names?
> 
> In the backward pass, do you have to consider things like ASMs?  I guess it's 
> difficult to understand what you need to look at because it's not entirely 
> clear the set of SSA_NAMEs you're building.
> 
> 
> 
>> @@ -149,12 +153,20 @@ public:
>>mapping between basic blocks and labels.  */
>> void parse_labels (sem_bb *bb);
>>
>> +  /* For given basic blocks BB1 and BB2 (from functions FUNC1 and FUNC),
>> + true value is returned if phi nodes are semantically
>> + equivalent in these blocks.  */
>> +  bool compare_phi_node (sem_bb *sem_bb1, sem_bb *sem_bb2);
> Presumably in the case of tail merging, FUNC1 and FUNC will be the same :-)

Yes, the function is not called from the tail-merge pass.

> 
> 
>> /* Verifies that trees T1 and T2 are equivalent from perspective of ICF. 
>>  */
>> -  bool compare_ssa_name (tree t1, tree t2);
>> +  bool compare_ssa_name (tree t1, tree t2, bool strict = true);
>>
>> /* Verification function for edges E1 and E2.  */
>> bool compare_edge (edge e1, edge e2);
>> @@ -204,7 +216,7 @@ public:
>> bool compare_tree_ssa_label (tree t1, tree t2);
>>
>> /* Function compare for equality given memory operands T1 and T2.  */
>> -  
