date:20151112

Re: gcc-6/changes.html : Document AMD znver1

2015-11-12 Thread Gerald Pfeifer

On Thu, 12 Nov 2015, Stepanyan, Victoria wrote:
> This patch adds znver1 description in changes.html:
:
> Ok for trunk?

Thanks, yes, this looks good to me.

Gerald

Re: [PATCH] More compile-time saving in BB vectorization

2015-11-12 Thread Christophe Lyon

On 12 November 2015 at 16:49, Andreas Schwab  wrote:
> Richard Biener  writes:
>
>>   * tree-vectorizer.h (vect_slp_analyze_and_verify_instance_alignment):
>>   Declare.
>>   (vect_analyze_data_refs_alignment): Make loop vect specific.
>>   (vect_verify_datarefs_alignment): Likewise.
>>   * tree-vect-data-refs.c (vect_slp_analyze_data_ref_dependences):
>>   Add missing continue.
>>   (vect_compute_data_ref_alignment): Export.
>>   (vect_compute_data_refs_alignment): Merge into...
>>   (vect_analyze_data_refs_alignment): ... this.
>>   (verify_data_ref_alignment): Split out from ...
>>   (vect_verify_datarefs_alignment): ... here.
>>   (vect_slp_analyze_and_verify_node_alignment): New function.
>>   (vect_slp_analyze_and_verify_instance_alignment): Likewise.
>>   * tree-vect-slp.c (vect_supported_load_permutation_p): Remove
>>   misplaced checks on alignment.
>>   (vect_slp_analyze_bb_1): Add fatal output parameter.  Do
>>   alignment analysis after SLP discovery and do it per instance.
>>   (vect_slp_bb): When vect_slp_analyze_bb_1 fatally failed do not
>>   bother to re-try using different vector sizes.
>
> This breaks libgfortran on ia64:
>
> ../../../libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
> ../../../libgfortran/generated/matmul_c4.c:79:1: internal compiler error: in vectorizable_store, at tree-vect-stmts.c:5651
>  matmul_c4 (gfc_array_c4 * const restrict retarray,
>  ^
> 0x410ff01f vectorizable_store
> ../../gcc/tree-vect-stmts.c:5651
> 0x41115b5f vect_transform_stmt(gimple*, gimple_stmt_iterator*, bool*, _slp_tree*, _slp_instance*)
> ../../gcc/tree-vect-stmts.c:8003
> 0x4114df1f vect_schedule_slp_instance
> ../../gcc/tree-vect-slp.c:3484
> 0x41154d6f vect_schedule_slp(vec_info*)
> ../../gcc/tree-vect-slp.c:3549
> 0x411562bf vect_slp_bb(basic_block_def*)
> ../../gcc/tree-vect-slp.c:2543
> 0x41159f2f execute
> ../../gcc/tree-vectorizer.c:734
>

Same problem on armeb.


> Andreas.
>
> --
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

Re: [PATCH], Add power9 support to GCC, patch #8 (add integer multiply/add)

2015-11-12 Thread David Edelsohn

On Tue, Nov 10, 2015 at 1:39 PM, Michael Meissner
 wrote:
> This patch adds support for the MADDLD instruction, which is a fused
> multiply/add instruction for integers.  At this time, it is for 64-bit
> multiplies only.  Eventually, we will restructure 128-bit multiply so that we
> can use the 64x64 + 64 high bit variants.
>
> I have bootstrapped a compiler with this change in and there were no
> regressions.  Is it ok to apply to the trunk?
>
> [gcc]
> 2015-11-10  Michael Meissner  
>
> * config/rs6000/rs6000.h (TARGET_MADDLD): Add support for the ISA
> 3.0 integer multiply-add instruction.
> * config/rs6000/rs6000.md (mul3): Likewise.
>
> [gcc/testsuite]
> 2015-11-10  Michael Meissner  
>
> * gcc.target/powerpc/maddld.c: New test.

Okay.

Thanks, David
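
For readers who have not seen the instruction: the kind of source that a fused integer multiply/add could cover is sketched below.  This is only an illustration under assumptions; the function name is made up, the contents of the real maddld.c test are not shown in this thread, and whether GCC actually selects MADDLD for it depends on the target options in use.

```c
/* Hypothetical example, not taken from gcc.target/powerpc/maddld.c.
   A 64-bit multiply whose result feeds an add is the shape that a
   fused integer multiply-add instruction is meant to cover.  */
long
multiply_add (long a, long b, long c)
{
  return a * b + c;
}
```

The computation itself is plain C, so it gives the same result on any target; only the instruction selection would differ.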

Re: [gomp4.5] depend nowait support for target

2015-11-12 Thread Ilya Verbin

On Thu, Nov 12, 2015 at 18:45:09 +0100, Jakub Jelinek wrote:
> But the testcase I wrote (target-33.c) hangs, the problem is in the
>   #pragma omp target nowait map (tofrom: a, b) depend(out: d[3])
>   {
> #pragma omp atomic update
> a = a + 9;
> b -= 8;
>   }
>   #pragma omp target nowait map (tofrom: a, c) depend(out: d[4])
>   {
> #pragma omp atomic update
> a = a + 4;
> c >>= 1;
>   }
>   #pragma omp task if (0) depend (in: d[3], d[4])
>   if (a != 50 || b != 4 || c != 20)
> abort ();
> part, where (I should change that for the case of no dependencies
> eventually) the task with map_vars+async_run is queued in both cases,
> then we reach GOMP_task, which calls gomp_task_maybe_wait_for_dependencies
> which spawns the first half task (map_vars+async_run), and then
> the second half task (map_vars+async_run), but that one gets stuck somewhere
> in liboffloadmic, then some other thread (from liboffloadmic) calls
> GOMP_PLUGIN_target_task_completion and enqueues the second half of the first
> target task (unmap_vars), but as the only normal thread in the main program
> is stuck in liboffloadmic (during gomp_map_vars, trying to allocate
> target memory in the plugin), there is no thread to schedule the second half
> of first target task.  So, if liboffloadmic is stuck waiting for unmap_vars,
> it is a deadlock.  Can you please try to debug this?

I'm unable to reproduce the hang (have tried various values of OMP_NUM_THREADS).
The testcase just aborts at (a != 50 || b != 4 || c != 20), because
a == 37, b == 12, c == 40.
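
One way to read these numbers, offered as an inference rather than something stated in the thread: if a, b and c start at 37, 12 and 40 (the initial values in target-33.c are not quoted here, so this is an assumption), then applying both target regions in order gives exactly the 50, 4 and 20 that the test expects.  The observed values would then be the untouched starting values.  A small host-only sketch of that arithmetic, with made-up names:

```c
/* Host-only model of the two quoted target regions.  The struct and the
   function name are illustrative; no offloading is involved.  */
struct state { int a, b, c; };

static struct state
apply_both_regions (struct state s)
{
  /* First target region: a = a + 9; b -= 8;  */
  s.a = s.a + 9;
  s.b -= 8;
  /* Second target region: a = a + 4; c >>= 1;  */
  s.a = s.a + 4;
  s.c >>= 1;
  return s;
}
```

Starting from { 37, 12, 40 } this yields { 50, 4, 20 }, which matches the condition checked before abort ().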

BTW, I don't know whether this is a bug or not:
Conditional jump or move depends on uninitialised value(s)
   at 0x4C2083D: priority_queue_insert (priority_queue.h:347)
   by 0x4C24DF9: GOMP_PLUGIN_target_task_completion (task.c:678)

  -- Ilya

Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2015-11-12 Thread Jeff Law


On 11/12/2015 12:40 PM, Richard Biener wrote:

On November 12, 2015 7:32:57 PM GMT+01:00, Jeff Law 
wrote:

On 11/12/2015 10:05 AM, Jeff Law wrote:

But IIRC you mentioned it should enable vectorization or so?  In this
case that's obviously too late.

The opposite.  Path splitting interferes with if-conversion &
vectorization.  Path splitting mucks up the CFG enough that
if-conversion won't fire and as a result vectorization is
inhibited.  It also creates multi-latch loops, which isn't a great
situation either.

It *may* be the case that dropping it that far down in the pipeline
and making the modifications necessary to handle simple latches may
in turn make the path splitting code play better with if-conversion
and vectorization and avoid creation of multi-latch loops.  At least
that's how it looks on paper when I draw out the CFG manipulations.

I'll do some experiments.

It doesn't look too terrible to revamp the recognition code to
work later in the pipeline with simple latches.  Sadly that doesn't
seem to have fixed the bad interactions with if-conversion.

*But* that does open up the possibility of moving the path
splitting pass even deeper in the pipeline -- in particular we can
move it past the vectorizer.  Which may be a win.

So the big question is whether or not we'll still see enough
benefits from having it so late in the pipeline.  It's still early
enough that we get DOM, VRP, reassoc, forwprop, phiopt, etc.

Ajit, I'll pass along an updated patch after doing some more
testing.


BTW, if you do not use loops_normal for loop init you don't get simple
latches forced (and cfg-cleanup will remove them)

I think I'd prefer to have loops start in simple-latches form and 
preserve the simple-latches form.  Detection is slightly harder, but 
transformation without creating multiple latches is easier.


jeff

tm_p.h for bb-reorder.c?

2015-11-12 Thread Mike Stump

My port needs the below patch.  I think this was reduced by someone on a port 
that didn’t use some features (TARGET_SHORT_BRANCH_CHEAPER) of tm.h. 

So, the question is, is this the preferred way to do this?  I don’t want to 
hookize TARGET_SHORT_BRANCH_CHEAPER, which is the other fix.

If yes, Ok?

diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 3dcb82a..7a2b351 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -114,6 +114,7 @@
 #include "bb-reorder.h"
 #include "except.h"
 #include "fibonacci_heap.h"
+#include "tm_p.h"
 
 /* The number of rounds.  In most cases there will only be 4 rounds, but
when partitioning hot and cold basic blocks into separate sections of

Re: tm_p.h for bb-reorder.c?

2015-11-12 Thread Jeff Law


On 11/12/2015 12:46 PM, Mike Stump wrote:

My port needs the below patch.  I think this was reduced by someone on a port 
that didn’t use some features (TARGET_SHORT_BRANCH_CHEAPER) of tm.h.

So, the question is, is this the preferred way to do this?  I don’t want to 
hookize TARGET_SHORT_BRANCH_CHEAPER, which is the other fix.

If yes, Ok?
I'd prefer not.  We will likely run the reducers regularly and each time 
someone is going to look at this and have to remember that it's dealing 
with a need that is specific to an out-of-tree port.


I'd recommend hookizing.


Jeff

Re: [PATCH 02/02] C FE: add fix-it hint for . vs ->

2015-11-12 Thread Joseph Myers

On Thu, 12 Nov 2015, David Malcolm wrote:

> On Tue, 2015-11-10 at 17:55 +, Joseph Myers wrote:
> > On Tue, 10 Nov 2015, David Malcolm wrote:
> > 
> > > This is the most trivial example of a real fix-it example I could think
> > > of: if the user writes
> > >   ptr.field
> > > rather than ptr->field.
> > > 
> > > gcc/c/ChangeLog:
> > >   * c-typeck.c (build_component_ref): Special-case POINTER_TYPE when
> > >   generating a "not a structure or union"  error message, and
> > >   suggest a "->" rather than a ".", providing a fix-it hint.
> > 
> > I wonder if this should be restricted to the case where the pointer's 
> > target is of structure or union type.  
> 
> Probably.  Attached is an updated version of the patch that does so.

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[rs6000] Rotate stack checking loop

2015-11-12 Thread Eric Botcazou

Hi,

this patch rotates the loop generated in the prologue to do stack checking 
when -fstack-check is specified, thereby saving one branch instruction.  It 
was initially implemented as a WHILE loop to match the generic implementation 
but can be turned into a DO-WHILE loop because the amount of stack to be 
checked is known at compile time (since it's the static part of the frame).
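
The effect of the rotation can be modelled in plain C.  The sketch below is not GCC code: PROBE_INTERVAL is given an arbitrary value, the two functions and the counter struct are made up, and the "probe" is only counted.  It assumes, as the paragraph above says, that at least one probe is always needed, which is what makes the DO-WHILE form safe.

```c
#include <assert.h>

/* Illustrative value only; the real interval is target-defined.  */
enum { PROBE_INTERVAL = 4096 };

struct loop_stats { unsigned probes; unsigned branches; };

/* WHILE form: a conditional exit branch at the top of the loop and an
   unconditional branch back at the bottom.  */
static struct loop_stats
probe_while (long test_addr, long last_addr)
{
  struct loop_stats s = { 0, 0 };
  for (;;)
    {
      s.branches++;		/* Conditional branch to the end label.  */
      if (test_addr == last_addr)
	break;
      test_addr -= PROBE_INTERVAL;
      s.probes++;		/* Probe at TEST_ADDR.  */
      s.branches++;		/* Unconditional branch to the loop label.  */
    }
  return s;
}

/* DO-WHILE form: a single conditional branch back at the bottom.  */
static struct loop_stats
probe_do_while (long test_addr, long last_addr)
{
  struct loop_stats s = { 0, 0 };
  /* The rotated loop requires at least one whole interval to probe.  */
  assert (test_addr > last_addr
	  && (test_addr - last_addr) % PROBE_INTERVAL == 0);
  do
    {
      test_addr -= PROBE_INTERVAL;
      s.probes++;		/* Probe at TEST_ADDR.  */
      s.branches++;		/* Conditional branch to the loop label.  */
    }
  while (test_addr != last_addr);
  return s;
}
```

For three intervals both forms perform three probes, but the WHILE model executes seven branches against three for the DO-WHILE model, and the loop body contains one branch instruction instead of two.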

Tested on PowerPC/Linux, OK for the mainline?


2015-11-12  Eric Botcazou  

* config/rs6000/rs6000.c (rs6000_emit_probe_stack_range): Adjust.
(output_probe_stack_range): Rotate the loop and simplify.

-- 
Eric Botcazou

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 230204)
+++ config/rs6000/rs6000.c	(working copy)
@@ -23988,11 +23988,12 @@ rs6000_emit_probe_stack_range (HOST_WIDE
 
   /* Step 3: the loop
 
-	 while (TEST_ADDR != LAST_ADDR)
+	 do
 	   {
 	 TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
 	 probe at TEST_ADDR
 	   }
+	 while (TEST_ADDR != LAST_ADDR)
 
 	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
 	 until it is equal to ROUNDED_SIZE.  */
@@ -24018,39 +24019,35 @@ const char *
 output_probe_stack_range (rtx reg1, rtx reg2)
 {
   static int labelno = 0;
-  char loop_lab[32], end_lab[32];
+  char loop_lab[32];
   rtx xops[2];
 
-  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
-  ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
+  /* Loop.  */
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
 
-  /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
   xops[0] = reg1;
+  xops[1] = GEN_INT (-PROBE_INTERVAL);
+  output_asm_insn ("addi %0,%0,%1", xops);
+
+  /* Probe at TEST_ADDR.  */
+  xops[1] = gen_rtx_REG (Pmode, 0);
+  output_asm_insn ("stw %1,0(%0)", xops);
+
+  /* Test if TEST_ADDR == LAST_ADDR.  */
   xops[1] = reg2;
   if (TARGET_64BIT)
 output_asm_insn ("cmpd 0,%0,%1", xops);
   else
 output_asm_insn ("cmpw 0,%0,%1", xops);
 
-  fputs ("\tbeq 0,", asm_out_file);
-  assemble_name_raw (asm_out_file, end_lab);
-  fputc ('\n', asm_out_file);
-
-  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
-  xops[1] = GEN_INT (-PROBE_INTERVAL);
-  output_asm_insn ("addi %0,%0,%1", xops);
-
-  /* Probe at TEST_ADDR and branch.  */
-  xops[1] = gen_rtx_REG (Pmode, 0);
-  output_asm_insn ("stw %1,0(%0)", xops);
-  fprintf (asm_out_file, "\tb ");
+  /* Branch.  */
+  fputs ("\tbne 0,", asm_out_file);
   assemble_name_raw (asm_out_file, loop_lab);
   fputc ('\n', asm_out_file);
 
-  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, end_lab);
-
   return "";
 }

[mips] Rotate stack checking loop

2015-11-12 Thread Eric Botcazou

Hi,

this patch rotates the loop generated in the prologue to do stack checking 
when -fstack-check is specified, thereby saving one branch instruction.  It 
was initially implemented as a WHILE loop to match the generic implementation 
but can be turned into a DO-WHILE loop because the amount of stack to be 
checked is known at compile time (since it's the static part of the frame).

Unfortunately I don't have access to MIPS hardware any more so I only verified 
that the assembly code is as expected and can be assembled.   OK for mainline?


2015-11-12  Eric Botcazou  

* config/mips/mips.c (mips_emit_probe_stack_range): Adjust.
(mips_output_probe_stack_range): Rotate the loop and simplify.

-- 
Eric Botcazou

Index: config/mips/mips.c
===
--- config/mips/mips.c	(revision 230204)
+++ config/mips/mips.c	(working copy)
@@ -11335,11 +11335,12 @@ mips_emit_probe_stack_range (HOST_WIDE_I
 
   /* Step 3: the loop
 
-	while (TEST_ADDR != LAST_ADDR)
+	do
 	  {
 	TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
 	probe at TEST_ADDR
 	  }
+	while (TEST_ADDR != LAST_ADDR)
 
 	probes at FIRST + N * PROBE_INTERVAL for values of N from 1
 	until it is equal to ROUNDED_SIZE.  */
@@ -11365,38 +11366,31 @@ const char *
 mips_output_probe_stack_range (rtx reg1, rtx reg2)
 {
   static int labelno = 0;
-  char loop_lab[32], end_lab[32], tmp[64];
+  char loop_lab[32], tmp[64];
   rtx xops[2];
 
-  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
-  ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
+  /* Loop.  */
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
 
-  /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
-  xops[0] = reg1;
-  xops[1] = reg2;
-  strcpy (tmp, "%(%

[sparc] Rotate stack checking loop

2015-11-12 Thread Eric Botcazou

Hi,

this patch rotates the loop generated in the prologue to do stack checking 
when -fstack-check is specified, thereby saving one branch instruction.  It 
was initially implemented as a WHILE loop to match the generic implementation 
but can be turned into a DO-WHILE loop because the amount of stack to be 
checked is known at compile time (since it's the static part of the frame).

Tested on SPARC/Solaris, to be applied on the mainline.


2015-11-12  Eric Botcazou  

* config/sparc/sparc.c (sparc_emit_probe_stack_range): Adjust.
(output_probe_stack_range): Rotate the loop and simplify.

-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 230204)
+++ config/sparc/sparc.c	(working copy)
@@ -5058,9 +5058,9 @@ sparc_emit_probe_stack_range (HOST_WIDE_
   emit_stack_probe (plus_constant (Pmode, g1, -size));
 }
 
-  /* The run-time loop is made up of 10 insns in the generic case while the
+  /* The run-time loop is made up of 9 insns in the generic case while the
  compile-time loop is made up of 4+2*(n-2) insns for n # of intervals.  */
-  else if (size <= 5 * PROBE_INTERVAL)
+  else if (size <= 4 * PROBE_INTERVAL)
 {
   HOST_WIDE_INT i;
 
@@ -5147,41 +5147,33 @@ const char *
 output_probe_stack_range (rtx reg1, rtx reg2)
 {
   static int labelno = 0;
-  char loop_lab[32], end_lab[32];
+  char loop_lab[32];
   rtx xops[2];
 
-  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
-  ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
+  /* Loop.  */
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
 
-   /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
   xops[0] = reg1;
+  xops[1] = GEN_INT (-PROBE_INTERVAL);
+  output_asm_insn ("add\t%0, %1, %0", xops);
+
+  /* Test if TEST_ADDR == LAST_ADDR.  */
   xops[1] = reg2;
   output_asm_insn ("cmp\t%0, %1", xops);
-  if (TARGET_ARCH64)
-fputs ("\tbe,pn\t%xcc,", asm_out_file);
-  else
-fputs ("\tbe\t", asm_out_file);
-  assemble_name_raw (asm_out_file, end_lab);
-  fputc ('\n', asm_out_file);
-
-  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
-  xops[1] = GEN_INT (-PROBE_INTERVAL);
-  output_asm_insn (" add\t%0, %1, %0", xops);
 
   /* Probe at TEST_ADDR and branch.  */
   if (TARGET_ARCH64)
-fputs ("\tba,pt\t%xcc,", asm_out_file);
+fputs ("\tbne,pt\t%xcc,", asm_out_file);
   else
-fputs ("\tba\t", asm_out_file);
+fputs ("\tbne\t", asm_out_file);
   assemble_name_raw (asm_out_file, loop_lab);
   fputc ('\n', asm_out_file);
   xops[1] = GEN_INT (SPARC_STACK_BIAS);
   output_asm_insn (" st\t%%g0, [%0+%1]", xops);
 
-  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, end_lab);
-
   return "";
 }
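
The threshold change in the first hunk above (5 intervals down to 4) appears to follow from the instruction counts given in the comment.  As a hedged reading, not something the patch states: the unrolled sequence is preferred as long as its 4+2*(n-2) insns do not exceed the size of the run-time loop.  The helper below is purely illustrative and not part of GCC.

```c
/* Largest interval count n (n >= 2) for which an unrolled probe sequence
   of BASE + 2 * (n - 2) insns is no longer than a run-time loop of
   LOOP_INSNS insns.  Hypothetical helper; both inputs come from the
   comments in the patch.  */
static int
max_unrolled_intervals (int base, int loop_insns)
{
  int n = 2;
  /* Keep growing n while the sequence for n + 1 intervals still fits.  */
  while (base + 2 * (n + 1 - 2) <= loop_insns)
    n++;
  return n;
}
```

With a 10-insn loop and a base of 4 this gives 5 intervals (10 insns unrolled); with the rotated 9-insn loop it gives 4 intervals (8 insns unrolled), which matches the old and new conditions on size.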

[ia64] Rotate stack checking loop

2015-11-12 Thread Eric Botcazou

Hi,

this patch rotates the loop generated in the prologue to do stack checking 
when -fstack-check is specified, thereby saving one branch instruction.  It 
was initially implemented as a WHILE loop to match the generic implementation 
but can be turned into a DO-WHILE loop because the amount of stack to be 
checked is known at compile time (since it's the static part of the frame).

The patch also fixes an error in the instruction count for the loop.

Tested on IA-64/Linux, OK for the mainline?


2015-11-12  Eric Botcazou  

* config/ia64/ia64.c (ia64_emit_probe_stack_range): Adjust.
(output_probe_stack_range): Rotate the loop and simplify.

-- 
Eric Botcazou

Index: config/ia64/ia64.c
===
--- config/ia64/ia64.c	(revision 230204)
+++ config/ia64/ia64.c	(working copy)
@@ -3293,7 +3293,7 @@ ia64_emit_probe_stack_range (HOST_WIDE_I
   else if (size <= PROBE_INTERVAL)
 emit_stack_probe (r2);
 
-  /* The run-time loop is made up of 8 insns in the generic case while this
+  /* The run-time loop is made up of 9 insns in the generic case while this
  compile-time loop is made up of 5+2*(n-2) insns for n # of intervals.  */
   else if (size <= 4 * PROBE_INTERVAL)
 {
@@ -3356,11 +3356,12 @@ ia64_emit_probe_stack_range (HOST_WIDE_I
 
   /* Step 3: the loop
 
-	 while (TEST_ADDR != LAST_ADDR)
+	 do
 	   {
 	 TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
 	 probe at TEST_ADDR
 	   }
+	 while (TEST_ADDR != LAST_ADDR)
 
 	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
 	 until it is equal to ROUNDED_SIZE.  */
@@ -3391,36 +3392,33 @@ const char *
 output_probe_stack_range (rtx reg1, rtx reg2)
 {
   static int labelno = 0;
-  char loop_lab[32], end_lab[32];
+  char loop_lab[32];
   rtx xops[3];
 
-  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
-  ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
+  /* Loop.  */
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
 
-  /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
-  xops[0] = reg1;
-  xops[1] = reg2;
-  xops[2] = gen_rtx_REG (BImode, PR_REG (6));
-  output_asm_insn ("cmp.eq %2, %I2 = %0, %1", xops);
-  fprintf (asm_out_file, "\t(%s) br.cond.dpnt ", reg_names [REGNO (xops[2])]);
-  assemble_name_raw (asm_out_file, end_lab);
-  fputc ('\n', asm_out_file);
-
   /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+  xops[0] = reg1;
   xops[1] = GEN_INT (-PROBE_INTERVAL);
   output_asm_insn ("addl %0 = %1, %0", xops);
   fputs ("\t;;\n", asm_out_file);
 
-  /* Probe at TEST_ADDR and branch.  */
+  /* Probe at TEST_ADDR.  */
   output_asm_insn ("probe.w.fault %0, 0", xops);
-  fprintf (asm_out_file, "\tbr ");
+
+  /* Test if TEST_ADDR == LAST_ADDR.  */
+  xops[1] = reg2;
+  xops[2] = gen_rtx_REG (BImode, PR_REG (6));
+  output_asm_insn ("cmp.eq %2, %I2 = %0, %1", xops);
+
+  /* Branch.  */
+  fprintf (asm_out_file, "\t(%s) br.cond.dpnt ", reg_names [PR_REG (7)]);
   assemble_name_raw (asm_out_file, loop_lab);
   fputc ('\n', asm_out_file);
 
-  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, end_lab);
-
   return "";
 }

Re: [i386] Rotate stack checking loop

2015-11-12 Thread Uros Bizjak

Hello!

> this patch rotates the loop generated in the prologue to do stack checking
> when -fstack-check is specified, thereby saving one branch instruction.  It
> was initially implemented as a WHILE loop to match the generic implementation
> but can be turned into a DO-WHILE loop because the amount of stack to be
> checked is known at compile time (since it's the static part of the frame).
>
> The patch also changes a mov+sub pair into an lea in the common case on Linux,
> saving one more instruction in the process.
>
> Tested on x86/Linux & x86-64/Linux (ix86_adjust_stack_and_probe path) and
> x86/Solaris (ix86_emit_probe_stack_range path).  OK for the mainline?
>
>
> 2015-11-12  Eric Botcazou  
>
> * config/i386/i386.c (ix86_adjust_stack_and_probe): Adjust and use
> an lea instruction when possible.
> (output_adjust_stack_and_probe): Rotate the loop and simplify.
> (ix86_emit_probe_stack_range): Adjust.
> (output_probe_stack_range): Rotate the loop and simplify.

OK.

Thanks,
Uros.

Re: [PATCH 02/02] C FE: add fix-it hint for . vs ->

2015-11-12 Thread David Malcolm

On Tue, 2015-11-10 at 17:55 +, Joseph Myers wrote:
> On Tue, 10 Nov 2015, David Malcolm wrote:
> 
> > This is the most trivial example of a real fix-it example I could think
> > of: if the user writes
> > ptr.field
> > rather than ptr->field.
> > 
> > gcc/c/ChangeLog:
> > * c-typeck.c (build_component_ref): Special-case POINTER_TYPE when
> > > generating a "not a structure or union"  error message, and
> > suggest a "->" rather than a ".", providing a fix-it hint.
> 
> I wonder if this should be restricted to the case where the pointer's 
> target is of structure or union type.  

Probably.  Attached is an updated version of the patch that does so.

> At least, if it's some other type, 
> more of a fix is needed than just using -> (e.g. converting from void * to 
> a pointer to the relevant type).

If so, then the attached patch simply does the status quo (I don't think
we want to try too hard for this case).  I've added a test case for
this.
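
The decision being described can be modelled outside the compiler.  The sketch below is a stand-alone approximation and not the patch itself: the enum, the struct and the function name are invented for illustration and stand in for GCC's tree types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for GCC's type representation.  */
enum kind { KIND_INT, KIND_VOID, KIND_STRUCT, KIND_UNION, KIND_POINTER };

struct type
{
  enum kind kind;
  const struct type *target;	/* Pointed-to type; NULL unless a pointer.  */
};

/* Return true if "DATUM.COMPONENT" on a value of type T should get a
   hint suggesting "->": only for pointers whose target is a struct or
   union, and never when compiling Objective-C.  */
static bool
should_suggest_arrow (const struct type *t, bool objc)
{
  if (objc)
    return false;
  if (t->kind != KIND_POINTER)
    return false;
  return t->target->kind == KIND_STRUCT || t->target->kind == KIND_UNION;
}
```

In this model a "struct foo *" gets the hint, while a "void **" falls through to the existing error, since its target is another pointer rather than a struct or union.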

From 12aa183d693db59cbc5d8a268749d577e729425c Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Tue, 8 Sep 2015 12:56:00 -0400
Subject: [PATCH] C FE: add fix-it hint for . vs ->

This is the most trivial example of a real fix-it example I could think
of: if the user writes
	ptr.field
rather than ptr->field.

gcc/c/ChangeLog:
	* c-typeck.c (should_suggest_deref_p): New function.
	(build_component_ref): Special-case POINTER_TYPE when
	generating a "not a structure or union"  error message, and
	suggest a "->" rather than a ".", providing a fix-it hint.

gcc/testsuite/ChangeLog:
	* gcc.dg/fixits.c: New file.
---
 gcc/c/c-typeck.c  | 39 +++
 gcc/testsuite/gcc.dg/fixits.c | 41 +
 2 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/fixits.c

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index c2e16c6..bdb2d12 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -2249,6 +2249,33 @@ lookup_field (tree type, tree component)
   return tree_cons (NULL_TREE, field, NULL_TREE);
 }
 
+/* Support function for build_component_ref's error-handling.
+
+   Given DATUM_TYPE, and "DATUM.COMPONENT", where DATUM is *not* a
+   struct or union, should we suggest "DATUM->COMPONENT" as a hint?  */
+
+static bool
+should_suggest_deref_p (tree datum_type)
+{
+  /* We don't do it for Objective-C, since Objective-C 2.0 dot-syntax
+ allows "." for ptrs; we could be handling a failed attempt
+ to access a property.  */
+  if (c_dialect_objc ())
+return false;
+
+  /* Only suggest it for pointers...  */
+  if (TREE_CODE (datum_type) != POINTER_TYPE)
+return false;
+
+  /* ...to structs/unions.  */
+  tree underlying_type = TREE_TYPE (datum_type);
+  enum tree_code code = TREE_CODE (underlying_type);
+  if (code == RECORD_TYPE || code == UNION_TYPE)
+return true;
+  else
+return false;
+}
+
 /* Make an expression to refer to the COMPONENT field of structure or
union value DATUM.  COMPONENT is an IDENTIFIER_NODE.  LOC is the
location of the COMPONENT_REF.  */
@@ -2336,6 +2363,18 @@ build_component_ref (location_t loc, tree datum, tree component)
 
   return ref;
 }
+  else if (should_suggest_deref_p (type))
+{
+  /* Special-case the error message for "ptr.field" for the case
+	 where the user has confused "." vs "->".  */
+  rich_location richloc (line_table, loc);
+  /* "loc" should be the "." token.  */
+  richloc.add_fixit_replace (source_range::from_location (loc), "->");
+  error_at_rich_loc (&richloc,
+			 "%qE is a pointer; did you mean to use %<->%>?",
+			 datum);
+  return error_mark_node;
+}
   else if (code != ERROR_MARK)
 error_at (loc,
 	  "request for member %qE in something not a structure or union",
diff --git a/gcc/testsuite/gcc.dg/fixits.c b/gcc/testsuite/gcc.dg/fixits.c
new file mode 100644
index 000..06c9995
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fixits.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+struct foo { int x; };
+union u { int x; };
+
+/* Verify that we issue a hint for "." used with a ptr to a struct.  */
+
+int test_1 (struct foo *ptr)
+{
+  return ptr.x; /* { dg-error "'ptr' is a pointer; did you mean to use '->'?" } */
+/* { dg-begin-multiline-output "" }
+   return ptr.x;
+ ^
+ ->
+   { dg-end-multiline-output "" } */
+}
+
+/* Likewise for a ptr to a union.  */
+
+int test_2 (union u *ptr)
+{
+  return ptr.x; /* { dg-error "'ptr' is a pointer; did you mean to use '->'?" } */
+/* { dg-begin-multiline-output "" }
+   return ptr.x;
+ ^
+ ->
+   { dg-end-multiline-output "" } */
+}
+
+/* Verify that we don't issue a hint for a ptr to something that isn't a
+   struct or union.  */
+
+int test_3 (void **ptr)
+{
+  return ptr.x; /* { dg-error "request for member 'x' in something not a structure or union" } */
+/* { dg-begin-multiline-output ""

Re: [PATCH], Add power9 support to GCC, patch #7 (direct move enhancements)

2015-11-12 Thread David Edelsohn

On Sun, Nov 8, 2015 at 7:48 PM, Michael Meissner
 wrote:
> This patch adds support for the new direct move instructions (MFVSRLD and
> MTVSRDD) that simplify moving 128-bit data between GPRs and vector registers.
>
> I have built previous versions of this patch with no regressions.  At the
> moment, I have built a non-bootstrap build and ran the PowerPC tests, with no
> regressions.  Assuming the bootstrap build that I've started has no
> regressions, is it ok to install in the trunk?
>
> [gcc]
> 2015-11-08  Michael Meissner  
>
> * config/rs6000/constraints.md (we constraint): New constraint for
> 64-bit power9 vector support.
> (wL constraint): New constraint for the element in a vector that
> can be addressed by the MFVSRLD instruction.
>
> * config/rs6000/rs6000.c (rs6000_debug_reg_global): Add ISA 3.0
> debugging.
> (rs6000_init_hard_regno_mode_ok): If ISA 3.0 and 64-bit, enable we
> constraint.  Disable the VSX<->GPR direct move helpers if we have
> the MFVSRLD and MTVSRDD instructions.
> (rs6000_secondary_reload_simple_move): Add support for doing
> vector direct moves directly without additional scratch registers
> if we have ISA 3.0 instructions.
> (rs6000_secondary_reload_direct_move): Update comments.
> (rs6000_output_move_128bit): Add support for ISA 3.0 vector
> instructions.
>
> * config/rs6000/vsx.md (vsx_mov): Add support for ISA 3.0
> direct move instructions.
> (vsx_movti_64bit): Likewise.
> (vsx_extract_): Likewise.
>
> * config/rs6000/rs6000.h (VECTOR_ELEMENT_MFVSRLD_64BIT): New
> macros for ISA 3.0 direct move instructions.
> (TARGET_DIRECT_MOVE_128): Likewise.
>
> * config/rs6000/rs6000.md (128-bit GPR splitters): Don't split a
> 128-bit move that is a direct move between GPR and vector
> registers using ISA 3.0 direct move instructions.
>
> * doc/md.texi (RS/6000 constraints): Document we, wF, wG, wL
> constraints.  Update wa documentation to say not to use %x on
> instructions that only take Altivec registers.
>
> [gcc/testsuite]
> 2015-11-08  Michael Meissner  
>
> * gcc.target/powerpc/direct-move-vector.c: New test for 128-bit
> vector direct move instructions.

This is okay.

Thanks, David

[C PATCH] Fix parsing when using declarations in for loops and typedefs (PR c/67784)

2015-11-12 Thread Marek Polacek

As explained in the PR, the issue here was that we were treating a TYPENAME
wrongly as an ID.  That happened because we were using information from the
wrong scope when parsing a token after an else clause.  I.e. in fn1 in the
attached testcase we need to examine the token after "if (1);" to see if it's
the "else" keyword, but when it's not, we use the scope of the for loop when
classifying the token, so we wrongly see "T" as an identifier of a variable.
Fixed by examining the token again and reclassifying it.

Moreover, we were ICEing in a similar scenario, treating ID as a TYPENAME, as
demonstrated in pr67784-2.c.  The fix is analogous.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-11-12  Marek Polacek  

PR c/67784
* c-parser.c (c_parser_for_statement): Reclassify the token in
a correct scope.

* gcc.dg/pr67784-1.c: New test.
* gcc.dg/pr67784-2.c: New test.

diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 2484b92..8949825 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -5749,6 +5749,21 @@ c_parser_for_statement (c_parser *parser, bool ivdep)
 c_finish_loop (loc, cond, incr, body, c_break_label, c_cont_label, true);
   add_stmt (c_end_compound_stmt (loc, block, flag_isoc99 || c_dialect_objc ()));
 
+  /* We might need to reclassify any previously-lexed identifier, e.g.
+ when we've left a for loop with an if-statement without else in the
+ body - we might have used a wrong scope for the token.  See PR67784.  */
+  if (c_parser_next_token_is (parser, CPP_NAME))
+{
+  c_token *token = c_parser_peek_token (parser);
+  tree decl = lookup_name (token->value);
+  if (decl == NULL_TREE)
+   ;
+  else if (TREE_CODE (decl) == TYPE_DECL)
+   token->id_kind = C_ID_TYPENAME;
+  else if (VAR_P (decl))
+   token->id_kind = C_ID_ID;
+}
+
   token_indent_info next_tinfo
 = get_token_indent_info (c_parser_peek_token (parser));
   warn_for_misleading_indentation (for_tinfo, body_tinfo, next_tinfo);
diff --git gcc/testsuite/gcc.dg/pr67784-1.c gcc/testsuite/gcc.dg/pr67784-1.c
index e69de29..d5e85fc 100644
--- gcc/testsuite/gcc.dg/pr67784-1.c
+++ gcc/testsuite/gcc.dg/pr67784-1.c
@@ -0,0 +1,54 @@
+/* PR c/67784 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+typedef int T;
+
+void
+fn1 (void)
+{
+  for (int T;;)
+if (1)
+  ;
+  T *x;
+}
+
+void
+fn2 (void)
+{
+  for (int T;;)
+if (1)
+  T = 1;
+  T *x;
+}
+
+void
+fn3 (void)
+{
+  for (int T;;)
+if (1)
+  {
+  }
+  T *x;
+}
+
+void
+fn4 (void)
+{
+  for (int T;;)
+if (1)
+L:
+  ;
+  T *x;
+}
+
+void
+fn5 (void)
+{
+  for (int T;;)
+if (1)
+  ;
+else
+  ;
+  T *x;
+}
diff --git gcc/testsuite/gcc.dg/pr67784-2.c gcc/testsuite/gcc.dg/pr67784-2.c
index e69de29..de3b1c8 100644
--- gcc/testsuite/gcc.dg/pr67784-2.c
+++ gcc/testsuite/gcc.dg/pr67784-2.c
@@ -0,0 +1,54 @@
+/* PR c/67784 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+int T;
+
+void
+fn1 (void)
+{
+  for (typedef int T;;) /* { dg-error "declaration of non-variable" } */
+if (1)
+  ;
+  T *x; /* { dg-error "undeclared" } */
+}
+
+void
+fn2 (void)
+{
+  for (typedef int T;;) /* { dg-error "declaration of non-variable" } */
+if (1)
+  T = 1; /* { dg-error "expected expression" } */
+  T *x; /* { dg-error "undeclared" } */
+}
+
+void
+fn3 (void)
+{
+  for (typedef int T;;) /* { dg-error "declaration of non-variable" } */
+if (1)
+  {
+  }
+  T *x; /* { dg-error "undeclared" } */
+}
+
+void
+fn4 (void)
+{
+  for (typedef int T;;) /* { dg-error "declaration of non-variable" } */
+if (1)
+L:
+  ;
+  T *x; /* { dg-error "undeclared" } */
+}
+
+void
+fn5 (void)
+{
+  for (typedef int T;;) /* { dg-error "declaration of non-variable" } */
+if (1)
+  ;
+else
+  ;
+  T *x; /* { dg-error "undeclared" } */
+}

Marek

Re: open acc default data attribute

2015-11-12 Thread David Edelsohn

Nathan,

The ChangeLog was placed in the wrong files.

gcc/
* gimplify.c (oacc_default_clause): New.
(omp_notice_variable): Call it.

Should go in gcc/ChangeLog without "gcc/"

gcc/testsuite/
* c-c++-common/goacc/data-default-1.c: New.

should go in gcc/testsuite/ChangeLog

libgomp/
* testsuite/libgomp.oacc-c-c++-common/default-1.c: New.

should go in libgomp/ChangeLog

Thanks, David

Re: Gimple loop splitting

2015-11-12 Thread Jeff Law


On 11/12/2015 09:52 AM, Michael Matz wrote:
> Hello,
>
> this new pass implements loop iteration space splitting for loops that
> contain a conditional that's always true for one part of the iteration
> space and false for the other, i.e. such situations:

FWIW, Ajit suggested the same transformation earlier this year.  During
that discussion Richi indicated that for hmmer this transformation would
enable vectorization.
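
For readers following the thread, here is a minimal hand-written sketch of the
transformation under discussion.  It is not taken from the patch: the function
names fill_original and fill_split are made up, and the split point is computed
by hand, whereas the pass would derive it from the loop's induction variable.

```c
#include <stddef.h>

/* Original form: the condition i < split is true for the first part
   of the iteration space and false for the rest.  */
void
fill_original (int *a, size_t n, size_t split)
{
  for (size_t i = 0; i < n; i++)
    {
      if (i < split)
        a[i] = 1;
      else
        a[i] = 2;
    }
}

/* Hand-split form: two loops, each free of the conditional.  The
   first covers [0, mid), the second covers [mid, n).  */
void
fill_split (int *a, size_t n, size_t split)
{
  size_t mid = split < n ? split : n;
  size_t i;

  for (i = 0; i < mid; i++)
    a[i] = 1;
  for (; i < n; i++)
    a[i] = 2;
}
```

Both functions store the same values; the second form has branch-free loop
bodies, which is the property that could help a vectorizer.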




> This transformation is in itself a good one but can also be an enabler for
> the vectorizer.

Agreed.

> It does increase code size, when the loop body contains
> also unconditional code (that one is duplicated), so we only transform hot
> loops.

Probably ought to be disabled when we're not optimizing for speed as well.




> I'm a bit unsure of the placement of the new pass, or if it should
> be its own pass at all.  Right now I've placed it after unswitching and
> scev_cprop, before loop distribution.  Ideally I think all three, together
> with loop fusion and a gimple unroller should be integrated into one loop
> nest optimizer, alas, we aren't there yet.

Given its impact on the looping structure, I'd think early in the loop
optimizer.  Given the similarities with unswitching, I think
before/after unswitching is a natural first cut.  We can always iterate
if it looks like putting it elsewhere would make sense.





> I've regstrapped this pass enabled with -O2 on x86-64-linux, without
> regressions.  I've also checked cpu2006 (the non-fortran part) for
> correctness, not yet for performance.  In the end it should probably only
> be enabled for -O3+ (although if the whole loop body is conditional it
> makes sense to also have it with -O2 because code growth is very small
> then).

Very curious on the performance side, so if you could get some #s on
that, it'd be greatly appreciated.

I'd be comfortable with this at -O2, but won't object if you'd prefer -O3.




> So, okay for trunk?
>
>
> Ciao,
> Michael.
> * passes.def (pass_loop_split): Add.
> * timevar.def (TV_LOOP_SPLIT): Add.
> * tree-pass.h (make_pass_loop_split): Declare.
> * tree-ssa-loop-manip.h (rewrite_into_loop_closed_ssa_1): Declare.
> * tree-ssa-loop-unswitch.c: Include tree-ssa-loop-manip.h,
> cfganal.h, tree-chrec.h, tree-affine.h, tree-scalar-evolution.h,
> gimple-pretty-print.h, gimple-fold.h, gimplify-me.h.
> (split_at_bb_p, patch_loop_exit, find_or_create_guard_phi,
> split_loop, tree_ssa_split_loops,
> make_pass_loop_split): New functions.
> (pass_data_loop_split): New.
> (pass_loop_split): New.
>
> testsuite/
> * gcc.dg/loop-split.c: New test.

Please clean up the #if 0/#if 1 code in the new tests.  You might also
want to clean out the TRACE stuff.  Essentially the tests look like you
just dropped in a test you'd been running by hand until now :-)

I don't see any negative tests, i.e. tests that should not be split due
to boundary conditions.  Do you have any from development?  If so it'd
be good to have those too.




> Index: tree-ssa-loop-manip.h
> ===
> --- tree-ssa-loop-manip.h   (revision 229763)
> +++ tree-ssa-loop-manip.h   (working copy)
> @@ -24,6 +24,8 @@ typedef void (*transform_callback)(struc
>
>   extern void create_iv (tree, tree, tree, struct loop *, gimple_stmt_iterator *,
>    bool, tree *, tree *);
> +extern void rewrite_into_loop_closed_ssa_1 (bitmap, unsigned, int,
> +   struct loop *);
>   extern void rewrite_into_loop_closed_ssa (bitmap, unsigned);
>   extern void rewrite_virtuals_into_loop_closed_ssa (struct loop *);
>   extern void verify_loop_closed_ssa (bool);
> Index: tree-ssa-loop-unswitch.c
> ===
> --- tree-ssa-loop-unswitch.c(revision 229763)
> +++ tree-ssa-loop-unswitch.c(working copy)

Given the amount of new code, unless there's a strong need, I'd prefer
this transformation to be implemented in its own file.





> +
> +/* Give an induction variable GUARD_IV, and its affine descriptor IV,
> +   find the loop phi node in LOOP defining it directly, or create
> +   such phi node.  Return that phi node.  */
> +
> +static gphi *
> +find_or_create_guard_phi (struct loop *loop, tree guard_iv, affine_iv * /*iv*/)
> +{
> +  gimple *def = SSA_NAME_DEF_STMT (guard_iv);
> +  gphi *phi;
> +  if ((phi = dyn_cast <gphi *> (def))
> +      && gimple_bb (phi) == loop->header)
> +    return phi;
> +
> +  /* XXX Create the PHI instead.  */
> +  return NULL;

So right now we just punt if we need to create the PHI?  Does that
happen with any kind of regularity in practice?

> +}
> +
> +/* Checks if LOOP contains an conditional block whose condition
> +   depends on which side in the iteration space it is, and if so
> +   splits the iteration space into two loops.  Returns true if the
> +   loop was split.  NITER must contain the iteration descriptor for the
> +   single exit of LOOP.  */
> +
> +static bool
> +split_loop (struct loop *loop,

Re: [C PATCH] Fix parsing when using declarations in for loops and typedefs (PR c/67784)

2015-11-12 Thread Joseph Myers

On Thu, 12 Nov 2015, Marek Polacek wrote:

> As explained in the PR, the issue here was that we were treating a TYPENAME
> wrongly as an ID.  That happened because we were using information from the
> wrong scope when parsing a token after an else clause.  I.e. in fn1 in the
> attached testcase we need to examine the token after "if (1);" to see if it's
> the "else" keyword, but when it's not, we use the scope of the for loop when
> classifying the token, so we wrongly see "T" as an identifier of a variable.
> Fixed by examining the token again and reclassifying it.
> 
> Moreover, we were ICEing in a similar scenario, treating ID as a TYPENAME, as
> demonstrated in pr67784-2.c.  The fix is analogous.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.
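
For reference, a minimal sketch of the scoping rule the fix relies on.  It is
modelled on fn1 from pr67784-1.c but is not part of the patch; the function
name size_after_loop is made up for illustration.

```c
typedef int T;

unsigned long
size_after_loop (void)
{
  /* Inside the for statement, T names an int variable that shadows
     the typedef.  */
  for (int T = 0; T < 1; T++)
    if (1)
      ;
  /* The loop scope has ended, so T is the typedef again and this
     line is a declaration, not a multiplication.  */
  T *x = 0;
  return sizeof (*x);
}
```

A compiler that classifies the token after "if (1);" using the for-loop scope
would treat "T *x" as an expression and reject the declaration.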

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2015-11-12 Thread Jeff Law


On 11/12/2015 11:32 AM, Jeff Law wrote:
> On 11/12/2015 10:05 AM, Jeff Law wrote:
>>> But IIRC you mentioned it should enable vectorization or so?  In this
>>> case
>>> that's obviously too late.
>> The opposite.  Path splitting interferes with if-conversion &
>> vectorization.  Path splitting mucks up the CFG enough that
>> if-conversion won't fire and as a result vectorization is inhibited.  It
>> also creates multi-latch loops, which isn't a great situation either.
>>
>> It *may* be the case that dropping it that far down in the pipeline and
>> making the modifications necessary to handle simple latches may in turn
>> make the path splitting code play better with if-conversion and
>> vectorization and avoid creation of multi-latch loops.  At least that's
>> how it looks on paper when I draw out the CFG manipulations.
>>
>> I'll do some experiments.
> It doesn't look too terrible to revamp the recognition code to work
> later in the pipeline with simple latches.  Sadly that doesn't seem to
> have fixed the bad interactions with if-conversion.
>
> *But* that does open up the possibility of moving the path splitting
> pass even deeper in the pipeline -- in particular we can move it past
> the vectorizer.  Which may be a win.
>
> So the big question is whether or not we'll still see enough benefits
> from having it so late in the pipeline.  It's still early enough that we
> get DOM, VRP, reassoc, forwprop, phiopt, etc.
>
> Ajit, I'll pass along an updated patch after doing some more testing.

So here's what I'm working with.  It runs after the vectorizer now.

Ajit, if you could benchmark this it would be greatly appreciated.  I 
know you saw significant improvements on one or more benchmarks in the 
past.  It'd be good to know that the updated placement of the pass 
doesn't invalidate the gains you saw.


With the updated pass placement, we don't have to worry about switching 
the pass on/off based on whether or not the vectorizer & if-conversion 
are enabled.  So that hackery is gone.


I think I've beefed up the test to identify the diamond patterns we want 
so that it's stricter in what we accept.  The call to ignore_bb_p is a 
part of that test so that we're actually looking at the right block in a 
world where we're doing this transformation with simple latches.


I've also put a graphical comment before perform_path_splitting which 
hopefully shows the CFG transformation we're making a bit clearer.


This bootstraps and regression tests cleanly on x86_64-linux-gnu.


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 34d2356..6613e83 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1474,6 +1474,7 @@ OBJS = \
tree-ssa-loop.o \
tree-ssa-math-opts.o \
tree-ssa-operands.o \
+   tree-ssa-path-split.o \
tree-ssa-phionlycprop.o \
tree-ssa-phiopt.o \
tree-ssa-phiprop.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 757ce85..3e946ca 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2403,6 +2403,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-path-split
+Common Report Var(flag_tree_path_split) Init(0) Optimization
+Perform Path Splitting on trees for loop backedges.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 213a9d0..b1e95da 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -354,6 +354,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-tree-fre@r{[}-@var{n}@r{]} @gol
 -fdump-tree-vtable-verify @gol
 -fdump-tree-vrp@r{[}-@var{n}@r{]} @gol
+-fdump-tree-path-split@r{[}-@var{n}@r{]} @gol
 -fdump-tree-storeccp@r{[}-@var{n}@r{]} @gol
 -fdump-final-insns=@var{file} @gol
 -fcompare-debug@r{[}=@var{opts}@r{]}  -fcompare-debug-second @gol
@@ -462,7 +463,7 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
 -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
 -ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol
--ftree-vectorize -ftree-vrp @gol
+-ftree-vectorize -ftree-vrp @gol -ftree-path-split @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
 -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
@@ -7169,6 +7170,11 @@ output on to @file{stderr}. If two conflicting dump filenames are
 given for the same pass, then the latter option overrides the earlier
 one.
 
+@item path-split
+@opindex fdump-tree-path-split
+Dump each function after path splitting.  The file name is made by
+appending @file{.path-split} to the source file name.
+
 @item all
 Turn on all options, except @option{raw}, @option{slim}, @option{verbose}
 and @option{lineno}.
@@ -7811,6 +7817,7 @@ also turns on the following optimization flags:
 -ftree-switch-conversion -ftree-tail-merge @gol
 -ftree-pre @gol
 -ftree-vrp @gol
+-ftree-path-split

Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2015-11-12 Thread Richard Biener

On November 12, 2015 7:32:57 PM GMT+01:00, Jeff Law  wrote:
>On 11/12/2015 10:05 AM, Jeff Law wrote:
>>> But IIRC you mentioned it should enable vectorization or so?  In
>this
>>> case
>>> that's obviously too late.
>> The opposite.  Path splitting interferes with if-conversion &
>> vectorization.  Path splitting mucks up the CFG enough that
>> if-conversion won't fire and as a result vectorization is inhibited. 
>It
>> also creates multi-latch loops, which isn't a great situation either.
>>
>> It *may* be the case that dropping it that far down in the pipeline
>and
>> making the modifications necessary to handle simple latches may in
>turn
>> make the path splitting code play better with if-conversion and
>> vectorization and avoid creation of multi-latch loops.  At least
>that's
>> how it looks on paper when I draw out the CFG manipulations.
>>
>> I'll do some experiments.
>It doesn't look too terrible to revamp the recognition code to work 
>later in the pipeline with simple latches.  Sadly that doesn't seem to 
>have fixed the bad interactions with if-conversion.
>
>*But* that does open up the possibility of moving the path splitting 
>pass even deeper in the pipeline -- in particular we can move it past 
>the vectorizer.  Which may be a win.
>
>So the big question is whether or not we'll still see enough benefits 
>from having it so late in the pipeline.  It's still early enough that
>we 
>get DOM, VRP, reassoc, forwprop, phiopt, etc.
>
>Ajit, I'll pass along an updated patch after doing some more testing.

BTW, if you do not use loops_normal for loop init you don't get simple latches
forced (and cfg-cleanup will remove them)

Richard.

>Jeff

C++ PATCH to checking of explicit instantiation namespace

2015-11-12 Thread Jason Merrill

In the testcase, the using-declaration was confusing namespace 
comparison into thinking that we were instantiating a template from an 
enclosing namespace.  Given using-declarations, we need to wait until 
we've chosen a template before doing this comparison.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 43aa655785ca8da40d07e1320f532950fc9a83b5
Author: Jason Merrill 
Date:   Thu Nov 12 11:22:04 2015 -0500

	* pt.c (check_explicit_specialization): Check the namespace after
	we choose a template.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 62659ec..2e3d48b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -2800,14 +2800,6 @@ check_explicit_specialization (tree declarator,
 		  error ("%qD is not a template function", dname);
 		  fns = error_mark_node;
 		}
-	  else
-		{
-		  tree fn = OVL_CURRENT (fns);
-		  if (!is_associated_namespace (CP_DECL_CONTEXT (decl),
-		CP_DECL_CONTEXT (fn)))
-		error ("%qD is not declared in %qD",
-			   decl, current_namespace);
-		}
 	}
 
 	  declarator = lookup_template_function (fns, NULL_TREE);
@@ -2941,6 +2933,14 @@ check_explicit_specialization (tree declarator,
 	return error_mark_node;
   else
 	{
+	  if (!ctype && !was_template_id
+	  && (specialization || member_specialization
+		  || explicit_instantiation)
+	  && !is_associated_namespace (CP_DECL_CONTEXT (decl),
+	   CP_DECL_CONTEXT (tmpl)))
+	error ("%qD is not declared in %qD",
+		   tmpl, current_namespace);
+
 	  tree gen_tmpl = most_general_template (tmpl);
 
 	  if (explicit_instantiation)
diff --git a/gcc/testsuite/g++.dg/template/explicit-instantiation4.C b/gcc/testsuite/g++.dg/template/explicit-instantiation4.C
new file mode 100644
index 000..72417b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/explicit-instantiation4.C
@@ -0,0 +1,7 @@
+void f();
+
+namespace A {
+  template <class T> void f(T) { }
+  using ::f;
+  template void f(int);
+}

Re: [PATCH], Add power9 support to GCC, patch #6 (IEEE 128-bit hardware support)

2015-11-12 Thread David Edelsohn

On Sun, Nov 8, 2015 at 7:44 PM, Michael Meissner
 wrote:
> This patch adds support for the IEEE 128-bit hardware instructions that are
> being added to the PowerPC ISA 3.0 (power9).  With this patch, users on power7
> and power8 will use the software emulation functions that are committed, but
> still need some enhancement.  On ISA 3.0/power9, they would be able to use the
> direct instructions.
>
> I have built this patch with a bootstrap build on a power8 little endian
> system.  There were no regressions in the test suite.  Is this patch ok to
> install in the trunk?
>
> [gcc]
> 2015-11-08  Michael Meissner  
>
> * config/rs6000/rs6000-protos.h (convert_float128_to_int): Add
> declaration.
> (convert_int_to_float128): Likewise.
> (rs6000_generate_compare): Add support for ISA 3.0 (power9)
> hardware support for IEEE 128-bit floating point.
> (rs6000_expand_float128_convert): Likewise.
> (convert_float128_to_int): Likewise.
> (convert_int_to_float128): Likewise.
>
> * config/rs6000/rs6000.md (UNSPEC_ROUND_TO_ODD): New unspecs for
> ISA 3.0 hardware IEEE 128-bit floating point.
> (UNSPEC_IEEE128_MOVE): Likewise.
> (UNSPEC_IEEE128_CONVERT): Likewise.
> (FMA_F): Add support for IEEE 128-bit floating point hardware
> support.
> (Ff): Add support for DImode.
> (Fv): Likewise.
> (any_fix code iterator): New and updated iterators for IEEE
> 128-bit floating point hardware support.
> (any_float code iterator): Likewise.
> (s code attribute): Likewise.
> (su code attribute): Likewise.
> (az code attribute): Likewise.
> (neg2, FLOAT128 iterator): Add support for IEEE 128-bit
> floating point hardware support.
> (abs2, FLOAT128 iterator): Likewise.
> (add3, IEEE128 iterator): New insns for IEEE 128-bit
> floating point hardware.
> (sub3, IEEE128 iterator): Likewise.
> (mul3, IEEE128 iterator): Likewise.
> (div3, IEEE128 iterator): Likewise.
> (copysign3, IEEE128 iterator): Likewise.
> (sqrt2, IEEE128 iterator): Likewise.
> (neg2, IEEE128 iterator): Likewise.
> (abs2, IEEE128 iterator): Likewise.
> (nabs2, IEEE128 iterator): Likewise.
> (fma4_hw, IEEE128 iterator): Likewise.
> (fms4_hw, IEEE128 iterator): Likewise.
> (nfma4_hw, IEEE128 iterator): Likewise.
> (nfms4_hw, IEEE128 iterator): Likewise.
> (extend2_hw): Likewise.
> (truncdf2_hw, IEEE128 iterator): Likewise.
> (truncsf2_hw, IEEE128 iterator): Likewise.
> (fix_fixuns code attribute): Likewise.
> (float_floatuns code attribute): Likewise.
> (_si2_hw): Likewise.
> (_di2_hw): Likewise.
> (_si2_hw): Likewise.
> (_di2_hw): Likewise.
> (xscvqpwz_): Likewise.
> (xscvqpdz_): Likewise.
> (xscvdqp_ (ieee128_mfvsrd): Likewise.
> (ieee128_mfvsrwz): Likewise.
> (ieee128_mtvsrw): Likewise.
> (ieee128_mtvsrd): Likewise.
> (truncdf2_odd): Likewise.
> (cmp_h): Likewise.
>
> [gcc/testsuite]
> 2015-11-08  Michael Meissner  
>
> * gcc.target/powerpc/float128-hw.c: New test for IEEE 128-bit
> hardware floating point support.

Please change the attribute to "uns" as suggested by Segher.

> +(define_code_attr fix_fixuns  [(fix   "fix")   (unsigned_fix
"fixuns")])
> +(define_code_attr float_floatuns [(float "float")
(unsigned_float "floatuns")])

You could instead do an "uns" attribute so you would write fix etc.

Okay with that change.

We need to think more about ieee128_mtvsw pattern.

Thanks, David

Re: open acc default data attribute

2015-11-12 Thread Nathan Sidwell


On 11/12/15 15:22, David Edelsohn wrote:

Nathan,

The ChangeLog was placed in the wrong files.

 gcc/
 * gimplify.c (oacc_default_clause): New.
 (omp_notice_variable): Call it.


Fixed.  I placed the entries in the other files, but failed to clean up the
above one.


nathan

[i386] Rotate stack checking loop

2015-11-12 Thread Eric Botcazou

Hi,

this patch rotates the loop generated in the prologue to do stack checking 
when -fstack-check is specified, thereby saving one branch instruction.  It 
was initially implemented as a WHILE loop to match the generic implementation 
but can be turned into a DO-WHILE loop because the amount of stack to be 
checked is known at compile time (since it's the static part of the frame).
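
To illustrate the rotation, here is a small C model of the two loop shapes.  It
is only a sketch and not the code the back end emits: the function names are
made up, the PROBE_INTERVAL value is an assumption, and the probe itself is
replaced by a counter.

```c
#define PROBE_INTERVAL 4096L  /* assumed value, for illustration only */

/* WHILE form: the exit test sits at the top of the loop, so the
   generated code needs a conditional branch at the top plus an
   unconditional jump back at the bottom.  */
long
probe_while (long sp, long last_addr)
{
  long probes = 0;

  while (sp != last_addr)
    {
      sp -= PROBE_INTERVAL;   /* move the stack pointer down */
      probes++;               /* stands in for the probe at SP */
    }
  return probes;
}

/* DO-WHILE form: a single conditional branch at the bottom.  It is
   only equivalent when sp != last_addr on entry, i.e. when at least
   one interval has to be probed.  */
long
probe_do_while (long sp, long last_addr)
{
  long probes = 0;

  do
    {
      sp -= PROBE_INTERVAL;
      probes++;
    }
  while (sp != last_addr);
  return probes;
}
```

When the distance between sp and last_addr is a non-zero multiple of
PROBE_INTERVAL, both functions perform the same number of probes.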

The patch also changes a mov+sub pair into an lea in the common case on Linux, 
saving one more instruction in the process.

Tested on x86/Linux & x86-64/Linux (ix86_adjust_stack_and_probe path) and 
x86/Solaris (ix86_emit_probe_stack_range path).  OK for the mainline?


2015-11-12  Eric Botcazou  

* config/i386/i386.c (ix86_adjust_stack_and_probe): Adjust and use
an lea instruction when possible.
(output_adjust_stack_and_probe): Rotate the loop and simplify.
(ix86_emit_probe_stack_range): Adjust.
(output_probe_stack_range): Rotate the loop and simplify.

-- 
Eric Botcazou
Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 230245)
+++ config/i386/i386.c	(working copy)
@@ -12137,10 +12137,10 @@ ix86_adjust_stack_and_probe (const HOST_
   rtx size_rtx = GEN_INT (size), last;
 
   /* See if we have a constant small number of probes to generate.  If so,
- that's the easy case.  The run-time loop is made up of 11 insns in the
+ that's the easy case.  The run-time loop is made up of 9 insns in the
  generic case while the compile-time loop is made up of 3+2*(n-1) insns
  for n # of intervals.  */
-  if (size <= 5 * PROBE_INTERVAL)
+  if (size <= 4 * PROBE_INTERVAL)
 {
   HOST_WIDE_INT i, adjust;
   bool first_probe = true;
@@ -12207,19 +12207,27 @@ ix86_adjust_stack_and_probe (const HOST_
 	 - (PROBE_INTERVAL + dope;
 
   /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
-  emit_move_insn (sr.reg, GEN_INT (-rounded_size));
-  emit_insn (gen_rtx_SET (sr.reg,
-			  gen_rtx_PLUS (Pmode, sr.reg,
-	stack_pointer_rtx)));
+  if (rounded_size <= (HOST_WIDE_INT_1 << 31))
+	emit_insn (gen_rtx_SET (sr.reg,
+plus_constant (Pmode, stack_pointer_rtx,
+	   -rounded_size)));
+  else
+	{
+	  emit_move_insn (sr.reg, GEN_INT (-rounded_size));
+	  emit_insn (gen_rtx_SET (sr.reg,
+  gen_rtx_PLUS (Pmode, sr.reg,
+		stack_pointer_rtx)));
+	}
 
 
   /* Step 3: the loop
 
-	 while (SP != LAST_ADDR)
+	 do
 	   {
 	 SP = SP + PROBE_INTERVAL
 	 probe at SP
 	   }
+	 while (SP != LAST_ADDR)
 
 	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
 	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
@@ -12275,23 +12283,16 @@ const char *
 output_adjust_stack_and_probe (rtx reg)
 {
   static int labelno = 0;
-  char loop_lab[32], end_lab[32];
+  char loop_lab[32];
   rtx xops[2];
 
-  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
-  ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
 
+  /* Loop.  */
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
 
-  /* Jump to END_LAB if SP == LAST_ADDR.  */
-  xops[0] = stack_pointer_rtx;
-  xops[1] = reg;
-  output_asm_insn ("cmp%z0\t{%1, %0|%0, %1}", xops);
-  fputs ("\tje\t", asm_out_file);
-  assemble_name_raw (asm_out_file, end_lab);
-  fputc ('\n', asm_out_file);
-
   /* SP = SP + PROBE_INTERVAL.  */
+  xops[0] = stack_pointer_rtx;
   xops[1] = GEN_INT (PROBE_INTERVAL);
   output_asm_insn ("sub%z0\t{%1, %0|%0, %1}", xops);
 
@@ -12299,12 +12300,16 @@ output_adjust_stack_and_probe (rtx reg)
   xops[1] = const0_rtx;
   output_asm_insn ("or%z0\t{%1, (%0)|DWORD PTR [%0], %1}", xops);
 
-  fprintf (asm_out_file, "\tjmp\t");
+  /* Test if SP == LAST_ADDR.  */
+  xops[0] = stack_pointer_rtx;
+  xops[1] = reg;
+  output_asm_insn ("cmp%z0\t{%1, %0|%0, %1}", xops);
+
+  /* Branch.  */
+  fputs ("\tjne\t", asm_out_file);
   assemble_name_raw (asm_out_file, loop_lab);
   fputc ('\n', asm_out_file);
 
-  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, end_lab);
-
   return "";
 }
 
@@ -12315,10 +12320,10 @@ static void
 ix86_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
 {
   /* See if we have a constant small number of probes to generate.  If so,
- that's the easy case.  The run-time loop is made up of 7 insns in the
+ that's the easy case.  The run-time loop is made up of 6 insns in the
  generic case while the compile-time loop is made up of n insns for n #
  of intervals.  */
-  if (size <= 7 * PROBE_INTERVAL)
+  if (size <= 6 * PROBE_INTERVAL)
 {
   HOST_WIDE_INT i;
 
@@ -12362,11 +12367,12 @@ ix86_emit_probe_stack_range (HOST_WIDE_I
 
   /* Step 3: the loop
 
-	 while (TEST_ADDR != LAST_ADDR)
+	 do
 	   {
 	 TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
 	 probe at TEST_ADDR
 	   }
+	 while (TEST_ADDR != LAST_ADDR)

[committed] gen-pass-instances.awk: Add emacs indent setting

2015-11-12 Thread Tom de Vries


Hi,

this patch adds emacs indentation settings to gen-pass-instances.awk. 
The default indentation width in emacs awk mode seems to be 4, and this 
setting overrides it to 8, which is the style used in this file.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Add emacs indent setting

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk: Add emacs indent setting.

---
 gcc/gen-pass-instances.awk | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index f36f510..a0be6a1 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -64,3 +64,8 @@ function handle_line()
 }
 
 { handle_line() }
+
+# Local Variables:
+# mode:awk
+# c-basic-offset:8
+# End:

[committed] gen-pass-instances.awk: Use early-out in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch restructures handle_line in gen-pass-instances.awk to use an 
early-out.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Use early-out in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Restructure using early-out.

---
 gcc/gen-pass-instances.awk | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 9eaac65..27e7a98 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -41,25 +41,27 @@ BEGIN {
 function handle_line()
 {
 	line = $0;
+
 	where = match(line, /NEXT_PASS \((.+)\)/);
-	if (where != 0)
+	if (where == 0)
 	{
-		len_of_start = length("NEXT_PASS (");
-		len_of_end = length(")");
-		len_of_pass_name = RLENGTH - (len_of_start + len_of_end);
-		pass_starts_at = where + len_of_start;
-		pass_name = substr(line, pass_starts_at, len_of_pass_name);
-		if (pass_name in pass_counts)
-			pass_counts[pass_name]++;
-		else
-			pass_counts[pass_name] = 1;
-		printf "%s, %s%s\n",
-			substr(line, 1, pass_starts_at + len_of_pass_name - 1),
-			pass_counts[pass_name],
-			substr(line, pass_starts_at + len_of_pass_name);
-	} else {
 		print line;
+		return;
 	}
+
+	len_of_start = length("NEXT_PASS (");
+	len_of_end = length(")");
+	len_of_pass_name = RLENGTH - (len_of_start + len_of_end);
+	pass_starts_at = where + len_of_start;
+	pass_name = substr(line, pass_starts_at, len_of_pass_name);
+	if (pass_name in pass_counts)
+		pass_counts[pass_name]++;
+	else
+		pass_counts[pass_name] = 1;
+	printf "%s, %s%s\n",
+		substr(line, 1, pass_starts_at + len_of_pass_name - 1),
+		pass_counts[pass_name],
+		substr(line, pass_starts_at + len_of_pass_name);
 }
 
 { handle_line() }

Re: open acc default data attribute

2015-11-12 Thread Jakub Jelinek

On Wed, Nov 11, 2015 at 12:19:55PM -0500, Nathan Sidwell wrote:
> this patch implements default data attribute determination.  The current
> behaviour defaults to 'copy' and ignores 'default(none)'. The  patch
> corrects that.
> 
> 1) We emit a diagnostic when 'default(none)' is in effect.  The fortran FE
> emits some artificial decls that it doesn't otherwise annotate, which is why
> we check DECL_ARTIFICIAL.  IIUC Cesar had a patch to address that but it
> needed some reworking?

I don't think treating DECL_ARTIFICIAL specially is a bug of any kind,
there are tons of different artificial decls even for C/C++ VLAs etc., and the
user has no way to put them into any clauses explicitly, so what we do with
them is a GCC-internal thing.

> 2015-11-11  Nathan Sidwell  
> 
>   gcc/
>   * gimplify.c (oacc_default_clause): New.
>   (omp_notice_variable): Call it.
> 
>   gcc/testsuite/
>   * c-c++-common/goacc/data-default-1.c: New.
> 
>   libgomp/
>   * testsuite/libgomp.oacc-c-c++-common/default-1.c: New.

+  error ("%qE not specified in enclosing OpenACC %s construct",
+DECL_NAME (lang_hooks.decls.omp_report_decl (decl)), rkind);
+  error_at (ctx->location, "enclosing OpenACC %s construct", rkind);

I'd use %qs instead of %s.

Otherwise ok.

Jakub

Re: OpenACC Firstprivate

2015-11-12 Thread Thomas Schwinge

Hi Nathan!

Merging back your trunk r230169 into gomp-4_0-branch, for the new
libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-2.c test, I'm
seeing the compiler diagnose as follows (compile with "-Wall -O2"):

source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-2.c: In function 'main._omp_fn.1':
source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-2.c:20:17: warning: 'val' is used uninitialized in this function [-Wuninitialized]
   ok  = val == 7;
 ^
source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-2.c:9:7: note: 'val' was declared here
   int val = 2;
   ^

..., and execution fails ("return 1" from main), so I XFAILed the
execution in the merge commit r230214 on gomp-4_0-branch.  (..., and I
still think that it's a good idea to change the libgomp testsuite to run
with -Wall enabled...)

Do you have an idea what's going on?  Given your preparatory "[gomp4]
Rework gimplifyier region flags",

(thanks!), the merge commit r230214 on gomp-4_0-branch didn't contain any
changes to gcc/gimplify.c, so that can't be it.  It also can't be the
possibly inconsistent usage of gcc/omp-low.c:is_reference vs. "TREE_CODE
(TREE_TYPE ([...])) == REFERENCE_TYPE" in gcc/omp-low.c, because that
doesn't matter for C code anyway (no artificial REFERENCE_TYPEs
generated), right?  So it must be some other change installed on
gomp-4_0-branch but not on trunk.


Grüße
 Thomas



Pointless configure checks for macros

2015-11-12 Thread Jonathan Wakely


PR68307 points out that config/os/mingw32-w64/error_constants.h fails
to define a number of errc constants which correspond to EXXX macros
that are supported on mingw-w64.

Does anyone know why we test explicitly for these macros but not
others?

m4_foreach([syserr], [EOWNERDEAD, ENOTRECOVERABLE, ENOLINK, EPROTO, ENODATA,
 ENOSR, ENOSTR, ETIME, EBADMSG, ECANCELED,
 EOVERFLOW, ENOTSUP, EIDRM, ETXTBSY,
 ECHILD, ENOSPC, EPERM,
 ETIMEDOUT, EWOULDBLOCK],

Why do we even test for these in configure, instead of just checking
whether they exist directly using #ifdef in error_constants.h ?
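
As a rough sketch of the #ifdef approach, written in C rather than as the C++
enum class errc in the real header, and with made-up my_errc_* names: each
optional constant is guarded by the errno macro itself instead of a
configure-generated _GLIBCXX_HAVE_* macro.

```c
#include <errno.h>

/* Hypothetical enumeration for illustration only.  EPIPE and ENOENT
   are assumed to be available everywhere; the others are optional.  */
enum my_errc
{
  my_errc_broken_pipe = EPIPE,
#ifdef ECANCELED
  my_errc_operation_canceled = ECANCELED,
#endif
#ifdef EOWNERDEAD
  my_errc_owner_dead = EOWNERDEAD,
#endif
#ifdef ENOTRECOVERABLE
  my_errc_state_not_recoverable = ENOTRECOVERABLE,
#endif
  my_errc_no_such_file_or_directory = ENOENT
};
```

This relies on the EXXX names being preprocessor macros, which is the case for
the constants that the configure checks test today.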

(This was discussed in
https://gcc.gnu.org/ml/libstdc++/2011-08/msg00125.html where Paolo
questioned the value of these checks, but indicated a preference for
consistency).

A bit of (incomplete) archaeology suggests that at one time we defined
all of these in , but later split them out into
OS-specific error_constants.h files. If we have a file specific to
mingw-w64 can we just uncomment the constants known to be supported by
that target?

This patch uncomments all the constants with a corresponding macro in
mingw-w64-headers/crt/errno.h in the mingw-w64 sources.

2015-11-12  Jonathan Wakely  

PR libstdc++/68307
* config/os/mingw32-w64/error_constants.h: Uncomment all error codes
supported by mingw-w64.

Is there any problem doing this?

diff --git a/libstdc++-v3/config/os/mingw32-w64/error_constants.h b/libstdc++-v3/config/os/mingw32-w64/error_constants.h
index 0168b5f..e0211bf 100644
--- a/libstdc++-v3/config/os/mingw32-w64/error_constants.h
+++ b/libstdc++-v3/config/os/mingw32-w64/error_constants.h
@@ -41,22 +41,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // replaced by Winsock WSA-prefixed equivalents.
   enum class errc
 {
-//address_family_not_supported = 		EAFNOSUPPORT,
-//address_in_use = EADDRINUSE,
-//address_not_available = 			EADDRNOTAVAIL,
-//already_connected = 			EISCONN,
+  address_family_not_supported = 		EAFNOSUPPORT,
+  address_in_use = EADDRINUSE,
+  address_not_available = 			EADDRNOTAVAIL,
+  already_connected = 			EISCONN,
   argument_list_too_long = 			E2BIG,
   argument_out_of_domain = 			EDOM,
   bad_address = EFAULT,
   bad_file_descriptor = 			EBADF,
 //bad_message = EBADMSG,
   broken_pipe = EPIPE,
-//connection_aborted = 			ECONNABORTED,
-//connection_already_in_progress = 		EALREADY,
-//connection_refused = 			ECONNREFUSED,
-//connection_reset = 			ECONNRESET,
-//cross_device_link = 			EXDEV,
-//destination_address_required = 		EDESTADDRREQ,
+  connection_aborted = 			ECONNABORTED,
+  connection_already_in_progress = 		EALREADY,
+  connection_refused = 			ECONNREFUSED,
+  connection_reset = 			ECONNRESET,
+  cross_device_link = 			EXDEV,
+  destination_address_required = 		EDESTADDRREQ,
   device_or_resource_busy = 		EBUSY,
   directory_not_empty = 			ENOTEMPTY,
   executable_format_error = 		ENOEXEC,
@@ -64,7 +64,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   file_too_large = EFBIG,
   filename_too_long = 			ENAMETOOLONG,
   function_not_supported = 			ENOSYS,
-//host_unreachable = 			EHOSTUNREACH,
+  host_unreachable = 			EHOSTUNREACH,
 //identifier_removed = 			EIDRM,
   illegal_byte_sequence = 			EILSEQ,
   inappropriate_io_control_operation = 	ENOTTY,
@@ -73,11 +73,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   invalid_seek = ESPIPE,
   io_error = EIO,
   is_a_directory = EISDIR,
-//message_size = EMSGSIZE,
-//network_down = ENETDOWN,
-//network_reset = ENETRESET,
-//network_unreachable = 			ENETUNREACH,
-//no_buffer_space = 			ENOBUFS,
+  message_size = EMSGSIZE,
+  network_down = ENETDOWN,
+  network_reset = ENETRESET,
+  network_unreachable = 			ENETUNREACH,
+  no_buffer_space = 			ENOBUFS,
 #ifdef _GLIBCXX_HAVE_ECHILD
   no_child_process = 			ECHILD,
 #endif
@@ -85,7 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   no_lock_available = 			ENOLCK,
 //no_message_available = 			ENODATA,
 //no_message = ENOMSG,
-//no_protocol_option = 			ENOPROTOOPT,
+  no_protocol_option = 			ENOPROTOOPT,
 #ifdef _GLIBCXX_HAVE_ENOSPC
   no_space_on_device = 			ENOSPC,
 #endif
@@ -95,26 +95,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   no_such_file_or_directory = 		ENOENT,
   no_such_process = 			ESRCH,
   not_a_directory = 			ENOTDIR,
-//not_a_socket = ENOTSOCK,
+  not_a_socket = ENOTSOCK,
 //not_a_stream = ENOSTR,
-//not_connected = ENOTCONN,
+  not_connected = ENOTCONN,
   not_enough_memory = 			ENOMEM,
 #ifdef _GLIBCXX_HAVE_ENOTSUP
   not_supported = ENOTSUP,
 #endif
-//operation_canceled = 			ECANCELED,
-//operation_in_progress = 			EINPROGRESS,
+  operation_canceled = 			ECANCELED,
+

Re: Recent patch craters vector tests on powerpc64le-linux-gnu

2015-11-12 Thread James Greenhalgh

On Wed, Nov 11, 2015 at 05:12:29PM -0600, Bill Schmidt wrote:
> Hi Ilya,
> 
> The patch committed as r230098 has caused a number of ICEs on
> powerpc64le-linux-gnu.

And arm-none-linux-gnueabihf, and aarch64-none-linux-gnu.

> Could you please either revert the patch or fix these issues?
 
Thanks,
James

Re: Recent patch craters vector tests on powerpc64le-linux-gnu

2015-11-12 Thread Andreas Schwab

Bill Schmidt  writes:

> The patch committed as r230098 has caused a number of ICEs on
> powerpc64le-linux-gnu.

This is PR68296.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [mask-vec_cond, patch 1/2] Support vectorization of VEC_COND_EXPR with no embedded comparison

2015-11-12 Thread Ramana Radhakrishnan

On Thu, Oct 8, 2015 at 4:50 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patch allows COND_EXPR with no embedded comparison to be vectorized.
>  It's applied on top of vectorized comparison support series.  New optab 
> vcond_mask_optab
> is introduced for such statements.  Bool patterns now avoid comparison in 
> COND_EXPR in case vector comparison is supported by target.

New standard pattern names need to be documented in the internals manual.
This patch does not do so, nor do I see any other patch that does.
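
As a reading aid, the operation under discussion can be modelled element-wise. This is only a scalar sketch of the semantics, assuming the mask is a vector of booleans; select_by_mask is a hypothetical name and nothing here is GCC code or the new optab itself.

```cpp
#include <array>
#include <cstddef>

// Element-wise model of a conditional select driven by a ready-made mask:
//   result[i] = mask[i] ? if_true[i] : if_false[i]
// No comparison is embedded; the mask is simply consumed.
template <typename T, std::size_t N>
std::array<T, N> select_by_mask(const std::array<bool, N>& mask,
                                const std::array<T, N>& if_true,
                                const std::array<T, N>& if_false)
{
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = mask[i] ? if_true[i] : if_false[i];
  return result;
}
```

The point of the sketch is that the mask is an input in its own right, which is why a separate pattern taking the mask as an operand is useful.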


regards
Ramana


>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-10-08  Ilya Enkovich  
>
> * optabs-query.h (get_vcond_mask_icode): New.
> * optabs-tree.c (expand_vec_cond_expr_p): Use
> get_vcond_mask_icode for VEC_COND_EXPR with mask.
> * optabs.c (expand_vec_cond_mask_expr): New.
> (expand_vec_cond_expr): Use get_vcond_mask_icode
> when possible.
> * optabs.def (vcond_mask_optab): New.
> * tree-vect-patterns.c (vect_recog_bool_pattern): Don't
> generate redundant comparison for COND_EXPR.
> * tree-vect-stmts.c (vect_is_simple_cond): Allow SSA_NAME
> as a condition.
> (vectorizable_condition): Likewise.
>
>
> diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
> index 162d2e9..48bcf7c 100644
> --- a/gcc/optabs-query.h
> +++ b/gcc/optabs-query.h
> @@ -98,6 +98,15 @@ get_vcond_icode (machine_mode vmode, machine_mode cmode, 
> bool uns)
>return icode;
>  }
>
> +/* Return insn code for a conditional operator with a mask mode
> +   MMODE resulting in a value of mode VMODE.  */
> +
> +static inline enum insn_code
> +get_vcond_mask_icode (machine_mode vmode, machine_mode mmode)
> +{
> +  return convert_optab_handler (vcond_mask_optab, vmode, mmode);
> +}
> +
>  /* Enumerates the possible extraction_insn operations.  */
>  enum extraction_pattern { EP_insv, EP_extv, EP_extzv };
>
> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
> index aa863cf..d887619 100644
> --- a/gcc/optabs-tree.c
> +++ b/gcc/optabs-tree.c
> @@ -342,6 +342,9 @@ expand_vec_cond_expr_p (tree value_type, tree cmp_op_type)
>  {
>machine_mode value_mode = TYPE_MODE (value_type);
>machine_mode cmp_op_mode = TYPE_MODE (cmp_op_type);
> +  if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type))
> +return get_vcond_mask_icode (TYPE_MODE (value_type),
> +TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing;
>if (GET_MODE_SIZE (value_mode) != GET_MODE_SIZE (cmp_op_mode)
>|| GET_MODE_NUNITS (value_mode) != GET_MODE_NUNITS (cmp_op_mode)
>|| get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type),
> diff --git a/gcc/optabs.c b/gcc/optabs.c
> index ca1a6e7..d26b8f8 100644
> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -5346,6 +5346,38 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, 
> rtx sel, rtx target)
>return tmp;
>  }
>
> +/* Generate insns for a VEC_COND_EXPR with mask, given its TYPE and its
> +   three operands.  */
> +
> +rtx
> +expand_vec_cond_mask_expr (tree vec_cond_type, tree op0, tree op1, tree op2,
> +  rtx target)
> +{
> +  struct expand_operand ops[4];
> +  machine_mode mode = TYPE_MODE (vec_cond_type);
> +  machine_mode mask_mode = TYPE_MODE (TREE_TYPE (op0));
> +  enum insn_code icode = get_vcond_mask_icode (mode, mask_mode);
> +  rtx mask, rtx_op1, rtx_op2;
> +
> +  if (icode == CODE_FOR_nothing)
> +return 0;
> +
> +  mask = expand_normal (op0);
> +  rtx_op1 = expand_normal (op1);
> +  rtx_op2 = expand_normal (op2);
> +
> +  mask = force_reg (GET_MODE (mask), mask);
> +  rtx_op1 = force_reg (GET_MODE (rtx_op1), rtx_op1);
> +
> +  create_output_operand (&ops[0], target, mode);
> +  create_input_operand (&ops[1], rtx_op1, mode);
> +  create_input_operand (&ops[2], rtx_op2, mode);
> +  create_input_operand (&ops[3], mask, mask_mode);
> +  expand_insn (icode, 4, ops);
> +
> +  return ops[0].value;
> +}
> +
>  /* Generate insns for a VEC_COND_EXPR, given its TYPE and its
> three operands.  */
>
> @@ -5371,12 +5403,21 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, 
> tree op1, tree op2,
>  }
>else
>  {
> -  /* Fake op0 < 0.  */
>gcc_assert (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (op0)));
> -  op0a = op0;
> -  op0b = build_zero_cst (TREE_TYPE (op0));
> -  tcode = LT_EXPR;
> -  unsignedp = false;
> +  if (get_vcond_mask_icode (mode, TYPE_MODE (TREE_TYPE (op0)))
> + != CODE_FOR_nothing)
> +   return expand_vec_cond_mask_expr (vec_cond_type, op0, op1,
> + op2, target);
> +  /* Fake op0 < 0.  */
> +  else
> +   {
> + gcc_assert (GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (op0)))
> + == MODE_VECTOR_INT);
> + op0a = op0;
> + op0b = build_zero_cst (TREE_TYPE (op0));
> + tcode = LT_EXPR;
> + unsignedp = false;
> +   }
>  }
>cmp_op_mode = TYPE_MODE (TREE_TYPE

[PATCH 04/N] Fix big memory leak in ix86_valid_target_attribute_p

2015-11-12 Thread Martin Liška

Hello.

The following patch was discussed with Jakub and can save a huge amount of
memory in cases where target attributes are heavily used.

It bootstraps and survives regression tests on x86_64-linux-pc.

Ready for trunk?
Thanks,
Martin
From ebb7bd3cf513dc437622868eddbed6c8f725a67c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 11 Nov 2015 12:52:11 +0100
Subject: [PATCH] Fix big memory leak in ix86_valid_target_attribute_p

---
 gcc/config/i386/i386.c |  2 ++
 gcc/gcc.c  |  2 +-
 gcc/lto-wrapper.c  |  2 +-
 gcc/opts-common.c  |  1 +
 gcc/opts.c | 16 +++-
 gcc/opts.h |  1 +
 6 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b84a11d..1325cf0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6237,6 +6237,8 @@ ix86_valid_target_attribute_p (tree fndecl,
 	DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) = new_optimize;
 }
 
+  finalize_options_struct (_options);
+
   return ret;
 }
 
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 8bbf5be..87d1979 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -9915,7 +9915,7 @@ driver_get_configure_time_options (void (*cb) (const char *option,
   size_t i;
 
   obstack_init ();
-  gcc_obstack_init (&opts_obstack);
+  init_opts_obstack ();
   n_switches = 0;
 
   for (i = 0; i < ARRAY_SIZE (option_default_specs); i++)
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 20e67ed..b9ac535 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1355,7 +1355,7 @@ main (int argc, char *argv[])
 {
   const char *p;
 
-  gcc_obstack_init (&opts_obstack);
+  init_opts_obstack ();
 
   p = argv[0] + strlen (argv[0]);
   while (p != argv[0] && !IS_DIR_SEPARATOR (p[-1]))
diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index d9bf4d4..06e88b5 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -706,6 +706,7 @@ decode_cmdline_option (const char **argv, unsigned int lang_mask,
 /* Obstack for option strings.  */
 
 struct obstack opts_obstack;
+bool opts_obstack_initialized = false;
 
 /* Like libiberty concat, but allocate using opts_obstack.  */
 
diff --git a/gcc/opts.c b/gcc/opts.c
index 9a3fbb3..527e678 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -266,6 +266,20 @@ add_comma_separated_to_vector (void **pvec, const char *arg)
   *pvec = v;
 }
 
+static bool opts_obstack_initialized = false;
+
+/* Initialize opts_obstack if not initialized.  */
+
+void
+init_opts_obstack (void)
+{
+  if (!opts_obstack_initialized)
+{
+  opts_obstack_initialized = true;
+  gcc_obstack_init (&opts_obstack);
+}
+}
+
 /* Initialize OPTS and OPTS_SET before using them in parsing options.  */
 
 void
@@ -273,7 +287,7 @@ init_options_struct (struct gcc_options *opts, struct gcc_options *opts_set)
 {
   size_t num_params = get_num_compiler_params ();
 
-  gcc_obstack_init (&opts_obstack);
+  init_opts_obstack ();
 
   *opts = global_options_init;
 
diff --git a/gcc/opts.h b/gcc/opts.h
index 38b3837..2eb2d97 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -323,6 +323,7 @@ extern void decode_cmdline_options_to_array (unsigned int argc,
 extern void init_options_once (void);
 extern void init_options_struct (struct gcc_options *opts,
  struct gcc_options *opts_set);
+extern void init_opts_obstack (void);
 extern void finalize_options_struct (struct gcc_options *opts);
 extern void decode_cmdline_options_to_array_default_mask (unsigned int argc,
 			  const char **argv, 
-- 
2.6.2
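
The init_opts_obstack guard in the patch above is an initialize-once pattern. A minimal standalone sketch follows, assuming a hypothetical StringPool in place of the obstack; the names StringPool, pool and init_pool are illustrative and do not appear in GCC.

```cpp
#include <vector>

// Hypothetical stand-in for the shared storage used for option strings.
struct StringPool
{
  std::vector<char> storage;
  int init_count = 0;   // how many times the pool was really set up
};

static StringPool pool;
static bool pool_initialized = false;

// Set the pool up on the first call only.  Later calls are no-ops, so
// every code path that needs the pool may call this unconditionally
// without resetting (and thereby abandoning) what is already stored.
void init_pool()
{
  if (!pool_initialized)
    {
      pool_initialized = true;
      pool.storage.clear();
      ++pool.init_count;
    }
}
```

With such a guard, repeated calls from different entry points leave init_count at 1, which is the property the patch relies on when several callers each want the obstack ready.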

[committed] gen-pass-instances.awk: Add len_of_call var in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch adds a variable len_of_call in handle_line in 
gen-pass-instances.awk.  It moves the use of the RLENGTH variable just 
after the related match call.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Add len_of_call var in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Add len_of_call variable.

---
 gcc/gen-pass-instances.awk | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 27e7a98..70b00b7 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -42,6 +42,7 @@ function handle_line()
 {
 	line = $0;
 
+	# Find call expression.
 	where = match(line, /NEXT_PASS \((.+)\)/);
 	if (where == 0)
 	{
@@ -49,9 +50,12 @@ function handle_line()
 		return;
 	}
 
+	# Length of the call expression.
+	len_of_call = RLENGTH;
+
 	len_of_start = length("NEXT_PASS (");
 	len_of_end = length(")");
-	len_of_pass_name = RLENGTH - (len_of_start + len_of_end);
+	len_of_pass_name = len_of_call - (len_of_start + len_of_end);
 	pass_starts_at = where + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
 	if (pass_name in pass_counts)

[committed] gen-pass-instances.awk: Add pass_num, prefix and postfix vars in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch adds new variables pass_num, prefix and postfix in 
handle_line in gen-pass-instances.awk.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Add pass_num, prefix and postfix vars in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Add pass_num, prefix and postfix
	vars.

---
 gcc/gen-pass-instances.awk | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 3d5e8b6..1aced74 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -61,17 +61,22 @@ function handle_line()
 	pass_starts_at = where + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
 
+	# Find prefix (until and including pass_name)
+	prefix = substr(line, 1, pass_starts_at + len_of_pass_name - 1)
+
+	# Find postfix (after pass_name)
+	postfix = substr(line, pass_starts_at + len_of_pass_name)
+
 	# Set pass_counts
 	if (pass_name in pass_counts)
 		pass_counts[pass_name]++;
 	else
 		pass_counts[pass_name] = 1;
 
+	pass_num = pass_counts[pass_name];
+
 	# Print call expression with extra pass_num argument
-	printf "%s, %s%s\n",
-		substr(line, 1, pass_starts_at + len_of_pass_name - 1),
-		pass_counts[pass_name],
-		substr(line, pass_starts_at + len_of_pass_name);
+	printf "%s, %s%s\n", prefix, pass_num, postfix;
 }
 
 { handle_line() }
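
For readers who do not speak awk, the logic of handle_line after this patch can be sketched in C++. This is an approximation under stated assumptions: positions are 0-based here (awk's match returns a 1-based position), std::regex stands in for awk's matcher, and the function returns the rewritten line instead of printing it.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Per-pass instance counters, keyed by pass name.
static std::map<std::string, int> pass_counts;

// Return LINE with the instance number appended as an extra argument of
// the NEXT_PASS call; lines without such a call are returned unchanged.
std::string handle_line(const std::string& line)
{
  static const std::regex call_re(R"re(NEXT_PASS \((.+)\))re");
  std::smatch m;
  if (!std::regex_search(line, m, call_re))
    return line;

  // Start and length of the whole call expression.
  const std::size_t call_starts_at = static_cast<std::size_t>(m.position(0));
  const std::size_t len_of_call = static_cast<std::size_t>(m.length(0));

  const std::size_t len_of_start = std::string("NEXT_PASS (").size();
  const std::size_t len_of_close = std::string(")").size();

  // Locate the pass name inside the call.
  const std::size_t len_of_pass_name
    = len_of_call - (len_of_start + len_of_close);
  const std::size_t pass_starts_at = call_starts_at + len_of_start;
  const std::string pass_name = line.substr(pass_starts_at, len_of_pass_name);

  // Prefix runs up to and including the pass name; postfix is the rest.
  const std::string prefix = line.substr(0, pass_starts_at + len_of_pass_name);
  const std::string postfix = line.substr(pass_starts_at + len_of_pass_name);

  // First occurrence of a pass gets 1, the next one 2, and so on.
  const int pass_num = ++pass_counts[pass_name];
  return prefix + ", " + std::to_string(pass_num) + postfix;
}
```

Feeding the same NEXT_PASS line twice yields instance numbers 1 and then 2 for that pass name.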

Re: [PATCH] PR ada/66205 gnatbind generates invalid code when finalization is enabled in restricted runtime

2015-11-12 Thread Simon Wright

On 11 Nov 2015, at 19:43, Simon Wright  wrote:

> This situation arises, for example, with an embedded RTS that incorporates the
> Ada 2012 generalized container iterators.

I should add, this PR is the “other half” of PR ada/66242, which is fixed in 
GCC 6; so 
please can it be reviewed?

I didn’t make it plain that the comment I’ve put in the first hunk,

 --  For restricted run-time libraries (ZFP and Ravenscar) tasks
 --  are non-terminating, so we do not want finalization.

is lifted from the unpatched code at line 480, where it relates to the use of 
Configurable_Run_Time_On_Target for this purpose.

[committed] gen-pass-instances.awk: Rename var where to call_starts_at in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch renames the rather generic variable 'where' to the more 
specific 'call_starts_at' in handle_line in gen-pass-instances.awk.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Rename var where to call_starts_at in handle_line

2015-11-12  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Rename var where to
	call_starts_at.

---
 gcc/gen-pass-instances.awk | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index b10c26a..311273e 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -43,8 +43,8 @@ function handle_line()
 	line = $0;
 
 	# Find call expression.
-	where = match(line, /NEXT_PASS \((.+)\)/);
-	if (where == 0)
+	call_starts_at = match(line, /NEXT_PASS \((.+)\)/);
+	if (call_starts_at == 0)
 	{
 		print line;
 		return;
@@ -59,7 +59,7 @@ function handle_line()
 
 	# Find pass_name argument
 	len_of_pass_name = len_of_call - (len_of_start + len_of_close);
-	pass_starts_at = where + len_of_start;
+	pass_starts_at = call_starts_at + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
 
 	# Find call expression prefix (until and including called function)

[PATCH, i386]: Use ssememalign attribute value to reject insns with misaligned operands

2015-11-12 Thread Uros Bizjak

Hello!

Attached patch uses ssememalign attribute to reject insn combinations
where memory operands would be misaligned.

2015-11-12  Uros Bizjak  

* config/i386/i386.c (ix86_legitimate_combined_insn): Reject
combined insn if the alignment of vector mode memory operand
is less than ssememalign.

testsuite/ChangeLog:

2015-11-12  Uros Bizjak  

* gcc.target/i386/sse-1.c (swizzle): Assume that a is
aligned to 64 bits.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32}, committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 230213)
+++ config/i386/i386.c  (working copy)
@@ -7236,11 +7236,12 @@ ix86_legitimate_combined_insn (rtx_insn *insn)
  /* For pre-AVX disallow unaligned loads/stores where the
 instructions don't support it.  */
  if (!TARGET_AVX
- && VECTOR_MODE_P (GET_MODE (op))
- && misaligned_operand (op, GET_MODE (op)))
+ && VECTOR_MODE_P (mode)
+ && misaligned_operand (op, mode))
{
- int min_align = get_attr_ssememalign (insn);
- if (min_align == 0)
+ unsigned int min_align = get_attr_ssememalign (insn);
+ if (min_align == 0
+ || MEM_ALIGN (op) < min_align)
return false;
}
 
Index: testsuite/gcc.target/i386/sse-1.c
===
--- testsuite/gcc.target/i386/sse-1.c   (revision 230213)
+++ testsuite/gcc.target/i386/sse-1.c   (working copy)
@@ -14,8 +14,10 @@ typedef union
 void
 swizzle (const void *a, vector4_t * b, vector4_t * c)
 {
-  b->v = _mm_loadl_pi (b->v, (__m64 *) a);
-  c->v = _mm_loadl_pi (c->v, ((__m64 *) a) + 1);
+  __m64 *t = __builtin_assume_aligned (a, 64);
+
+  b->v = _mm_loadl_pi (b->v, t);
+  c->v = _mm_loadl_pi (c->v, t + 1);
 }
 
 /* While one legal rendering of each statement would be movaps;movlps;movaps,

[committed] gen-pass-instances.awk: Rename len_of_end to len_of_close in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch renames variable len_of_end to len_of_close in handle_line in 
gen-pass-instances.awk.


Committed to trunk as obvious.

Thanks,
- Tom
gen-pass-instances.awk: Rename len_of_end to len_of_close in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Rename len_of_end to
	len_of_close.

---
 gcc/gen-pass-instances.awk | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 70b00b7..7624959 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -54,8 +54,9 @@ function handle_line()
 	len_of_call = RLENGTH;
 
 	len_of_start = length("NEXT_PASS (");
-	len_of_end = length(")");
-	len_of_pass_name = len_of_call - (len_of_start + len_of_end);
+	len_of_close = length(")");
+
+	len_of_pass_name = len_of_call - (len_of_start + len_of_close);
 	pass_starts_at = where + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
 	if (pass_name in pass_counts)

[committed] gen-pass-instances.awk: Simplify init of postfix_starts_at in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch simplifies the initialization of postfix_starts_at in 
handle_line in gen-pass-instances.awk.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Simplify init of postfix_starts_at in handle_line

2015-11-12  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Simplify init of
	postfix_starts_at.

---
 gcc/gen-pass-instances.awk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 311273e..08d4a37 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -67,7 +67,7 @@ function handle_line()
 	prefix = substr(line, 1, prefix_len);
 
 	# Find call expression postfix
-	postfix_starts_at = pass_starts_at + len_of_pass_name + len_of_close;
+	postfix_starts_at = call_starts_at + len_of_call;
 	postfix = substr(line, postfix_starts_at);
 
 	# Set pass_counts
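
That the old and new expressions agree follows from the definitions of pass_starts_at and len_of_pass_name used elsewhere in handle_line. A small compile-time check, written as a sketch with the awk variables turned into function parameters:

```cpp
// Offsets as computed in handle_line, expressed as plain functions.
constexpr int pass_starts_at(int call_starts_at, int len_of_start)
{
  return call_starts_at + len_of_start;
}

constexpr int len_of_pass_name(int len_of_call, int len_of_start,
                               int len_of_close)
{
  return len_of_call - (len_of_start + len_of_close);
}

// Old form: pass_starts_at + len_of_pass_name + len_of_close.
constexpr int postfix_old(int call_starts_at, int len_of_call,
                          int len_of_start, int len_of_close)
{
  return pass_starts_at(call_starts_at, len_of_start)
         + len_of_pass_name(len_of_call, len_of_start, len_of_close)
         + len_of_close;
}

// New form: call_starts_at + len_of_call.
constexpr int postfix_new(int call_starts_at, int len_of_call)
{
  return call_starts_at + len_of_call;
}

// len_of_start and len_of_close cancel out, so both forms agree.
static_assert(postfix_old(3, 20, 11, 1) == postfix_new(3, 20),
              "old and new postfix_starts_at must be equal");
```

The static_assert uses the lengths of "NEXT_PASS (" (11) and ")" (1) with an arbitrary call position and length.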

Re: [hsa 2/12] Modifications to libgomp proper

2015-11-12 Thread Jakub Jelinek

On Thu, Nov 05, 2015 at 10:54:42PM +0100, Martin Jambor wrote:
> The patch below contains all changes to libgomp files.  First, it adds
> a new constant identifying HSA devices and a structure that is shared
> between libgomp and the compiler when kernels from kernels are invoked
> via dynamic parallelism.
> 
> Second it modifies the GOMP_target_41 function so that it also can take
> kernel attributes (essentially the grid dimension) as a parameter and
> pass it on the HSA libgomp plugin.  Because we do want HSAIL
> generation to gracefully fail and use host fallback in that case, the
> same function calls the host implementation if it cannot map the
> requested function to an accelerated one or of a new callback
> can_run_func indicates there is a problem.
> 
> We need a new hook because we use it to check for linking errors which
> we cannot do when incrementally loading registered images.  And we
> want to handle linking errors, so that when we cannot emit HSAIL for a
> function called from a kernel (possibly in a different compilation
> unit), we also resort to host fallback.
> 
> Last but not least, the patch removes data remapping when the selected
> device is capable of sharing memory with the host.

The patch clearly is not against current trunk, there is no GOMP_target_41
function, the GOMP_target_ext function has extra arguments, etc.

> diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
> index 9c8b1fb..0ad42d2 100644
> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -876,7 +876,8 @@ struct gomp_device_descr
>void *(*dev2host_func) (int, void *, const void *, size_t);
>void *(*host2dev_func) (int, void *, const void *, size_t);
>void *(*dev2dev_func) (int, void *, const void *, size_t);
> -  void (*run_func) (int, void *, void *);
> +  void (*run_func) (int, void *, void *, const void *);

Adding arguments to existing plugin methods is a plugin ABI incompatible
change.  We now have:
  DLSYM (version);
  if (device->version_func () != GOMP_VERSION)
{
  err = "plugin version mismatch";
  goto fail;
}
so there is a way to deal with it, but you need to adjust all plugins.
See below anyway.
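
The version handshake quoted above can be sketched in isolation as follows. Everything here is hypothetical scaffolding: PluginDescr, kExpectedVersion and check_plugin are illustrative names and not the libgomp interface.

```cpp
#include <string>

// Hypothetical version both the loader and its plugins are built against.
constexpr unsigned kExpectedVersion = 1;

// Hypothetical plugin descriptor holding the resolved entry points.
struct PluginDescr
{
  unsigned (*version_func)();
};

// Return an empty string if the plugin is usable, else an error message.
// A plugin built against a different interface revision is refused
// before any of its other entry points are called.
std::string check_plugin(const PluginDescr& plugin)
{
  if (plugin.version_func == nullptr)
    return "plugin lacks a version function";
  if (plugin.version_func() != kExpectedVersion)
    return "plugin version mismatch";
  return "";
}

// Two sample plugins: one current, one stale.
unsigned current_version() { return kExpectedVersion; }
unsigned stale_version() { return kExpectedVersion - 1; }
```

A check of this shape is what makes a signature change survivable: bump the version, rebuild every plugin, and stale ones are rejected instead of being called with the wrong arguments.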

> --- a/libgomp/oacc-host.c
> +++ b/libgomp/oacc-host.c
> @@ -123,7 +123,8 @@ host_host2dev (int n __attribute__ ((unused)),
>  }
>  
>  static void
> -host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars)
> +host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars,
> +   const void* kern_launch __attribute__ ((unused)))

This is C, space before * not after it.
>  {
>void (*fn)(void *) = (void (*)(void *)) fn_ptr;

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1248,7 +1248,12 @@ gomp_get_target_fn_addr (struct gomp_device_descr 
> *devicep,
>splay_tree_key tgt_fn = splay_tree_lookup (>mem_map, );
>gomp_mutex_unlock (>lock);
>if (tgt_fn == NULL)
> - gomp_fatal ("Target function wasn't mapped");
> + {
> +   if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
> + return NULL;
> +   else
> + gomp_fatal ("Target function wasn't mapped");
> + }
>  
>return (void *) tgt_fn->tgt_offset;
>  }
> @@ -1276,6 +1281,7 @@ GOMP_target (int device, void (*fn) (void *), const 
> void *unused,
>  return gomp_target_fallback (fn, hostaddrs);
>  
>void *fn_addr = gomp_get_target_fn_addr (devicep, fn);
> +  assert (fn_addr);

I must say I really don't like putting asserts into libgomp, in production
it is after all not built with -DNDEBUG.  But this shows a worse problem,
if you have GCC 5 compiled OpenMP code, of course there won't be HSA
offloaded copy, but if you try to run it on a box with HSA offloading
enabled, you can run into this assertion failure.
Supposedly the old APIs (GOMP_target, GOMP_target_update, GOMP_target_data)
should treat GOMP_OFFLOAD_CAP_SHARED_MEM capable devices as unconditional
device fallback?
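
One possible reading of that suggestion, sketched with hypothetical types: the legacy entry point falls back to the host function unconditionally for shared-memory devices, and keeps the fatal error for all others. Device, Fn and legacy_target are illustrative names only, and the exception stands in for gomp_fatal.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using Fn = std::string (*)();

std::string host_impl() { return "host"; }
std::string device_impl() { return "device"; }

// Hypothetical device model: a capability flag plus the table of
// functions that have an offloaded copy on this device.
struct Device
{
  bool shared_mem;
  std::map<std::string, Fn> mapped;
};

// Legacy entry point.  Code built by an older compiler cannot have an
// offloaded copy for a shared-memory device, so run it on the host
// instead of asserting that the lookup succeeded.
std::string legacy_target(const Device& dev, const std::string& name,
                          Fn host_fn)
{
  if (dev.shared_mem)
    return host_fn();

  auto it = dev.mapped.find(name);
  if (it == dev.mapped.end())
    throw std::runtime_error("Target function wasn't mapped");
  return it->second();
}
```

With this shape no assertion is needed, because the lookup that could fail is never attempted for the problematic device class.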

> @@ -1297,7 +1304,7 @@ GOMP_target (int device, void (*fn) (void *), const 
> void *unused,
>  void
>  GOMP_target_41 (int device, void (*fn) (void *), size_t mapnum,
>   void **hostaddrs, size_t *sizes, unsigned short *kinds,
> - unsigned int flags, void **depend)
> + unsigned int flags, void **depend, const void *kernel_launch)

GOMP_target_ext has different arguments, you get the num_teams and
thread_limit clauses values in there already (if known at compile time or
before entering target region; 0 stands for implementation defined choice,
-1 for unknown before GOMP_target_ext).
Plus I must say I really don't like the addition of HSA specific argument
to the API, it is unclean and really doesn't scale, when somebody adds
support for another offloading target, would we add again another argument?
We can't reuse the same one, because both HSA and that other kind of
offloading could be configured at the same time, and which one is picked
would be only a runtime decision, based on env vars or omp_set_default_device etc.

[committed] gen-pass-instances.awk: Unify semicolon use in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch unifies semicolon use in handle_line in gen-pass-instances.awk.

Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Unify semicolon use in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Unify semicolon use.

---
 gcc/gen-pass-instances.awk | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 7f33e8c..9eaac65 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -41,14 +41,14 @@ BEGIN {
 function handle_line()
 {
 	line = $0;
-	where = match(line, /NEXT_PASS \((.+)\)/)
+	where = match(line, /NEXT_PASS \((.+)\)/);
 	if (where != 0)
 	{
-		len_of_start = length("NEXT_PASS (")
-		len_of_end = length(")")
-		len_of_pass_name = RLENGTH - (len_of_start + len_of_end)
-		pass_starts_at = where + len_of_start
-		pass_name = substr(line, pass_starts_at, len_of_pass_name)
+		len_of_start = length("NEXT_PASS (");
+		len_of_end = length(")");
+		len_of_pass_name = RLENGTH - (len_of_start + len_of_end);
+		pass_starts_at = where + len_of_start;
+		pass_name = substr(line, pass_starts_at, len_of_pass_name);
 		if (pass_name in pass_counts)
 			pass_counts[pass_name]++;
 		else

gen-pass-instances.awk: Remove unused var in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch removes an unused variable from handle_line in 
gen-pass-instances.awk.


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Remove unused var in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Remove unused var line_length.

---
 gcc/gen-pass-instances.awk | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index a0be6a1..7f33e8c 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -47,7 +47,6 @@ function handle_line()
 		len_of_start = length("NEXT_PASS (")
 		len_of_end = length(")")
 		len_of_pass_name = RLENGTH - (len_of_start + len_of_end)
-		line_length = length(line)
 		pass_starts_at = where + len_of_start
 		pass_name = substr(line, pass_starts_at, len_of_pass_name)
 		if (pass_name in pass_counts)

Re: [OpenACC] declare directive

2015-11-12 Thread Jakub Jelinek

On Wed, Nov 11, 2015 at 07:07:58PM -0600, James Norris wrote:
> +   oacc_declare_returns->remove (t);
> +
> +   if (oacc_declare_returns->elements () == 0)
> + {
> +   delete oacc_declare_returns;
> +   oacc_declare_returns = NULL;
> + }

Something for an incremental patch:
1) might be nice to have some assertion that at the end of gimplify_body
   or so oacc_declare_returns is NULL
2) what happens if you refer to automatic variables of other functions
   (C or Fortran nested functions, maybe C++ lambdas); shall those be
   unmapped at the end of the (nested) function's body?

> @@ -5858,6 +5910,10 @@ omp_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl,
>flags |= GOVD_FIRSTPRIVATE;
>break;
>  case OMP_CLAUSE_DEFAULT_UNSPECIFIED:
> +  if (is_global_var (decl)
> +   && ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)

Please put this condition first, as it is the cheapest.  I'd also surround
it with (), just to make it clear that the bitwise & is intentional.
Perhaps () != 0.
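
A small sketch of the requested shape, using made-up flag names and a counter to show the short-circuit. REGION_PARALLEL, REGION_KERNELS, REGION_OTHER, expensive_check and wants_map are all hypothetical.

```cpp
// Hypothetical region-type bits.
constexpr unsigned REGION_PARALLEL = 1u << 0;
constexpr unsigned REGION_KERNELS = 1u << 1;
constexpr unsigned REGION_OTHER = 1u << 2;

// Counts how often the costly predicate actually runs.
static int expensive_calls = 0;

// Stand-in for a check that is comparatively costly to evaluate.
bool expensive_check(int decl)
{
  ++expensive_calls;
  return decl % 2 == 0;
}

// Cheapest test first, with the bitwise & parenthesized and compared
// against 0 so that it cannot be mistaken for a mistyped &&.
bool wants_map(unsigned region_type, int decl)
{
  return (region_type & (REGION_PARALLEL | REGION_KERNELS)) != 0
         && expensive_check(decl);
}
```

When the mask test fails, && short-circuits and expensive_check is never called.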

> +   && device_resident_p (decl))
> + flags |= GOVD_MAP_TO_ONLY | GOVD_MAP;

> +   case GOMP_MAP_FROM:
> + kinds[i] = GOMP_MAP_FORCE_FROM;
> + GOACC_enter_exit_data (device, 1, [i], [i],
> +[i], 0, 0);

Wrong indentation.

Ok with those two changes and please think about the incremental stuff.

Jakub

RE: [PATCH][ARC] Fix ARC backend ICE on pr29921-2

2015-11-12 Thread Claudiu Zissulescu

Patch applied.

Thanks Joern,
Claudiu

> -Original Message-
> From: Joern Wolfgang Rennecke [mailto:g...@amylaar.uk]
> Sent: Wednesday, November 11, 2015 7:15 PM
> To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: Francois Bedard
> Subject: Re: [PATCH][ARC] Fix ARC backend ICE on pr29921-2
> 
> 
> 
> On 11/11/15 15:22, Claudiu Zissulescu wrote:
> > Please find attached a patch that fixes the ARC backend ICE on pr29921-2
> test from gcc.dg (dg.exp).
> >
> > The patch will allow generating conditional move also outside expand
> scope. The error was triggered during if-conversion.
> >
> > Ok to apply?
> 
> OK.

[PATCH][ARM]Fix addsi3_compare_op2 pattern.

2015-11-12 Thread Renlin Li


Hi all,

This is a simple patch to adjust the assembly output for the
addsi3_compare_op2 rtx pattern in the ARM backend.


According to the constraints, it is the second alternative which allows
the second operand to be a constant.
The original pattern triggers an ICE when the third alternative is
chosen, because it tries to output a constant while the second operand is a
register.
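
The mismatch can be illustrated with a toy consistency check. This is only a model: the operand kinds are read off the pattern's "type" attribute (alus_imm, alus_imm, alus_sreg), the templates are simplified, and templates_match_constraints is a hypothetical helper, not part of GCC.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Operand-2 kind per alternative, read off the "type" attribute;
// true means operand 2 is a register in that alternative.
constexpr std::array<bool, 3> operand2_is_register = {{false, false, true}};

using Templates = std::array<std::string, 3>;

// A template that prints operand 2 as an immediate ("#%n2") only makes
// sense for an alternative whose operand 2 is not a register.
bool templates_match_constraints(const Templates& templates)
{
  for (std::size_t i = 0; i < templates.size(); ++i)
    {
      const bool prints_immediate
        = templates[i].find("#%n2") != std::string::npos;
      if (prints_immediate && operand2_is_register[i])
        return false;
    }
  return true;
}

// Simplified output templates in the order before and after the patch.
const Templates before_patch
  = {{"adds %0, %1, %2", "adds %0, %1, %2", "subs %0, %1, #%n2"}};
const Templates after_patch
  = {{"adds %0, %1, %2", "subs %0, %1, #%n2", "adds %0, %1, %2"}};
```

In this model the old ordering pairs the immediate-printing template with the register alternative, and the new ordering does not.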


This was triggered by my experimental backend changes. The GCC 5 and 4.9
branches both have this problem.


Bootstrap on arm-none-linux-gnueabihf is okay, and the arm-none-eabi regression test is okay.

Okay to commit into trunk and backport to branch 5 and 4.9?

Regards,
Renlin Li

gcc/ChangeLog:

2015-11-12  Renlin Li  

* config/arm/arm.md (addsi3_compare_op2): Make the order of
assembly pattern consistent with constraint order.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8ebb1bf..73c3088 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -747,8 +747,8 @@
   "TARGET_32BIT"
   "@
adds%?\\t%0, %1, %2
-   adds%?\\t%0, %1, %2
-   subs%?\\t%0, %1, #%n2"
+   subs%?\\t%0, %1, #%n2
+   adds%?\\t%0, %1, %2"
   [(set_attr "conds" "set")
(set_attr "type" "alus_imm,alus_imm,alus_sreg")]
 )

Re: [PATCH] Fix PR ipa/68035 (v2)

2015-11-12 Thread Martin Liška

Hello.

I'm sending a reworked version of the patch, in which I renamed
'sem_item::hash' to 'm_hash' and wrapped all usages with 'get_hash'.
Apart from that, a new member function 'set_hash' is used for changing
the hash value. I hope it's easier to understand.
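
The accessor arrangement can be sketched outside of ipa-icf. This is a minimal model assuming, as the patched get_hash does, that a stored value of 0 means "not computed yet"; the class Item, its string payload and the FNV-1a hash are stand-ins chosen for the example.

```cpp
#include <cstdint>
#include <string>
#include <utility>

class Item
{
public:
  explicit Item(std::string name) : m_name(std::move(name)) {}

  // Compute the hash on first use; afterwards return the cached value.
  std::uint32_t get_hash()
  {
    if (!m_hash)
      {
        ++m_computations;
        set_hash(compute());
      }
    return m_hash;
  }

  // The only place that writes the cached value.
  void set_hash(std::uint32_t hash) { m_hash = hash; }

  int computations() const { return m_computations; }

private:
  // FNV-1a over the name, used here purely as a placeholder hash.
  std::uint32_t compute() const
  {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : m_name)
      {
        h ^= c;
        h *= 16777619u;
      }
    return h ? h : 1;   // keep 0 reserved for "not computed"
  }

  std::string m_name;
  std::uint32_t m_hash = 0;
  int m_computations = 0;
};
```

Because every reader goes through get_hash, nobody can observe the uninitialized 0, and later refinements of the value go through set_hash.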

The patch survives regression tests and bootstraps on x86_64-linux-pc.

Ready for trunk?
Thanks,
Martin
From 29be4ad798d73245715f53fe971a17664b69eeb8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 5 Nov 2015 18:31:31 +0100
Subject: [PATCH] Fix PR ipa/68035

gcc/ChangeLog:

2015-11-12  Martin Liska  

	PR ipa/68035
	* ipa-icf.c (void sem_item::set_hash): New function.
	(sem_function::get_hash): Use renamed m_hash member variable.
	(sem_item::update_hash_by_addr_refs): Utilize get_hash.
	(sem_item::update_hash_by_local_refs): Likewise.
	(sem_variable::get_hash): Use renamed m_hash member variable.
	(sem_item_optimizer::update_hash_by_addr_refs): Utilize get_hash.
	(sem_item_optimizer::build_hash_based_classes): Utilize set_hash.
	(sem_item_optimizer::build_graph): As the hash value of an item
	is lazy initialized, force the calculation.
	* ipa-icf.h (set_hash): Declare new function and rename hash member
	variable to m_hash.

gcc/testsuite/ChangeLog:

2015-11-12  Martin Liska  

	* gcc.dg/ipa/pr68035.c: New test.
---
 gcc/ipa-icf.c  |  46 +---
 gcc/ipa-icf.h  |   9 ++--
 gcc/testsuite/gcc.dg/ipa/pr68035.c | 108 +
 3 files changed, 141 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr68035.c

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 7bb3af5..b6a97c3 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -140,7 +140,7 @@ sem_usage_pair::sem_usage_pair (sem_item *_item, unsigned int _index):
for bitmap memory allocation.  */
 
 sem_item::sem_item (sem_item_type _type,
-		bitmap_obstack *stack): type(_type), hash(0)
+		bitmap_obstack *stack): type (_type), m_hash (0)
 {
   setup (stack);
 }
@@ -151,7 +151,7 @@ sem_item::sem_item (sem_item_type _type,
 
 sem_item::sem_item (sem_item_type _type, symtab_node *_node,
 		hashval_t _hash, bitmap_obstack *stack): type(_type),
-  node (_node), hash (_hash)
+  node (_node), m_hash (_hash)
 {
   decl = node->decl;
   setup (stack);
@@ -227,6 +227,11 @@ sem_item::target_supports_symbol_aliases_p (void)
 #endif
 }
 
+void sem_item::set_hash (hashval_t hash)
+{
+  m_hash = hash;
+}
+
 /* Semantic function constructor that uses STACK as bitmap memory stack.  */
 
 sem_function::sem_function (bitmap_obstack *stack): sem_item (FUNC, stack),
@@ -274,7 +279,7 @@ sem_function::get_bb_hash (const sem_bb *basic_block)
 hashval_t
 sem_function::get_hash (void)
 {
-  if(!hash)
+  if (!m_hash)
 {
   inchash::hash hstate;
   hstate.add_int (177454); /* Random number for function type.  */
@@ -289,7 +294,6 @@ sem_function::get_hash (void)
   for (unsigned i = 0; i < bb_sizes.length (); i++)
 	hstate.add_int (bb_sizes[i]);
 
-
   /* Add common features of declaration itself.  */
   if (DECL_FUNCTION_SPECIFIC_TARGET (decl))
 hstate.add_wide_int
@@ -301,10 +305,10 @@ sem_function::get_hash (void)
   hstate.add_flag (DECL_CXX_CONSTRUCTOR_P (decl));
   hstate.add_flag (DECL_CXX_DESTRUCTOR_P (decl));
 
-  hash = hstate.end ();
+  set_hash (hstate.end ());
 }
 
-  return hash;
+  return m_hash;
 }
 
 /* Return ture if A1 and A2 represent equivalent function attribute lists.
@@ -800,7 +804,7 @@ sem_item::update_hash_by_addr_refs (hash_map  _symtab_node_map)
 {
   ipa_ref* ref;
-  inchash::hash hstate (hash);
+  inchash::hash hstate (get_hash ());
 
   for (unsigned i = 0; node->iterate_reference (i, ref); i++)
 {
@@ -823,7 +827,7 @@ sem_item::update_hash_by_addr_refs (hash_map  _symtab_node_map)
 {
   ipa_ref* ref;
-  inchash::hash state (hash);
+  inchash::hash state (get_hash ());
 
   for (unsigned j = 0; node->iterate_reference (j, ref); j++)
 {
   sem_item **result = m_symtab_node_map.get (ref->referring);
   if (result)
-	state.merge_hash ((*result)->hash);
+	state.merge_hash ((*result)->get_hash ());
 }
 
   if (type == FUNC)
@@ -851,7 +855,7 @@ sem_item::update_hash_by_local_refs (hash_map caller);
 	  if (result)
-	state.merge_hash ((*result)->hash);
+	state.merge_hash ((*result)->get_hash ());
 	}
 }
 
@@ -2099,8 +2103,8 @@ sem_variable::parse (varpool_node *node, bitmap_obstack *stack)
 hashval_t
 sem_variable::get_hash (void)
 {
-  if (hash)
-return hash;
+  if (m_hash)
+return m_hash;
 
   /* All WPA streamed in symbols should have their hashes computed at compile
  time.  At this point, the constructor may not be in memory at all.
@@ -2113,9 +2117,9 @@ sem_variable::get_hash (void)
   if (DECL_SIZE (decl) && tree_fits_shwi_p (DECL_SIZE (decl)))
 hstate.add_wide_int (tree_to_shwi (DECL_SIZE (decl)));
   add_expr (ctor, hstate);
-  hash =

Re: [PATCH] Fix PR ipa/68035

2015-11-12 Thread Martin Liška

On 11/06/2015 05:43 PM, Jan Hubicka wrote:
>> Hello.
>>
>> Following patch triggers hash calculation of items (functions and variables)
>> in situations where LTO mode is not utilized.
>>
>> Patch survives regression tests and bootstraps on x86_64-linux-pc.
>>
>> Ready for trunk?
>> Thanks,
>> Martin
> 
>> >From 62266e21a89777c6dbd680f7c87f15abe603c024 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Thu, 5 Nov 2015 18:31:31 +0100
>> Subject: [PATCH] Fix PR ipa/68035
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2015-11-05  Martin Liska  
>>
>>  * gcc.dg/ipa/pr68035.c: New test.
>>
>> gcc/ChangeLog:
>>
>> 2015-11-05  Martin Liska  
>>
>>  PR ipa/68035
>>  * ipa-icf.c (sem_item_optimizer::build_graph): Force building
>>  of a hash value for an item if we are not running in LTO mode.
>> ---
>>  gcc/ipa-icf.c  |   4 ++
>>  gcc/testsuite/gcc.dg/ipa/pr68035.c | 108 
>> +
>>  2 files changed, 112 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr68035.c
>>
>> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
>> index 7bb3af5..09c42a1 100644
>> --- a/gcc/ipa-icf.c
>> +++ b/gcc/ipa-icf.c
>> @@ -2744,6 +2744,10 @@ sem_item_optimizer::build_graph (void)
>>  {
>>sem_item *item = m_items[i];
>>m_symtab_node_map.put (item->node, item);
>> +
>> +  /* Initialize hash values if we are not in LTO mode.  */
>> +  if (!in_lto_p)
>> +item->get_hash ();
>>  }
> 
> Hmm, what is the difference to the LTO mode here. I would have expected that 
> all the items
> was analyzed in both paths?

The difference is that in LTO mode the hash value is read from the streamed
LTO file. In classic compilation mode, on the other hand, we have to force the
calculation, because the hash value is computed lazily.
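
To illustrate the lazy scheme, here is a minimal sketch. It is not the actual ipa-icf code: the `item` class, `compute_hash`, `force_hashes` and the mixing constant are hypothetical stand-ins, and a hash of 0 is assumed to mean "not computed yet".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a semantic item; not the real sem_item class.
class item
{
public:
  explicit item (uint32_t payload)
    : m_payload (payload), m_hash (0), m_computations (0) {}

  // Lazy accessor: the hash is computed on first use and then cached.
  uint32_t get_hash ()
  {
    if (!m_hash)
      set_hash (compute_hash ());
    return m_hash;
  }

  // Used when the value is already known, e.g. read back from a file.
  void set_hash (uint32_t hash) { m_hash = hash; }

  // How many times the expensive computation actually ran.
  int computations () const { return m_computations; }

private:
  uint32_t compute_hash ()
  {
    ++m_computations;
    // Arbitrary mixing; "| 1" keeps the result non-zero, because zero is
    // reserved for "not computed yet" in this sketch.
    return (m_payload * 2654435761u) | 1u;
  }

  uint32_t m_payload;
  uint32_t m_hash;
  int m_computations;
};

// Force the lazy computation for every item, as a graph-building step might.
void force_hashes (std::vector<item> &items)
{
  for (item &i : items)
    i.get_hash ();
}

// Usage example: one item is computed on demand, the other reuses a value
// that was set explicitly (standing in for a streamed-in hash).
void demo_lazy_hash ()
{
  std::vector<item> items = { item (1), item (2) };
  items[1].set_hash (42);
  force_hashes (items);
  assert (items[0].computations () == 1);
  assert (items[1].computations () == 0);
  assert (items[1].get_hash () == 42);
  items[0].get_hash ();
  assert (items[0].computations () == 1);  // cached, not recomputed
}
```

In this sketch, skipping `force_hashes` would leave `m_hash` at zero for any item whose hash was never requested or set.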

Please take a look at the suggested patch I just sent.

Thanks,
Martin

> 
> Honza
>

Re: [PATCH] PR ada/66205 gnatbind generates invalid code when finalization is enabled in restricted runtime

2015-11-12 Thread Arnaud Charlet

> > This situation arises, for example, with an embedded RTS that
> > incorporates the
> > Ada 2012 generalized container iterators.
> 
> I should add, this PR is the "other half" of PR ada/66242, which is fixed
> in GCC 6; so please can it be reviewed?

The proper patch for PR ada/66242 hasn't been committed yet (it's pending),
so I'd rather review the situation once PR ada/66242 is dealt with.

I'm not convinced at all that your patch is the way to go, so I'd rather
consider it only after PR ada/66242 is solved properly.

Arno

[PATCH, PR68286] Fix vector comparison expand

2015-11-12 Thread Ilya Enkovich

Hi,

My vector comparison patches broke the expansion of vector comparisons on
targets that lack the new comparison patterns but support VEC_COND_EXPR.  This
happens because we do not check whether a vector comparison can actually be
expanded as a comparison.  This patch fixes that.  Bootstrapped and regtested
on powerpc64le-unknown-linux-gnu.  OK for trunk?
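
The decision can be sketched roughly as follows. This is only an illustration of the control flow, assuming a simplified view of the target: `target_caps`, `expansion` and `choose_expansion` are hypothetical names and do not exist in GCC.

```cpp
#include <cassert>

// The two ways of expanding a vector comparison in this sketch.
enum class expansion { vec_cmp, vec_cond };

// Hypothetical description of what a target supports.
struct target_caps
{
  bool has_vec_cmp_pattern;   // target provides the new comparison patterns
  bool has_vec_cond_pattern;  // target can expand a VEC_COND_EXPR
};

// Use the direct comparison only when the result is a boolean vector AND the
// target can really expand it; otherwise fall back to the conditional form.
expansion choose_expansion (bool result_is_boolean_vector,
                            const target_caps &caps)
{
  // Precondition of the sketch: at least one usable strategy exists.
  assert (caps.has_vec_cond_pattern
          || (result_is_boolean_vector && caps.has_vec_cmp_pattern));

  if (result_is_boolean_vector && caps.has_vec_cmp_pattern)
    return expansion::vec_cmp;
  return expansion::vec_cond;
}

// Usage example: a target without comparison patterns takes the fallback.
void demo_choose_expansion ()
{
  assert (choose_expansion (true, target_caps{false, true})
          == expansion::vec_cond);
  assert (choose_expansion (true, target_caps{true, true})
          == expansion::vec_cmp);
}
```

The broken behaviour corresponds to dropping the second half of the `if` condition, so that a boolean vector result always took the direct comparison path.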

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  

* expr.c (do_store_flag): Expand vector comparison as
VEC_COND_EXPR if vector comparison is not supported
by target.

gcc/testsuite/

2015-11-12  Ilya Enkovich  

* gcc.dg/pr68286.c: New test.


diff --git a/gcc/expr.c b/gcc/expr.c
index 03936ee..bd43dc4 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11128,7 +11128,8 @@ do_store_flag (sepops ops, rtx target, machine_mode mode)
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
-  if (VECTOR_BOOLEAN_TYPE_P (ops->type))
+  if (VECTOR_BOOLEAN_TYPE_P (ops->type)
+ && expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type))
return expand_vec_cmp_expr (ops->type, ifexp, target);
   else
{
diff --git a/gcc/testsuite/gcc.dg/pr68286.c b/gcc/testsuite/gcc.dg/pr68286.c
new file mode 100644
index 000..d0392e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68286.c
@@ -0,0 +1,17 @@
+/* PR target/68286 */
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, c;
+int fn1 ()
+{
+  int d[] = {0};
+  for (; c; c++)
+{
+  float e = c;
+  if (e)
+d[0]++;
+}
+  b = d[0];
+  return a;
+}

Re: Recent patch craters vector tests on powerpc64le-linux-gnu

2015-11-12 Thread Ilya Enkovich

2015-11-12 12:48 GMT+03:00 James Greenhalgh :
> On Wed, Nov 11, 2015 at 05:12:29PM -0600, Bill Schmidt wrote:
>> Hi Ilya,
>>
>> The patch committed as r230098 has caused a number of ICEs on
>> powerpc64le-linux-gnu.
>
> And arm-none-linux-gnueabihf, and aarch64-none-linux-gnu.
>
>> Could you please either revert the patch or fix these issues?
>
> Thanks,
> James
>

Sorry for the breakage. I sent a patch to fix it.

https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01467.html

Thanks,
Ilya

Re: [Patch ARM] Switch ARM to unified asm.

2015-11-12 Thread Ramana Radhakrishnan

On Thu, Nov 12, 2015 at 9:21 AM, Christian Bruel  wrote:
> Hi Ramana,
>
> On 11/10/2015 12:48 PM, Ramana Radhakrishnan wrote:
>>
>> [Resending as I managed to muck this up with my mail client]
>>
>> Hi,
>>
>> I held off committing a previous version of this patch that I posted in
>> July to be nice to folks backporting fixes and to watch for any objections
>> to move the ARM backend completely over into the unified assembler.
>>
>> The patch does the following.
>>
>> * The compiler now generates code in all ISA modes in unified asm.
>> * We've had unified asm only for the last 10 years, ever since the first
>> Thumb2 support was put in, the disassembler generates output in unified
>> assembler, while the compiler output is always in divided syntax for ARM
>> state.
>> * This means patterns get simpler not having to worry about the position
>> of the condition in a conditional instruction. For example we now
>> consistently use
>> a. ldrbeq rather than ldreqb
>> b. movseq rather than moveqs
>> c. Or indeed the appropriate push / pop instructions wherever
>> appropriate.
>>
>>
>> The compiler behaviour has not changed in terms of what it does with
>> inline assembler, that still remains in divided syntax and over time we need
>> to move all of this over to unified syntax if we can do so as all the
>> official documentation is now in terms of unified asm. I've been carrying
>> this in my tree for quite a while and am reasonably happy that it is stable.
>> I will watch out for any fallout in the coming weeks with this but it is
>> better to take this now rather than later given we are hitting the end of
>> stage1.
>>
>> Tested on arm-none-eabi - applied to trunk.
>>
>>
>
> I see a failure with an outdated check for the unified assembly. OK to fix ?
>

OK thanks.

Ramana
>
>

[committed] gen-pass-instances.awk: Simplify match regexp in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch simplifies the match regexp in handle_line in 
gen-pass-instances.awk.
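
As a rough cross-check of why the capture group is not needed when only the match position and length are used, here is a small C++ sketch. It assumes that std::regex (ECMAScript syntax) behaves like the awk regexp for this particular pattern; `find_call` is a hypothetical helper, not part of the script.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Return (0-based start, length) of the first match of PATTERN in LINE,
// or (-1, 0) when there is no match.
std::pair<long, long> find_call (const std::string &line,
                                 const std::string &pattern)
{
  const std::regex re (pattern);
  std::smatch m;
  if (!std::regex_search (line, m, re))
    return std::make_pair (-1L, 0L);
  return std::make_pair (static_cast<long> (m.position (0)),
                         static_cast<long> (m.length (0)));
}

// Usage example: both spellings of the pattern locate the same span.
void demo_find_call ()
{
  const std::string line = "  NEXT_PASS (pass_foo);";
  const std::pair<long, long> with_group
    = find_call (line, "NEXT_PASS \\((.+)\\)");
  const std::pair<long, long> without_group
    = find_call (line, "NEXT_PASS \\(.+\\)");
  assert (with_group == without_group);
  assert (without_group.first == 2 && without_group.second == 20);
}
```

Note that awk reports a 1-based start in RSTART, whereas the sketch uses 0-based positions.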


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Simplify match regexp in handle_line

2015-11-12  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Simplify match regexp.

---
 gcc/gen-pass-instances.awk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 08d4a37..cbfaa86 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -43,7 +43,7 @@ function handle_line()
 	line = $0;
 
 	# Find call expression.
-	call_starts_at = match(line, /NEXT_PASS \((.+)\)/);
+	call_starts_at = match(line, /NEXT_PASS \(.+\)/);
 	if (call_starts_at == 0)
 	{
 		print line;

[committed] gen-pass-instances.awk: Add comments in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch adds some comments in handle_line in gen-pass-instances.awk.

Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Add comments in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Add comments.

---
 gcc/gen-pass-instances.awk | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 7624959..3d5e8b6 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -56,13 +56,18 @@ function handle_line()
 	len_of_start = length("NEXT_PASS (");
 	len_of_close = length(")");
 
+	# Find pass_name argument
 	len_of_pass_name = len_of_call - (len_of_start + len_of_close);
 	pass_starts_at = where + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
+
+	# Set pass_counts
 	if (pass_name in pass_counts)
 		pass_counts[pass_name]++;
 	else
 		pass_counts[pass_name] = 1;
+
+	# Print call expression with extra pass_num argument
 	printf "%s, %s%s\n",
 		substr(line, 1, pass_starts_at + len_of_pass_name - 1),
 		pass_counts[pass_name],

[committed] gen-pass-instances.awk: Make print command clearer in handle_line

2015-11-12 Thread Tom de Vries


Hi,

this patch modifies the prefix and postfix expressions in handle_line in
gen-pass-instances.awk, such that the printf command now lists all the
NEXT_PASS call arguments, and surrounds them with parentheses.
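
For illustration, the same splitting into prefix, pass name and postfix can be sketched in C++. This is a hedged transliteration of the idea, not the awk script itself: positions are 0-based here, and `handle_line` taking a counter map as a parameter is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Rewrite "NEXT_PASS (pass_name)" into "NEXT_PASS (pass_name, N)", where N
// counts how often pass_name has been seen so far.  Lines without a call are
// returned unchanged.
std::string handle_line (const std::string &line,
                         std::map<std::string, int> &pass_counts)
{
  static const std::regex call_re ("NEXT_PASS \\(.+\\)");
  std::smatch m;
  if (!std::regex_search (line, m, call_re))
    return line;

  const std::string start = "NEXT_PASS (";
  const std::string open = "(";
  const std::string close = ")";

  const std::size_t call_starts_at = m.position (0);
  const std::size_t len_of_call = m.length (0);
  assert (len_of_call >= start.size () + close.size ());

  // Find the pass_name argument.
  const std::size_t len_of_pass_name
    = len_of_call - (start.size () + close.size ());
  const std::size_t pass_starts_at = call_starts_at + start.size ();
  const std::string pass_name = line.substr (pass_starts_at, len_of_pass_name);

  // Prefix: everything up to and including the called function name.
  const std::string prefix = line.substr (0, pass_starts_at - open.size ());

  // Postfix: everything after the closing parenthesis of the call.
  const std::string postfix
    = line.substr (pass_starts_at + len_of_pass_name + close.size ());

  // Count the instance and print all call arguments explicitly.
  const int pass_num = ++pass_counts[pass_name];
  return prefix + "(" + pass_name + ", " + std::to_string (pass_num) + ")"
         + postfix;
}

// Usage example: the second occurrence of a pass gets instance number 2.
void demo_handle_line ()
{
  std::map<std::string, int> counts;
  assert (handle_line ("  NEXT_PASS (pass_foo);", counts)
          == "  NEXT_PASS (pass_foo, 1);");
  assert (handle_line ("  NEXT_PASS (pass_foo);", counts)
          == "  NEXT_PASS (pass_foo, 2);");
}
```

Keeping the parentheses out of both prefix and postfix means the final concatenation mirrors the call syntax directly, which is the readability gain the patch is after.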


Committed to trunk as trivial.

Thanks,
- Tom
gen-pass-instances.awk: Make print command clearer in handle_line

2015-11-11  Tom de Vries  

	* gen-pass-instances.awk (handle_line): Print parentheses and pass_name
	explicitly.

---
 gcc/gen-pass-instances.awk | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index 1aced74..b10c26a 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -54,6 +54,7 @@ function handle_line()
 	len_of_call = RLENGTH;
 
 	len_of_start = length("NEXT_PASS (");
+	len_of_open = length("(");
 	len_of_close = length(")");
 
 	# Find pass_name argument
@@ -61,11 +62,13 @@ function handle_line()
 	pass_starts_at = where + len_of_start;
 	pass_name = substr(line, pass_starts_at, len_of_pass_name);
 
-	# Find prefix (until and including pass_name)
-	prefix = substr(line, 1, pass_starts_at + len_of_pass_name - 1)
+	# Find call expression prefix (until and including called function)
+	prefix_len = pass_starts_at - 1 - len_of_open;
+	prefix = substr(line, 1, prefix_len);
 
-	# Find postfix (after pass_name)
-	postfix = substr(line, pass_starts_at + len_of_pass_name)
+	# Find call expression postfix
+	postfix_starts_at = pass_starts_at + len_of_pass_name + len_of_close;
+	postfix = substr(line, postfix_starts_at);
 
 	# Set pass_counts
 	if (pass_name in pass_counts)
@@ -76,7 +79,7 @@ function handle_line()
 	pass_num = pass_counts[pass_name];
 
 	# Print call expression with extra pass_num argument
-	printf "%s, %s%s\n", prefix, pass_num, postfix;
+	printf "%s(%s, %s)%s\n", prefix, pass_name, pass_num, postfix;
 }
 
 { handle_line() }

Re: [PATCH][ARM]Fix addsi3_compare_op2 pattern.

2015-11-12 Thread Kyrill Tkachov


Hi Renlin,

On 12/11/15 09:29, Renlin Li wrote:

Hi all,

This is a simple patch to adjust the assembly output for the addsi3_compare_op2
rtx pattern in the ARM backend.

According to the constraints, it is the second alternative (constraint "L")
that needs the second operand printed as a negated constant.
The original pattern triggers an ICE when the third alternative is chosen,
because it tries to output a constant while the second operand is a register.
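
The mismatch can be modelled with a small consistency check. This is only a sketch: the `alternative` struct and the `_p` helpers are hypothetical, the templates are simplified (no condition codes or escapes), and it assumes that "L" is the only constraint here whose constant must be printed negated via `#%n2`.

```cpp
#include <array>
#include <cassert>
#include <string>

// One alternative of the insn: the constraint on operand 2 paired with the
// output template in the same position.
struct alternative
{
  char constraint;           // 'I', 'L' or 'r'
  std::string asm_template;  // simplified output template
};

// Consistent when exactly the 'L' alternative prints the negated constant.
bool consistent_p (const alternative &alt)
{
  const bool prints_negated
    = alt.asm_template.find ("#%n2") != std::string::npos;
  return prints_negated == (alt.constraint == 'L');
}

bool all_consistent_p (const std::array<alternative, 3> &alts)
{
  for (const alternative &alt : alts)
    if (!consistent_p (alt))
      return false;
  return true;
}

// Usage example: the old template order fails the check, the new one passes.
void demo_alternatives ()
{
  const std::array<alternative, 3> before = {{
    { 'I', "adds %0, %1, %2" },
    { 'L', "adds %0, %1, %2" },
    { 'r', "subs %0, %1, #%n2" } }};
  const std::array<alternative, 3> after = {{
    { 'I', "adds %0, %1, %2" },
    { 'L', "subs %0, %1, #%n2" },
    { 'r', "adds %0, %1, %2" } }};
  assert (!all_consistent_p (before));
  assert (all_consistent_p (after));
}
```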

This is triggered by my experimental backend changes. The 5 and 4.9 branches
have this problem as well.

arm-none-linux-gnueabihf bootstrap Okay, arm-none-eabi regression test Okay.

Okay to commit into trunk and backport to branch 5 and 4.9?

Regards,
Renlin Li

gcc/ChangeLog:

2015-11-12  Renlin Li  

* config/arm/arm.md (addsi3_compare_op2): Make the order of
assembly pattern consistent with constraint order.


Yes, this is ok. I think the order of the alternatives is obviously wrong.

For context, this is the whole pattern:
(define_insn "*addsi3_compare_op2"
  [(set (reg:CC_C CC_REGNUM)
(compare:CC_C
 (plus:SI (match_operand:SI 1 "s_register_operand" "r,r,r")
  (match_operand:SI 2 "arm_add_operand" "I,L,r"))
 (match_dup 2)))
   (set (match_operand:SI 0 "s_register_operand" "=r,r,r")
(plus:SI (match_dup 1) (match_dup 2)))]
  "TARGET_32BIT"
  "@
   add%.\\t%0, %1, %2
   add%.\\t%0, %1, %2
   sub%.\\t%0, %1, #%n2"
  [(set_attr "conds" "set")
   (set_attr "type" "alus_imm,alus_imm,alus_sreg")]
)

Thanks,
Kyrill

[patch] Fix doxygen @file comment in libstdc++ header

2015-11-12 Thread Jonathan Wakely


A trivial patch, I didn't edit the @file when I moved this file to the
new bits sub-directory.

Committed as obvious.


commit 1229ad46adf4d9a74b3da4e354120ebaa1be8eb1
Author: Jonathan Wakely 
Date:   Thu Nov 12 10:07:08 2015 +

	* include/experimental/bits/string_view.tcc: Fix doxygen @file.

diff --git a/libstdc++-v3/include/experimental/bits/string_view.tcc b/libstdc++-v3/include/experimental/bits/string_view.tcc
index 75a34f9..0eb4f70 100644
--- a/libstdc++-v3/include/experimental/bits/string_view.tcc
+++ b/libstdc++-v3/include/experimental/bits/string_view.tcc
@@ -22,7 +22,7 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
-/** @file experimental/string_view.tcc
+/** @file experimental/bits/string_view.tcc
  *  This is an internal header file, included by other library headers.
  *  Do not attempt to use it directly. @headername{string_view}
  */

[patch] update locale support for FreeBSD

2015-11-12 Thread Andreas Tobler


All,

with the work from Jennifer Yao and John Marino we can now update the 
locale support on FreeBSD to the level of DragonFly.


Results of this work can be found on the results list.

Here is my small addendum to make it work on FreeBSD.

Is this ok for trunk? (Given that the work from Jennifer and John is
committed before stage3?)


TIA,

Andreas

2015-11-12  Andreas Tobler  

* acinclude.m4 (GLIBCXX_ENABLE_CLOCALE): Change locale implementation
from darwin to DragonFly.
* configure: Regenerate.
* config/os/bsd/freebsd/ctype_configure_char.cc: Improve locale
support, do it the same as DragonFly.
* config/os/bsd/freebsd/os_defines.h: Add fine grained C99 defines.
Index: acinclude.m4
===
--- acinclude.m4(revision 230195)
+++ acinclude.m4(working copy)
@@ -2032,10 +2032,10 @@
   linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
enable_clocale_flag=gnu
;;
-  darwin* | freebsd*)
+  darwin*)
enable_clocale_flag=darwin
;;
-  dragonfly*)
+  dragonfly* | freebsd*)
enable_clocale_flag=dragonfly
;;
   openbsd*)
@@ -2114,7 +2114,7 @@
   CLOCALE_INTERNAL_H=config/locale/generic/c++locale_internal.h
   ;;
 darwin)
-  AC_MSG_RESULT(darwin or freebsd)
+  AC_MSG_RESULT(darwin)
 
   CLOCALE_H=config/locale/generic/c_locale.h
   CLOCALE_CC=config/locale/generic/c_locale.cc
@@ -2131,7 +2131,7 @@
   ;;
 
 dragonfly)
-  AC_MSG_RESULT(dragonfly)
+  AC_MSG_RESULT(dragonfly or freebsd)
 
   CLOCALE_H=config/locale/dragonfly/c_locale.h
   CLOCALE_CC=config/locale/dragonfly/c_locale.cc
Index: config/os/bsd/freebsd/ctype_configure_char.cc
===
--- config/os/bsd/freebsd/ctype_configure_char.cc   (revision 230195)
+++ config/os/bsd/freebsd/ctype_configure_char.cc   (working copy)
@@ -1,6 +1,6 @@
 // Locale support -*- C++ -*-
 
-// Copyright (C) 2011-2015 Free Software Foundation, Inc.
+// Copyright (C) 2014-2015 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -37,32 +37,60 @@
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 // Information as gleaned from /usr/include/ctype.h
-  
+
   const ctype_base::mask*
   ctype::classic_table() throw()
-  { return 0; }
+  { return NULL; }
 
-  ctype::ctype(__c_locale, const mask* __table, bool __del, 
-size_t __refs) 
-  : facet(__refs), _M_del(__table != 0 && __del), 
-  _M_toupper(NULL), _M_tolower(NULL), 
-  _M_table(__table ? __table : classic_table()) 
-  { 
+  ctype::ctype(__c_locale, const mask* __table, bool __del,
+size_t __refs)
+  : facet(__refs), _M_c_locale_ctype(_S_get_c_locale()),
+  _M_del(__table != 0 && __del), _M_widen_ok(0), _M_narrow_ok(0)
+  {
+char* __old = setlocale(LC_CTYPE, NULL);
+char* __sav = NULL;
+if (strcmp(__old, "C"))
+  {
+   const size_t __len = strlen(__old) + 1;
+   __sav = new char[__len];
+   memcpy(__sav, __old, __len);
+   setlocale(LC_CTYPE, "C");
+  }
+_M_toupper = NULL;
+_M_tolower = NULL;
+_M_table = __table ? __table : classic_table();
+if (__sav)
+  {
+   setlocale(LC_CTYPE, __sav);
+   delete [] __sav;
+  }
 memset(_M_widen, 0, sizeof(_M_widen));
-_M_widen_ok = 0;
 memset(_M_narrow, 0, sizeof(_M_narrow));
-_M_narrow_ok = 0;
   }
 
-  ctype::ctype(const mask* __table, bool __del, size_t __refs) 
-  : facet(__refs), _M_del(__table != 0 && __del), 
-  _M_toupper(NULL), _M_tolower(NULL), 
-  _M_table(__table ? __table : classic_table()) 
-  { 
+  ctype::ctype(const mask* __table, bool __del, size_t __refs)
+  : facet(__refs), _M_c_locale_ctype(_S_get_c_locale()),
+  _M_del(__table != 0 && __del), _M_widen_ok(0), _M_narrow_ok(0)
+  {
+char* __old = setlocale(LC_CTYPE, NULL);
+char* __sav = NULL;
+if (strcmp(__old, "C"))
+  {
+   const size_t __len = strlen(__old) + 1;
+   __sav = new char[__len];
+   memcpy(__sav, __old, __len);
+   setlocale(LC_CTYPE, "C");
+  }
+_M_toupper = NULL;
+_M_tolower = NULL;
+_M_table = __table ? __table : classic_table();
+if (__sav)
+  {
+   setlocale(LC_CTYPE, __sav);
+   delete [] __sav;
+  }
 memset(_M_widen, 0, sizeof(_M_widen));
-_M_widen_ok = 0;
 memset(_M_narrow, 0, sizeof(_M_narrow));
-_M_narrow_ok = 0;
   }
 
   char
@@ -84,7 +112,7 @@
   ctype::do_tolower(char __c) const
   { return ::tolower((int) __c); }
 
-  const char* 
+  const char*
   ctype::do_tolower(char* __low, const char* __high) const
   {
 while (__low < __high)
Index: config/os/bsd/freebsd/os_defines.h
===
---

Re: [PATCH] Enable libmpx by default on supported target

2015-11-12 Thread Jeff Law


On 11/12/2015 08:34 AM, Ilya Enkovich wrote:

Hi,

libmpx was added close to release date and therefore was disabled by default 
for all targets.  This patch enables it by default for supported targets.  Is 
it OK for trunk?

Thanks,
Ilya
--
2015-11-12  Tsvetkova Alexandra  

* configure.ac: Enable libmpx by default.
* configure: Regenerated.

OK.
jeff

[PATCH] Fix unused variable.

2015-11-12 Thread James Norris


Hi,

An unused variable in my patch from earlier today
broke bootstrap. Dominique kindly
pointed this out. Thank you.

Committed to trunk as obvious.

Jim

Index: gcc/cp/ChangeLog
===
--- gcc/cp/ChangeLog	(revision 230275)
+++ gcc/cp/ChangeLog	(working copy)
@@ -1,4 +1,8 @@
 2015-11-12  James Norris  
+
+	* parser.c (cp_parser_oacc_declare): Remove unused.
+
+2015-11-12  James Norris  
 	Joseph Myers  
 
 	* parser.c (cp_parser_omp_clause_name): Handle 'device_resident'
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 230275)
+++ gcc/cp/parser.c	(working copy)
@@ -34562,7 +34562,7 @@
 static tree
 cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 {
-  tree clauses, stmt, t;
+  tree clauses, stmt;
   bool error = false;
 
   clauses = cp_parser_oacc_all_clauses (parser, OACC_DECLARE_CLAUSE_MASK,

Re: [PATCH] Make disabled-optimization warning more informative; increase default max-gcse-memory

2015-11-12 Thread Bradley Lucier


On 11/12/2015 12:08 PM, Bradley Lucier wrote:

On 11/12/2015 11:57 AM, Bernd Schmidt wrote:

The expanded warning allowed me to see how much memory really was needed
to apply gcse to some of my routines, and 128MB fixes my problem.  The
limit has been 50MB for over 10 years, I think we can up it a bit now.
  {
+  unsigned int memory_request = n_basic_blocks_for_fn (cfun)
+* SBITMAP_SET_SIZE (max_reg_num ())
+* sizeof (SBITMAP_ELT_TYPE);
+


Formatting (wrap in parens to get it indented).


Fixed.



Otherwise ok.


Thanks.

Here's the reformatted patch (but without retesting).


To check reformatting, bootstrapped on x86_64-pc-linux-gnu with 
--enable-languages=c,c++ --disable-multilib.


Brad

Re: [rs6000] Rotate stack checking loop

2015-11-12 Thread David Edelsohn

On Thu, Nov 12, 2015 at 4:51 PM, Eric Botcazou  wrote:
> Hi,
>
> this patch rotates the loop generated in the prologue to do stack checking
> when -fstack-check is specified, thereby saving one branch instruction.  It
> was initially implemented as a WHILE loop to match the generic implementation
> but can be turned into a DO-WHILE loop because the amount of stack to be
> checked is known at compile time (since it's the static part of the frame).
>
> Tested on PowerPC/Linux, OK for the mainline?
>
>
> 2015-11-12  Eric Botcazou  
>
> * config/rs6000/rs6000.c (rs6000_emit_probe_stack_range): Adjust.
> (output_probe_stack_range): Rotate the loop and simplify.
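
For readers unfamiliar with loop rotation, the transformation described in the quoted message can be sketched as follows. This is not the rs6000 code: `probe_while`, `probe_do_while` and their parameters are hypothetical, and the probes are modelled as a list of visited offsets.

```cpp
#include <cassert>
#include <vector>

// WHILE form: the exit test sits at the top of the loop.
std::vector<long> probe_while (long first, long last, long interval)
{
  assert (interval > 0 && first >= last && (first - last) % interval == 0);
  std::vector<long> probed;
  long addr = first;
  while (addr != last)
    {
      addr -= interval;
      probed.push_back (addr);  // stands in for touching the stack at addr
    }
  return probed;
}

// DO-WHILE form: valid only when at least one iteration is known to run,
// i.e. when the amount to probe is a known, non-zero quantity.
std::vector<long> probe_do_while (long first, long last, long interval)
{
  assert (interval > 0 && first > last && (first - last) % interval == 0);
  std::vector<long> probed;
  long addr = first;
  do
    {
      addr -= interval;
      probed.push_back (addr);
    }
  while (addr != last);
  return probed;
}

// Usage example: both forms probe the same offsets.
void demo_probe_loops ()
{
  assert (probe_while (0, -12288, 4096) == probe_do_while (0, -12288, 4096));
  assert (probe_do_while (0, -12288, 4096).size () == 3);
}
```

The rotated form needs only the single backward branch at the bottom, which matches the saving mentioned in the quoted description.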

Okay.

Thanks, David

Re: [PATCH] [ARM] neon-testgen.ml typo

2015-11-12 Thread Christophe Lyon

On 6 November 2015 at 21:29, Christophe Lyon  wrote:
> On 4 November 2015 at 13:16, Ramana Radhakrishnan
>  wrote:
>> On Fri, Oct 30, 2015 at 2:42 PM, Christophe Lyon
>>  wrote:
>>> On 30 October 2015 at 15:33, Ramana Radhakrishnan
>>>  wrote:


 On 29/10/15 17:23, Jim Wilson wrote:
> I noticed a comment typo in this file while using grep to look for
> other stuff.  The typo is easy to fix.
>
> I tried running neon-testgen.ml to verify, but it is apparently no
> longer valid ocaml, as it doesn't work with the ocamlc 4.01.0 I have
> on Ubuntu 14.04.  I get a syntax error.  Someone who knows ocaml will
> have to fix this.  Meanwhile, the patch to fix the typo should still
> be OK, as this is a separate problem.
>
> Jim
>

 This is OK.

 I'd really like neon-testgen.ml and the tests in gcc.target/arm/neon to be 
 removed if all the intrinsics are now tested from Christophe's work in 
 getting his advsimd tests integrated. Where are we on that ?

>>>
>>> The tests I added cover all ARMv7 intrinsics. I have converted all my tests.
>>
>> Good and thank you for doing that.
>>
>>>
>>> There were a few additions to support some aarch64 specific intrinsics.
>>>
>>> However, most of the tests in gcc.target/arm/neon contain scan-asm
>>> directives which mine don't.
>>> My tests do check functionality (they are executable, comparing their
>>> results against expected values).
>>
>> I don't think the scan-asm in those tests gives you anything useful at
>> O0 with undefined behaviour in the testcases. In any case for the more
>> esoteric intrinsics (i.e. the ones that do saturation etc..) having
>> the run time tests is a better test. I do not see this as being useful
>> any more in terms of the testing coverage this provides.
>
> Ideally we'd need to add more tests for the new armv8 intrinsics
>
>> A patch to remove neon-testgen.ml and the tests in gcc.target/arm/neon
>> is pre-approved.
>>
> OK I'll give it a look, but that will be after e/o stage1 probably.
>

I have just removed them all, as r230274.

ChangeLog is attached.

Christophe.

>>
>> regards
>> Ramana
>>
>>
>>
>>>
>>> Christophe.
>>>
 regards
 Ramana
[ARM] Remove neon-testgen.ml and generated tests.

gcc/ChangeLog:

2015-11-12  Christophe Lyon  

* config/arm/neon-testgen.ml: Remove.

gcc/testsuite/ChangeLog:

2015-11-12  Christophe Lyon  

* gcc.target/arm/neon/vRaddhns16.c: Remove.
* gcc.target/arm/neon/vRaddhns32.c: Remove.
* gcc.target/arm/neon/vRaddhns64.c: Remove.
* gcc.target/arm/neon/vRaddhnu16.c: Remove.
* gcc.target/arm/neon/vRaddhnu32.c: Remove.
* gcc.target/arm/neon/vRaddhnu64.c: Remove.
* gcc.target/arm/neon/vRhaddQs16.c: Remove.
* gcc.target/arm/neon/vRhaddQs32.c: Remove.
* gcc.target/arm/neon/vRhaddQs8.c: Remove.
* gcc.target/arm/neon/vRhaddQu16.c: Remove.
* gcc.target/arm/neon/vRhaddQu32.c: Remove.
* gcc.target/arm/neon/vRhaddQu8.c: Remove.
* gcc.target/arm/neon/vRhadds16.c: Remove.
* gcc.target/arm/neon/vRhadds32.c: Remove.
* gcc.target/arm/neon/vRhadds8.c: Remove.
* gcc.target/arm/neon/vRhaddu16.c: Remove.
* gcc.target/arm/neon/vRhaddu32.c: Remove.
* gcc.target/arm/neon/vRhaddu8.c: Remove.
* gcc.target/arm/neon/vRshlQs16.c: Remove.
* gcc.target/arm/neon/vRshlQs32.c: Remove.
* gcc.target/arm/neon/vRshlQs64.c: Remove.
* gcc.target/arm/neon/vRshlQs8.c: Remove.
* gcc.target/arm/neon/vRshlQu16.c: Remove.
* gcc.target/arm/neon/vRshlQu32.c: Remove.
* gcc.target/arm/neon/vRshlQu64.c: Remove.
* gcc.target/arm/neon/vRshlQu8.c: Remove.
* gcc.target/arm/neon/vRshls16.c: Remove.
* gcc.target/arm/neon/vRshls32.c: Remove.
* gcc.target/arm/neon/vRshls64.c: Remove.
* gcc.target/arm/neon/vRshls8.c: Remove.
* gcc.target/arm/neon/vRshlu16.c: Remove.
* gcc.target/arm/neon/vRshlu32.c: Remove.
* gcc.target/arm/neon/vRshlu64.c: Remove.
* gcc.target/arm/neon/vRshlu8.c: Remove.
* gcc.target/arm/neon/vRshrQ_ns16.c: Remove.
* gcc.target/arm/neon/vRshrQ_ns32.c: Remove.
* gcc.target/arm/neon/vRshrQ_ns64.c: Remove.
* gcc.target/arm/neon/vRshrQ_ns8.c: Remove.
* gcc.target/arm/neon/vRshrQ_nu16.c: Remove.
* gcc.target/arm/neon/vRshrQ_nu32.c: Remove.
* gcc.target/arm/neon/vRshrQ_nu64.c: Remove.
* gcc.target/arm/neon/vRshrQ_nu8.c: Remove.
* gcc.target/arm/neon/vRshr_ns16.c: Remove.
* gcc.target/arm/neon/vRshr_ns32.c: Remove.
* gcc.target/arm/neon/vRshr_ns64.c: Remove.
* gcc.target/arm/neon/vRshr_ns8.c: Remove.
*

Re: [PATCH] Fix ICE for masked store of boolean value

2015-11-12 Thread Jeff Law


On 11/12/2015 08:48 AM, Ilya Enkovich wrote:

Hi,

We may get an ICE in the vectorizer when the stored value gets a vectype that
is not compatible with the storage.  This may happen for bool values.  This
patch fixes the ICE.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK
for trunk?

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  

* tree-vect-stmts.c (vectorizable_mask_load_store): Check
types of stored value and storage are compatible.

gcc/testsuite/

2015-11-12  Ilya Enkovich  

* g++.dg/vect/simd-mask-store-bool.cc: New test.

OK.
jeff

[patch committed FT32] Pattern for CC to register move

2015-11-12 Thread James Bowman

The attached patch adds a pattern for CC to register move.

[gcc]

2015-11-11  James Bowman  

* config/ft32/ft32.md: New pattern *sne

Index: gcc/config/ft32/ft32.md
===
--- gcc/config/ft32/ft32.md (revision 230144)
+++ gcc/config/ft32/ft32.md (working copy)
@@ -255,6 +255,13 @@
 
 ;; SImode
 
+(define_insn "*sne"
+   [(set (match_operand:SI 0 "register_operand" "=r")
+ (reg:SI CC_REG))]
+   ""
+   "bextu.l %0,$cc,32|0\;xor.l %0,%0,-1"
+)
+
 ;; Push a register onto the stack
 (define_insn "movsi_push"
   [(set (mem:SI (pre_dec:SI (reg:SI SP_REG)))
@@ -884,6 +891,7 @@
   DONE;
 })
 
+
 (define_expand "epilogue"
   [(return)]
   ""

Re: [PATCH, doc] Document some standard pattern names

2015-11-12 Thread Jeff Law


On 11/12/2015 08:30 AM, Ilya Enkovich wrote:

Hi,

This patch adds descriptions for several standard pattern names.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  

* doc/md.texi (vec_cmp@var{m}@var{n}): New item.
(vec_cmpu@var{m}@var{n}): New item.
(vcond@var{m}@var{n}): Specify comparison is signed.
(vcondu@var{m}@var{n}): New item.
(vcond_mask_@var{m}@var{n}): New item.
(maskload@var{m}@var{n}): New item.
(maskstore@var{m}@var{n}): New item.

OK.
jeff

Re: [PATCH] Enable libstdc++ numeric conversions on Cygwin

2015-11-12 Thread Andreas Tobler


On 12.11.15 14:39, Jonathan Wakely wrote:

On 12/11/15 11:40 +, Jonathan Wakely wrote:

On 18/09/15 12:01 -0400, Jennifer Yao wrote:

Forgot to include the patch.

On Fri, Sep 18, 2015 at 11:17 AM, Jennifer Yao
 wrote:

A number of functions in libstdc++ are guarded by the _GLIBCXX_USE_C99
preprocessor macro, which is only defined on systems that pass all of
the checks for a large set of C99 functions. Consequently, on systems
which lack any of the required C99 facilities (e.g. Cygwin, which
lacks some C99 complex math functions), the numeric conversion
functions (std::stoi(), std::stol(), std::to_string(), etc.) are not
defined—a rather silly outcome, as none of the numeric conversion
functions are implemented using C99 math functions.

This patch enables numeric conversion functions on the aforementioned
systems by splitting the checks for C99 support and defining several
new macros (_GLIBCXX_USE_C99_STDIO, _GLIBCXX_USE_C99_STDLIB, and
_GLIBCXX_USE_C99_WCHAR), which replace the use of _GLIBCXX_USE_C99 in
#if conditionals where appropriate.


(Coming back to this now that Jennifer's copyright assignment is
complete...)

Splitting the _GLIBCXX_USE_C99 macro into more fine-grained macros for
separate features is definitely the right direction.

However your patch also changes the configure tests to use -std=c++0x
(which should be -std=c++11, but that's a minor point). On an OS that
only makes the C99 library available conditionally that will mean that
configure determines that C99 library features are supported, but we
will get errors if we try to use those features in C++03 parts of the
library.

I think a more complete solution is to have two sets of configure
tests and two sets of macros, so that we define _GLIBCXX_USE_C99_STDIO
when C99 stdio is available unconditionally, and define
_GLIBCXX11_USE_C99_STDIO when it's available with -std=c++11.

Then in the library code we can check _GLIBCXX_USE_C99_STDIO if we
want to use C99 features in C++03 code, and check
_GLIBCXX11_USE_C99_STDIO if we want to use the features in C++11 code.

That should still solve the problem for the numeric conversion
functions, because they are defined in C++11 and so would check
_GLIBCXX11_USE_C99_STDIO, which will be defined for newlib.

Other pieces of the library, such as locales, will use
_GLIBCXX_USE_C99_STDIO and that might still be false for newlib (and
for other strict C libraries like the Solaris and FreeBSD libc).

I will make the changes to acinclude.m4 to duplicate the tests, so we
test once with -std=c++98 and once with -std=c++11, and then change
the library to check either _GLIBCXX_xxx or _GLIBCXX11_xxx as
appropriate.


Here's a patch implementing my suggestion.

The major changes since Jennifer's original patch are in acinclude.m4,
to do the autoconf tests once with -std=c++98 and again with
-std=c++11, and in include/bits/c++config to define the
_GLIBCXX_USE_C99_XXX macros according to either _GLIBCXX98_USE_C99_XXX
or _GLIBCXX11_USE_C99_XXX, depending on the standard mode in effect
when the file is included.

Because those new definitions in bits/c++config are unconditional I
had to adjust a few #ifdef tests to use #if instead.
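
As an illustration of the selection described above, here is a minimal
sketch. The DEMO98_/DEMO11_/DEMO_ names are hypothetical stand-ins (they
are not the real libstdc++ macros) and the 0/1 values are made up; it only
shows why an always-defined macro has to be tested with #if rather than
#ifdef.

```cpp
// Hypothetical stand-ins for what configure would record; the real
// macros live in bits/c++config and their values depend on the target.
#define DEMO98_USE_C99_STDIO 0   // assumed: C99 stdio not usable with -std=c++98
#define DEMO11_USE_C99_STDIO 1   // assumed: C99 stdio usable with -std=c++11

// Pick the result that matches the standard mode in effect.
#if __cplusplus >= 201103L
# define DEMO_USE_C99_STDIO DEMO11_USE_C99_STDIO
#else
# define DEMO_USE_C99_STDIO DEMO98_USE_C99_STDIO
#endif

// DEMO_USE_C99_STDIO is now always defined (as 0 or 1), so "#ifdef"
// is always true and no longer says anything about availability.
inline bool demo_ifdef_says_enabled ()
{
#ifdef DEMO_USE_C99_STDIO
  return true;
#else
  return false;
#endif
}

// Testing the value with "#if" gives the intended answer.
inline bool demo_if_says_enabled ()
{
#if DEMO_USE_C99_STDIO
  return true;
#else
  return false;
#endif
}
```

With these made-up values the #ifdef variant reports "enabled" in every
standard mode, while the #if variant follows the mode the file is compiled
in.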

I also removed the changes to GLIBCXX_CHECK_C99_TR1, so that there are
no changes to the macros used for the TR1 library. As a follow-up
change I will add a test for  to GLIBCXX_ENABLE_C99 and
change several C++ headers to stop using the TR1 macros.

This passes all tests on powerpc64le-linux, I'll also try to test on
DragonFly and FreeBSD.

Does this look good to everyone?

One downside of this change is that we introduce some (hopefully safe)
ODR violations, where inline functions and templates that depend on
_GLIBCXX_USE_C99_FOO might now be defined differently in C++98 and
C++11 code. Previously they had the same definition, even though in
C++11 mode the value of the _GLIBCXX_USE_C99_FOO macro might have been
sub-optimal (i.e. the C99 features were usable, but the macro said
they weren't). Those ODR violations could be avoided if needed, by
always using the _GLIBCXX98_USE_C99_FOO macro in code that can be
included from either C++98 or C++11. We could still use the
_GLIBCXX11_USE_C99_FOO macro in pure C++11 code (such as the numeric
conversion functions) and get most of the benefit of this change.
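
A minimal sketch of that way of avoiding the ODR issue, under stated
assumptions: the DEMO98_/DEMO11_ macros and both helper functions are
hypothetical, and the values are made up. Code that can be included from
either standard mode is keyed on the C++98 result, so every translation
unit sees the same definition; code that only exists in C++11 mode can use
the C++11 result.

```cpp
// Hypothetical configure results (not the real libstdc++ macros).
#define DEMO98_USE_C99_FOO 0
#define DEMO11_USE_C99_FOO 1

// Reachable from both C++98 and C++11 translation units: keyed on the
// C++98 result, so the definition is identical in every standard mode.
inline int shared_helper ()
{
#if DEMO98_USE_C99_FOO
  return 1;   // would use the C99 facility
#else
  return 0;   // portable fallback
#endif
}

#if __cplusplus >= 201103L
// Only exists in C++11 mode, so there is no differing C++98 definition
// it could conflict with; it can use the C++11 result.
inline int cxx11_only_helper ()
{
#if DEMO11_USE_C99_FOO
  return 1;
#else
  return 0;
#endif
}
#endif
```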




I have successfully tested this patch on FreeBSD -CURRENT and also on 
CentOS7.1 (x86_64).


I had the previous version installed for a longer time and I see no 
regressions compared with it.


N.B: I didn't apply the follow up patch since you mentioned it does not 
work.


So, for me it looks good and I'm very happy to see this coming in.

Thank you all!!

Regards,
Andreas

[PATCH, PR tree-optimization/PR68305] Support masked COND_EXPR in SLP

2015-11-12 Thread Ilya Enkovich

Hi,

This patch fixes the way an operand is chosen by its number for COND_EXPR.  
Bootstrapped and regtested on x86_64-unknown-linux-gnu.  OK for trunk?
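
As I read the diff below, the operand numbering differs between the two
forms of condition: with an SSA_NAME condition there are three operands
(condition, then-value, else-value), while an embedded comparison
contributes two operands of its own before the two values. A toy model of
that selection follows; ToyCondExpr and toy_pick_operand are hypothetical
names and this is not GCC's gimple representation.

```cpp
#include <string>

// Toy model of "lhs = cond ? then_val : else_val" (illustration only).
struct ToyCondExpr
{
  bool cond_is_ssa_name;         // true: condition is a single name (a mask)
  std::string cond_name;         // used when cond_is_ssa_name is true
  std::string cmp_op0, cmp_op1;  // used when the condition is a comparison
  std::string then_val, else_val;
};

// Return the operand selected by OP_NUM.
std::string toy_pick_operand (const ToyCondExpr &s, int op_num)
{
  if (s.cond_is_ssa_name)
    {
      // Three operands: 0 = condition, 1 = then-value, 2 = else-value.
      if (op_num == 0)
        return s.cond_name;
      return op_num == 1 ? s.then_val : s.else_val;
    }

  // Embedded comparison, four operands:
  // 0 and 1 = comparison operands, 2 = then-value, 3 = else-value.
  if (op_num == 0)
    return s.cmp_op0;
  if (op_num == 1)
    return s.cmp_op1;
  return op_num == 2 ? s.then_val : s.else_val;
}
```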

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  

PR tree-optimization/68305
* tree-vect-slp.c (vect_get_constant_vectors): Support
COND_EXPR with SSA_NAME as a condition.

gcc/testsuite/

2015-11-12  Ilya Enkovich  

PR tree-optimization/68305
* gcc.dg/vect/pr68305.c: New test.


diff --git a/gcc/testsuite/gcc.dg/vect/pr68305.c 
b/gcc/testsuite/gcc.dg/vect/pr68305.c
new file mode 100644
index 000..fde3db7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr68305.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mavx2" { target avx_runtime } } */
+
+int a, b;
+
+void
+fn1 ()
+{
+  int c, d;
+  for (; b; b++)
+a = a ^ !c ^ !d;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 9d97140..9402474 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2738,18 +2738,20 @@ vect_get_constant_vectors (tree op, slp_tree slp_node,
  switch (code)
{
  case COND_EXPR:
-   if (op_num == 0 || op_num == 1)
- {
-   tree cond = gimple_assign_rhs1 (stmt);
+   {
+ tree cond = gimple_assign_rhs1 (stmt);
+ if (TREE_CODE (cond) == SSA_NAME)
+   op = gimple_op (stmt, op_num + 1);
+ else if (op_num == 0 || op_num == 1)
op = TREE_OPERAND (cond, op_num);
- }
-   else
- {
-   if (op_num == 2)
- op = gimple_assign_rhs2 (stmt);
-   else
- op = gimple_assign_rhs3 (stmt);
- }
+ else
+   {
+ if (op_num == 2)
+   op = gimple_assign_rhs2 (stmt);
+ else
+   op = gimple_assign_rhs3 (stmt);
+   }
+   }
break;
 
  case CALL_EXPR:

[PATCH, alpha]: Hookize some more macros

2015-11-12 Thread Uros Bizjak

2015-11-12  Uros Bizjak  

* config/alpha/alpha.h (FUNCTION_VALUE, LIBCALL_VALUE,
FUNCTION_VALUE_REGNO_P): Remove.
* config/alpha/alpha-protos.h (function_value): Remove.
* config/alpha/alpha.c (function_value): Rename to...
(alpha_function_value_1): ... this.  Make static.
(alpha_function_value, alpha_libcall_value,
alpha_function_value_regno_p): New functions.
(TARGET_FUNCTION_VALUE, TARGET_LIBCALL_VALUE,
TARGET_FUNCTION_VALUE_REGNO_P): Define.

2015-11-12  Uros Bizjak  

* config/alpha/alpha.h (REGISTER_MOVE_COST, MEMORY_MOVE_COST): Remove.
* config/alpha/alpha.c (alpha_memory_latency): Make static.
(alpha_register_move_cost, alpha_memory_move_cost): New functions.
(TARGET_REGISTER_MOVE_COST, TARGET_MEMORY_MOVE_COST): Define.

Bootstrapped and regression tested on alphaev68-linux-gnu, committed
to mainline SVN.

Uros.
Index: config/alpha/alpha-protos.h
===
--- config/alpha/alpha-protos.h (revision 230213)
+++ config/alpha/alpha-protos.h (working copy)
@@ -68,7 +68,6 @@
 extern void alpha_initialize_trampoline (rtx, rtx, rtx, int, int, int);
 
 extern rtx alpha_va_arg (tree, tree);
-extern rtx function_value (const_tree, const_tree, machine_mode);
 
 extern void alpha_start_function (FILE *, const char *, tree);
 extern void alpha_end_function (FILE *, const char *, tree);
Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c(revision 230213)
+++ config/alpha/alpha.c(working copy)
@@ -95,7 +95,7 @@
 
 /* The number of cycles of latency we should assume on memory reads.  */
 
-int alpha_memory_latency = 3;
+static int alpha_memory_latency = 3;
 
 /* Whether the function needs the GP.  */
 
@@ -1339,6 +1339,36 @@
   return NULL_RTX;
 }
 
+/* Return the cost of moving between registers of various classes.  Moving
+   between FLOAT_REGS and anything else except float regs is expensive.
+   In fact, we make it quite expensive because we really don't want to
+   do these moves unless it is clearly worth it.  Optimizations may
+   reduce the impact of not being able to allocate a pseudo to a
+   hard register.  */
+
+static int
+alpha_register_move_cost (machine_mode /*mode*/,
+ reg_class_t from, reg_class_t to)
+{
+  if ((from == FLOAT_REGS) == (to == FLOAT_REGS))
+return 2;
+
+  if (TARGET_FIX)
+return (from == FLOAT_REGS) ? 6 : 8;
+
+  return 4 + 2 * alpha_memory_latency;
+}
+
+/* Return the cost of moving data of MODE from a register to
+   or from memory.  On the Alpha, bump this up a bit.  */
+
+static int
+alpha_memory_move_cost (machine_mode /*mode*/, reg_class_t /*regclass*/,
+   bool /*in*/)
+{
+  return 2 * alpha_memory_latency;
+}
+
 /* Compute a (partial) cost for rtx X.  Return true if the complete
cost has been computed, and false if subexpressions should be
scanned.  In either case, *TOTAL contains the cost result.  */
@@ -5736,9 +5766,9 @@
On Alpha the value is found in $0 for integer functions and
$f0 for floating-point functions.  */
 
-rtx
-function_value (const_tree valtype, const_tree func ATTRIBUTE_UNUSED,
-   machine_mode mode)
+static rtx
+alpha_function_value_1 (const_tree valtype, const_tree func ATTRIBUTE_UNUSED,
+   machine_mode mode)
 {
   unsigned int regnum, dummy ATTRIBUTE_UNUSED;
   enum mode_class mclass;
@@ -5793,6 +5823,33 @@
   return gen_rtx_REG (mode, regnum);
 }
 
+/* Implement TARGET_FUNCTION_VALUE.  */
+
+static rtx
+alpha_function_value (const_tree valtype, const_tree fn_decl_or_type,
+ bool /*outgoing*/)
+{
+  return alpha_function_value_1 (valtype, fn_decl_or_type, VOIDmode);
+}
+
+/* Implement TARGET_LIBCALL_VALUE.  */
+
+static rtx
+alpha_libcall_value (machine_mode mode, const_rtx /*fun*/)
+{
+  return alpha_function_value_1 (NULL_TREE, NULL_TREE, mode);
+}
+
+/* Implement TARGET_FUNCTION_VALUE_REGNO_P.
+
+   On the Alpha, $0 $1 and $f0 $f1 are the only register thus used.  */
+
+static bool
+alpha_function_value_regno_p (const unsigned int regno)
+{
+  return (regno == 0 || regno == 1 || regno == 32 || regno == 33);
+}
+
 /* TCmode complex values are passed by invisible reference.  We
should not split these values.  */
 
@@ -9908,6 +9965,10 @@
 #undef TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #define TARGET_USE_BLOCKS_FOR_CONSTANT_P hook_bool_mode_const_rtx_true
 
+#undef TARGET_REGISTER_MOVE_COST
+#define TARGET_REGISTER_MOVE_COST alpha_register_move_cost
+#undef TARGET_MEMORY_MOVE_COST
+#define TARGET_MEMORY_MOVE_COST alpha_memory_move_cost
 #undef TARGET_RTX_COSTS
 #define TARGET_RTX_COSTS alpha_rtx_costs
 #undef TARGET_ADDRESS_COST
@@ -9920,6 +9981,13 @@
 #define TARGET_PROMOTE_FUNCTION_MODE 
default_promote_function_mode_always_promote
 #undef TARGET_PROMOTE_PROTOTYPES
 #define TARGET_PROMOTE_PROTOTYPES

RE: [RFC][PATCH] Preferred rename register in regrename pass

2015-11-12 Thread Robert Suchanek

Hi Christophe,

> >
> Hi,
> I confirm that this fixes the build errors I was seeing.
> Thanks.
> 

Thanks for checking this.

I'm still seeing a number of ICEs on the gcc-testresults mailing list
across various ports but these are likely to be caused by another patch.
They are already reported as PR68293 and PR68296.

Regards,
Robert

[Ada] More efficient code generated for object overlays

2015-11-12 Thread Arnaud Charlet

This change refines the use of the "volatile hammer" to implement the advice
given in RM 13.3(19) by disabling it for object overlays altogether. relying
instead on the ref-all aliasing property of reference types to achieve the
desired effect.

This will generate better code for object overlays. For example, the following
function should now make no memory accesses at all on 64-bit platforms when
compiled at -O2 or above:

package Vec is

  type U64 is mod 2**64;

  function Prod (A, B : U64) return U64;

end Vec;
package body Vec is

  function Prod (A, B : U64) return U64 is
    type U16 is mod 2**16;
    type V16 is array (1..4) of U16;
    VA : V16;
    for VA'Address use A'Address;
    VB : V16;
    for VB'Address use B'Address;
    R : U64 := 0;
  begin
    for I in V16'Range loop
      R := R + U64(VA (I)) * U64(VB (I));
    end loop;
    return R;
  end;

end Vec;
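
For readers less familiar with Ada address clauses, a rough C++ analogue of
what Prod computes might look like the sketch below. It is only an
illustration of the arithmetic (each 64-bit argument viewed as four 16-bit
lanes, lane-wise products summed); the memcpy calls stand in for the
overlays and nothing here reflects how GNAT implements them.

```cpp
#include <cstdint>
#include <cstring>

// View each 64-bit argument as four 16-bit lanes and sum the lane-wise
// products, as the Ada function above does with its two overlays.
std::uint64_t prod (std::uint64_t a, std::uint64_t b)
{
  std::uint16_t va[4], vb[4];
  std::memcpy (va, &a, sizeof va);   // plays the role of VA'Address use A'Address
  std::memcpy (vb, &b, sizeof vb);   // plays the role of VB'Address use B'Address

  std::uint64_t r = 0;
  for (int i = 0; i < 4; ++i)
    r += std::uint64_t (va[i]) * std::uint64_t (vb[i]);
  return r;
}
```

Because both arguments are split the same way, the result does not depend
on the byte order of the host.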

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Eric Botcazou  

* sem_ch13.adb (Analyze_Attribute_Definition_Clause): For a
variable, if this is not an overlay, set Treat_As_Volatile on it.
* gcc-interface/decl.c (E_Variable): Do not force the type to volatile
for address clauses. Tweak and adjust various RM references.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 230229)
+++ sem_ch13.adb(working copy)
@@ -4724,10 +4724,24 @@
 
   Find_Overlaid_Entity (N, O_Ent, Off);
 
-  --  If the object overlays a constant view, mark it so
+  if Present (O_Ent) then
+ --  If the object overlays a constant object, mark it so
 
-  if Present (O_Ent) and then Is_Constant_Object (O_Ent) then
- Set_Overlays_Constant (U_Ent);
+ if Is_Constant_Object (O_Ent) then
+Set_Overlays_Constant (U_Ent);
+ end if;
+  else
+ --  If this is not an overlay, mark a variable as being
+ --  volatile to prevent unwanted optimizations. It's a
+ --  conservative interpretation of RM 13.3(19) for the
+ --  cases where the compiler cannot detect potential
+ --  aliasing issues easily and it also covers the case
+ --  of an absolute address where the volatile aspect is
+ --  kind of implicit.
+
+ if Ekind (U_Ent) = E_Variable then
+Set_Treat_As_Volatile (U_Ent);
+ end if;
   end if;
 
   --  Overlaying controlled objects is erroneous.
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c(revision 230229)
+++ gcc-interface/decl.c(working copy)
@@ -1068,14 +1068,12 @@
  }
 
/* Make a volatile version of this object's type if we are to make
-  the object volatile.  We also interpret 13.3(19) conservatively
-  and disallow any optimizations for such a non-constant object.  */
+  the object volatile.  We also implement RM 13.3(19) for exported
+  and imported (non-constant) objects by making them volatile.  */
if ((Treat_As_Volatile (gnat_entity)
 || (!const_flag
 && gnu_type != except_type_node
-&& (Is_Exported (gnat_entity)
-|| imported_p
-|| Present (Address_Clause (gnat_entity)
+&& (Is_Exported (gnat_entity) || imported_p)))
&& !TYPE_VOLATILE (gnu_type))
  {
const int quals
@@ -1118,7 +1116,8 @@
  gnu_expr = convert (gnu_type, gnu_expr);
 
/* If this is a pointer that doesn't have an initializing expression,
-  initialize it to NULL, unless the object is imported.  */
+  initialize it to NULL, unless the object is declared imported as
+  per RM B.1(24).  */
if (definition
&& (POINTER_TYPE_P (gnu_type) || TYPE_IS_FAT_POINTER_P (gnu_type))
&& !gnu_expr
@@ -1141,7 +1140,7 @@
save_gnu_tree (gnat_entity, NULL_TREE, false);
 
/* Convert the type of the object to a reference type that can
-  alias everything as per 13.3(19).  */
+  alias everything as per RM 13.3(19).  */
gnu_type
  = build_reference_type_for_mode (gnu_type, ptr_mode, true);
gnu_address = convert (gnu_type, gnu_address);
@@ -1206,11 +1205,10 @@
   as an indirect object.  Likewise for Stdcall objects that are
   imported.  */
if ((!definition && Present (Address_Clause (gnat_entity)))
-   || (Is_Imported (gnat_entity)
-   && Has_Stdcall_Convention (gnat_entity)))
+   ||

Re: [OpenACC 0/7] host_data construct

2015-11-12 Thread Julian Brown

On Mon, 2 Nov 2015 18:33:39 +
Julian Brown  wrote:

> On Mon, 26 Oct 2015 19:34:22 +0100
> Jakub Jelinek  wrote:
> 
> > Your use_device sounds very similar to use_device_ptr clause in
> > OpenMP, which is allowed on #pragma omp target data construct and is
> > implemented quite a bit differently from this; it is unclear if the
> > OpenACC standard requires this kind of implementation, or you just
> > chose to implement it this way.  In particular, the GOMP_target_data
> > call puts the variables mentioned in the use_device_ptr clauses into
> > the mapping structures (similarly how map clause appears) and the
> > corresponding vars are privatized within the target data region
> > (which is a host region, basically a fancy { } braces), where the
> > private variables contain the offloading device's pointers.  
> 
> As the author of the original patch, I have to say using the mapping
> structures seems like a far better approach, but I've hit some trouble
> with the details of adapting OpenACC to use that method.

Here's a version of the patch which (hopefully) brings OpenACC on par
with OpenMP with respect to use_device/use_device_ptr variables. The
implementation is essentially the same now for OpenACC as for OpenMP
(i.e. using mapping structures): so for now, only array or pointer
variables can be used as use_device variables. The included tests have
been adjusted accordingly.

One awkward part of the implementation concerns nesting offloaded
regions within host_data regions:

#define N 1024

int main (int argc, char* argv[])
{
  int x[N];

#pragma acc data copyin (x[0:N])
  {
    int *xp;
#pragma acc host_data use_device (x)
    {
      [...]
#pragma acc parallel present (x) copyout (xp)
      {
        xp = x;
      }
    }

    assert (xp == acc_deviceptr (x));
  }

  return 0;
}

I think the meaning of 'x' as seen within the clauses of the parallel
directive should be the *host* version of x, not the mapped target
address (I've asked on the OpenACC technical mailing list to clarify
this point, but no reply as yet). The changes to
{maybe_,}lookup_decl_in_outer_ctx "skip over" host_data contexts when
called from lower_omp_target. There's probably an analogous case for
OpenMP, but I've not tried to handle that.
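
The "skip over" behaviour can be pictured with a toy model of nested
contexts. This is a sketch under stated assumptions: ToyCtx and
toy_lookup_in_outer are hypothetical names, and the real omp-low.c data
structures and lookup functions are different.

```cpp
#include <map>
#include <string>

// Toy model of nested lowering contexts (illustration only).
struct ToyCtx
{
  bool is_host_data;                         // context of a host_data region
  std::map<std::string, std::string> decls;  // variable -> what it means here
  const ToyCtx *outer;                       // enclosing context, or nullptr
};

// Look NAME up in the contexts enclosing CTX.  When SKIP_HOST_DATA is
// true, host_data contexts are stepped over, so the caller sees the
// host version of the variable rather than the device address.
const std::string *
toy_lookup_in_outer (const ToyCtx *ctx, const std::string &name,
                     bool skip_host_data)
{
  for (const ToyCtx *up = ctx->outer; up; up = up->outer)
    {
      if (skip_host_data && up->is_host_data)
        continue;
      auto it = up->decls.find (name);
      if (it != up->decls.end ())
        return &it->second;
    }
  return nullptr;
}
```

In the example program above, a lookup of x from the parallel region with
skipping enabled would reach the enclosing data region instead of stopping
at the host_data region.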

No regressions for libgomp tests, and the new tests pass. OK for trunk?

Thanks,

Julian

ChangeLog

Julian Brown  
Cesar Philippidis  
James Norris  

gcc/
* c-family/c-pragma.c (oacc_pragmas): Add PRAGMA_OACC_HOST_DATA.
* c-family/c-pragma.h (pragma_kind): Add PRAGMA_OACC_HOST_DATA.
(pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_USE_DEVICE.
* c/c-parser.c (c_parser_omp_clause_name): Add use_device support.
(c_parser_oacc_clause_use_device): New function.
(c_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(c_parser_oacc_host_data): New function.
(c_parser_omp_construct): Add host_data support.
* c/c-tree.h (c_finish_oacc_host_data): Add prototype.
* c/c-typeck.c (c_finish_oacc_host_data): New function.
(c_finish_omp_clauses): Add use_device support.
* cp/cp-tree.h (finish_oacc_host_data): Add prototype.
* cp/parser.c (cp_parser_omp_clause_name): Add use_device support.
(cp_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(cp_parser_oacc_host_data): New function.
(cp_parser_omp_construct): Add host_data support.
(cp_parser_pragma): Add host_data support.
* cp/semantics.c (finish_omp_clauses): Add use_device support.
(finish_oacc_host_data): New function.
* gimple-pretty-print.c (dump_gimple_omp_target): Add host_data
support.
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_HOST_DATA.
(is_gimple_omp_oacc): Add support for above.
* gimplify.c (gimplify_scan_omp_clauses): Add host_data, use_device
support.
(gimplify_omp_workshare): Add host_data support.
(gimplify_expr): Likewise.
* omp-builtins.def (BUILT_IN_GOACC_HOST_DATA): New.
* omp-low.c (lookup_decl_in_outer_ctx)
(maybe_lookup_decl_in_outer_ctx): Add optional argument to skip
host_data regions.
(scan_sharing_clauses): Support use_device.
(check_omp_nesting_restrictions): Support host_data.
(expand_omp_target): Support host_data.
(lower_omp_target): Skip over outer host_data regions when looking
up decls. Support use_device.
(make_gimple_omp_edges): Support host_data.
* tree-nested.c (convert_nonlocal_omp_clauses): Add use_device
clause.

libgomp/
* oacc-parallel.c (GOACC_host_data): New function.
* libgomp.map (GOACC_host_data): Add to GOACC_2.0.1.
* testsuite/libgomp.oacc-c-c++-common/host_data-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-3.c: New test.
*

Re: [PATCH, PR68286] Fix vector comparison expand

2015-11-12 Thread Richard Biener

On Thu, Nov 12, 2015 at 10:57 AM, Ilya Enkovich  wrote:
> Hi,
>
> My vector comparison patches broke the expansion of vector comparisons on targets 
> which don't have the new comparison patterns but support VEC_COND_EXPR.  This 
> happens because it is not checked whether a vector comparison can be expanded as a 
> comparison.  This patch fixes it.  Bootstrapped and regtested on 
> powerpc64le-unknown-linux-gnu.  OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2015-11-12  Ilya Enkovich  
>
> * expr.c (do_store_flag): Expand vector comparison as
> VEC_COND_EXPR if vector comparison is not supported
> by target.
>
> gcc/testsuite/
>
> 2015-11-12  Ilya Enkovich  
>
> * gcc.dg/pr68286.c: New test.
>
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 03936ee..bd43dc4 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11128,7 +11128,8 @@ do_store_flag (sepops ops, rtx target, machine_mode 
> mode)
>if (TREE_CODE (ops->type) == VECTOR_TYPE)
>  {
>tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
> -  if (VECTOR_BOOLEAN_TYPE_P (ops->type))
> +  if (VECTOR_BOOLEAN_TYPE_P (ops->type)
> + && expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type))
> return expand_vec_cmp_expr (ops->type, ifexp, target);
>else
> {
> diff --git a/gcc/testsuite/gcc.dg/pr68286.c b/gcc/testsuite/gcc.dg/pr68286.c
> new file mode 100644
> index 000..d0392e8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr68286.c
> @@ -0,0 +1,17 @@
> +/* PR target/68286 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +int a, b, c;
> +int fn1 ()
> +{
> +  int d[] = {0};
> +  for (; c; c++)
> +{
> +  float e = c;
> +  if (e)
> +d[0]++;
> +}
> +  b = d[0];
> +  return a;
> +}

[Ada] Crash on inconsistent IF-expression

2015-11-12 Thread Arnaud Charlet

This change makes sure the compiler produces a proper error (rather
than crashing) when compiling an (illegal) IF-expression where the
THEN-expression is overloaded and none of its interpretations is
compatible with the ELSE-expression.

The following compilation must display:

$ gcc -c badelse.adb
badelse.adb:4:50: type incompatible with that of "then" expression

package Badelse is
   type K is (Unknown, Blue, Red);
   type Tristate is (False, True, Unknown);
   Boo : Boolean;
   procedure P (X : K);
end Badelse;
package body Badelse is
   procedure P (X : K) is
   begin
  Boo := (if X = Unknown then Unknown else X = Blue);
   end P;
end Badelse;

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Thomas Quinot  

* sem_ch4.adb (Analyze_If_Expression): Reject IF-expression where
THEN-expression is overloaded and none of its interpretations is
compatible with the ELSE-expression.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 230239)
+++ sem_ch4.adb (working copy)
@@ -2191,6 +2191,17 @@
 
Get_Next_Interp (I, It);
 end loop;
+
+--  If no valid interpretation has been found, then the type of
+--  the ELSE expression does not match any interpretation of
+--  the THEN expression.
+
+if Etype (N) = Any_Type then
+   Error_Msg_N
+ ("type incompatible with that of `THEN` expression",
+  Else_Expr);
+   return;
+end if;
  end;
   end if;
end Analyze_If_Expression;

Re: [PATCH] [ARM/Aarch64] add initial Qualcomm support

2015-11-12 Thread James Greenhalgh

On Wed, Nov 11, 2015 at 10:34:53AM -0800, Jim Wilson wrote:
> This adds an option for the Qualcomm server parts, qdf24xx, just
> optimizing like a cortex-a57 for now, same as how the initial Samsung
> exynos-m1 support worked.
> 
> This was tested with armv8 and aarch64 bootstraps and make check.
> 
> I had to disable the cortex-a57 fma steering pass in the aarch64 port
> while testing the patch.  A bootstrap for aarch64 configured
> --with-cpu=cortex-a57 gives multiple ICEs while building the stage1
> libstdc++.  The ICEs are in scan_rtx_reg at regrename.c:1074.  This
> looks vaguely similar to PR 66785.
> 
> I am also seeing extra make check failures due to ICEs with armv8
> bootstrap builds configured --with-cpu=cortex-a57,  I see ICEs in
> scan_rtx_reg in regrename, and ICEs in decompose_normal_address in
> rtlanal.c.  The arm port doesn't have the fma steering support, which
> seems odd, and is maybe a bug, so it isn't clear what is causing this
> problem.
> 
> I plan to look at these aarch64 and armv8 failures next, including PR
> 66785.  None of these have anything to do with my patch, as they
> trigger for cortex-a57 which is already supported.

The bootstrap bugs should be fixed on trunk as of:

  http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=230149

The AArch64 parts are OK, but the ARM parts look to be missing a hunk to
gcc/config/arm/t-aprofile (and I can't approve those anyway).

Thanks,
James


> Index: gcc/ChangeLog
> ===
> --- gcc/ChangeLog (revision 230118)
> +++ gcc/ChangeLog (working copy)
> @@ -1,3 +1,13 @@
> +2015-11-10  Jim Wilson  
> +
> + * config/aarch64/aarch64-cores.def (qdf24xx): New.
> + * config/aarch64/aarch64-tune.md: Regenerated.
> + * config/arm/arm-cores.def (qdf24xx): New.
> + * config/arm/arm-tables.opt, config/arm/arm-tune.md: Regenerated.
> + * config/arm/bpabi.h (BE8_LINK_SPEC): Add qdf24xx support.
> + * doc/invoke.texi (AArch64 Options/-mtune): Add "qdf24xx".
> + (ARM Options/-mtune); Likewise.

Re: [mask-vec_cond, patch 1/2] Support vectorization of VEC_COND_EXPR with no embedded comparison

2015-11-12 Thread Ilya Enkovich

2015-11-12 13:03 GMT+03:00 Ramana Radhakrishnan :
> On Thu, Oct 8, 2015 at 4:50 PM, Ilya Enkovich  wrote:
>> Hi,
>>
>> This patch allows COND_EXPR with no embedded comparison to be vectorized.
>>  It's applied on top of vectorized comparison support series.  New optab 
>> vcond_mask_optab
>> is introduced for such statements.  Bool patterns now avoid comparison in 
>> COND_EXPR in case vector comparison is supported by target.
>
> New standard pattern names are documented in the internals manual.
> This patch does not do so, nor do I see any patches that do.
>
>
> regards
> Ramana

Thanks for pointing this out.  I see we are also missing descriptions for
some other patterns (e.g. maskload).  Will add them.

Ilya

[visium] Remove obsolete prototypes

2015-11-12 Thread Eric Botcazou

Tested on visium-elf, applied on the mainline.


2015-11-12  Eric Botcazou  

* config/visium/visium-protos.h (notice_update_cc): Delete.
(print_operand): Likewise.
(print_operand_address): Likewise.

-- 
Eric Botcazou
Index: config/visium/visium-protos.h
===
--- config/visium/visium-protos.h	(revision 230204)
+++ config/visium/visium-protos.h	(working copy)
@@ -49,9 +49,6 @@ extern void visium_split_cbranch (enum r
 extern const char *output_ubranch (rtx, rtx_insn *);
 extern const char *output_cbranch (rtx, enum rtx_code, enum machine_mode, int,
    rtx_insn *);
-extern void notice_update_cc (rtx, rtx);
-extern void print_operand (FILE *, rtx, int);
-extern void print_operand_address (FILE *, rtx);
 extern void split_double_move (rtx *, enum machine_mode);
 extern void visium_expand_copysign (rtx *, enum machine_mode);
 extern void visium_expand_int_cstore (rtx *, enum machine_mode);

Re: [hsa 5/12] New HSA-related GCC options

2015-11-12 Thread Jakub Jelinek

On Mon, Nov 09, 2015 at 05:58:56PM +0100, Martin Jambor wrote:
> > But I don't see any way to disable it on the command line?  (no switch?)
> 
> No, the switch is -foffload, which has missing documentation (PR
> 67300) and is only described at https://gcc.gnu.org/wiki/Offloading
> Nevertheless, the option allows the user to specify compiler option
> -foffload=disable and no offloading should happen, not even HSA.  The
> user can also enumerate just the offload targets they want (and pass
> them special command line stuff).
> 
> It seems I have misplaced a hunk in the patch series.  Nevertheless,
> in the first patch (with configuration stuff), there is a change to
> opts.c which scans the -foffload= contents and sets the flag variable
> if hsa is not present.
> 
> Whenever the compiler has to decide whether HSA is enabled for the
> given compilation or not, it has to look at this variable (if
> configured for HSA).

But what is the difference between
-foffload=disable
or
-foffload={list not including hsa}
and the new param?  If you don't gridify, you don't emit any kernels...

Jakub

Re: [PATCH] Fix PR ipa/68035 (v2)

2015-11-12 Thread Richard Biener

On Thu, Nov 12, 2015 at 10:48 AM, Martin Liška  wrote:
> Hello.
>
> I'm sending reworked version of the patch, where I renamed 'sem_item::hash' 
> to 'm_hash'
> and wrapped all usages with 'get_hash'. Apart from that, a new member 
> function 'set_hash'
> is utilized for changing the hash value. Hope it's easier for understanding.
>
> Patch can survive regression tests and bootstraps on x86_64-linux-pc.
>
> Ready for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Martin

Re: [PATCH 04/N] Fix big memory leak in ix86_valid_target_attribute_p

2015-11-12 Thread Bernd Schmidt


On 11/12/2015 12:29 PM, Richard Biener wrote:

+static bool opts_obstack_initialized = false;
+
+/* Initialize opts_obstack if not initialized.  */
+
+void
+init_opts_obstack (void)
+{
+  if (!opts_obstack_initialized)
+{
+  opts_obstack_initialized = true;
+  gcc_obstack_init (&opts_obstack);

you can move the static global to function scope.


Also, why bother with it? Why not simply arrange to call the function 
just once at startup?


It's not clear from the submission why this is done and how it relates 
to the i386.c hunk.
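
Richard's suggestion of moving the flag to function scope can be sketched
as follows. The toy_ names are hypothetical and a counter stands in for
the real obstack, so this only shows the shape of the guard, not the
actual opts code.

```cpp
// Toy stand-in for the obstack set-up; only counts how often it ran.
static int toy_init_count = 0;

static void toy_obstack_init ()
{
  ++toy_init_count;
}

// Initialize the toy obstack on the first call only.  The guard flag
// lives at function scope instead of being a file-scope static.
void toy_init_opts_obstack ()
{
  static bool initialized = false;
  if (!initialized)
    {
      initialized = true;
      toy_obstack_init ();
    }
}
```

Calling the function from several places is then harmless, although a
single call at startup would make the flag unnecessary altogether.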



Bernd

[Ada] Legality checks on calls to a Generic_Dispatching_Constructor.

2015-11-12 Thread Arnaud Charlet

This patch adds several legality checks on calls to an instance of the
predefined Generic_Dispatching_Constructor. The following three tests are
performed:

a) The tag argument is defined, i.e. is not No_Tag.

b) The tag is not that of an abstract type.

c) The accessibility level of the type denoted by the tag is no greater than
that of the specified constructor function.
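
The three checks listed above can be modelled roughly as below. This is a
toy sketch: ToyTag, ToyTagError and toy_check_tag are hypothetical names,
the fields only mirror the three properties being tested, and a C++
exception stands in for raising Tag_Error.

```cpp
#include <stdexcept>

// Toy model of a tag (illustration only).
struct ToyTag
{
  bool is_no_tag;     // corresponds to the tag being No_Tag
  bool is_abstract;   // tag of an abstract type
  int access_level;   // accessibility level of the type denoted by the tag
};

// Stand-in for Tag_Error.
struct ToyTagError : std::runtime_error
{
  explicit ToyTagError (const char *why) : std::runtime_error (why) {}
};

// Perform checks a), b) and c) in order; throw on the first failure.
void toy_check_tag (const ToyTag &tag, int constructor_level)
{
  if (tag.is_no_tag)
    throw ToyTagError ("tag is not defined");
  if (tag.is_abstract)
    throw ToyTagError ("tag denotes an abstract type");
  if (tag.access_level > constructor_level)
    throw ToyTagError ("type is deeper than the constructor function");
}
```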

Tested in ACATS 4.0H C390012.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Ed Schonberg  

* exp_intr.adb: Add legality checks on calls to a
Generic_Dispatching_Constructor: the given tag must be defined,
it cannot be the tag of an abstract type, and its accessibility
level must not be greater than that of the constructor.

Index: exp_intr.adb
===
--- exp_intr.adb(revision 230223)
+++ exp_intr.adb(working copy)
@@ -311,6 +311,31 @@
 
   Remove_Side_Effects (Tag_Arg);
 
+  --  Check that we have a proper tag
+
+  Insert_Action (N,
+Make_Implicit_If_Statement (N,
+  Condition   => Make_Op_Eq (Loc,
+Left_Opnd  => New_Copy_Tree (Tag_Arg),
+Right_Opnd => New_Occurrence_Of (RTE (RE_No_Tag), Loc)),
+
+  Then_Statements => New_List (
+Make_Raise_Statement (Loc,
+  New_Occurrence_Of (RTE (RE_Tag_Error), Loc);
+
+  --  Check that it is not the tag of an abstract type
+
+  Insert_Action (N,
+Make_Implicit_If_Statement (N,
+  Condition   => Make_Function_Call (Loc,
+ Name   =>
+   New_Occurrence_Of (RTE (RE_Type_Is_Abstract), Loc),
+ Parameter_Associations => New_List (New_Copy_Tree (Tag_Arg))),
+
+  Then_Statements => New_List (
+Make_Raise_Statement (Loc,
+  New_Occurrence_Of (RTE (RE_Tag_Error), Loc);
+
   --  The subprogram is the third actual in the instantiation, and is
   --  retrieved from the corresponding renaming declaration. However,
   --  freeze nodes may appear before, so we retrieve the declaration
@@ -324,6 +349,22 @@
   Act_Constr := Entity (Name (Act_Rename));
   Result_Typ := Class_Wide_Type (Etype (Act_Constr));
 
+  --  Check that the accessibility level of the tag is no deeper than that
+  --  of the constructor function.
+
+  Insert_Action (N,
+Make_Implicit_If_Statement (N,
+  Condition   =>
+Make_Op_Gt (Loc,
+  Left_Opnd  =>
+Build_Get_Access_Level (Loc, New_Copy_Tree (Tag_Arg)),
+  Right_Opnd =>
+Make_Integer_Literal (Loc, Scope_Depth (Act_Constr))),
+
+  Then_Statements => New_List (
+Make_Raise_Statement (Loc,
+  New_Occurrence_Of (RTE (RE_Tag_Error), Loc);
+
   if Is_Interface (Etype (Act_Constr)) then
 
  --  If the result type is not known to be a parent of Tag_Arg then we
@@ -390,7 +431,6 @@
   --  conversion of the call to the actual constructor.
 
   Rewrite (N, Convert_To (Result_Typ, Cnstr_Call));
-  Analyze_And_Resolve (N, Etype (Act_Constr));
 
   --  Do not generate a run-time check on the built object if tag
   --  checks are suppressed for the result type or tagged type expansion
@@ -458,6 +498,8 @@
  Make_Raise_Statement (Loc,
Name => New_Occurrence_Of (RTE (RE_Tag_Error), Loc);
   end if;
+
+  Analyze_And_Resolve (N, Etype (Act_Constr));
end Expand_Dispatching_Constructor_Call;
 
---
Index: rtsfind.ads
===
--- rtsfind.ads (revision 230223)
+++ rtsfind.ads (working copy)
@@ -640,6 +640,7 @@
  RE_Max_Predef_Prims,-- Ada.Tags
  RE_Needs_Finalization,  -- Ada.Tags
  RE_No_Dispatch_Table_Wrapper,   -- Ada.Tags
+ RE_No_Tag,  -- Ada.Tags
  RE_NDT_Prims_Ptr,   -- Ada.Tags
  RE_NDT_TSD, -- Ada.Tags
  RE_Num_Prims,   -- Ada.Tags
@@ -1871,6 +1872,7 @@
  RE_Max_Predef_Prims => Ada_Tags,
  RE_Needs_Finalization   => Ada_Tags,
  RE_No_Dispatch_Table_Wrapper=> Ada_Tags,
+ RE_No_Tag   => Ada_Tags,
  RE_NDT_Prims_Ptr=> Ada_Tags,
  RE_NDT_TSD  => Ada_Tags,
  RE_Num_Prims=> Ada_Tags,

[RFC] Remove first_pass_instance from pass_vrp

2015-11-12 Thread Tom de Vries


Hi,

[ See also related discussion at 
https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00452.html ]


this patch removes the usage of first_pass_instance from pass_vrp.

the patch:
- limits itself to pass_vrp, but my intention is to remove all
  usage of first_pass_instance
- lacks an update to gdbhooks.py

Modifying the pass behaviour depending on the instance number, as 
first_pass_instance does, breaks compositionality of the pass list. In 
other words, adding a pass instance in a pass list may change the 
behaviour of another instance of that pass in the pass list. Which 
obviously makes it harder to understand and change the pass list. [ I've 
filed this issue as PR68247 - Remove pass_first_instance ]


The solution is to make the difference in behaviour explicit in the pass 
list, and no longer change behaviour depending on instance number.


One obvious possible fix is to create a duplicate pass with a different 
name, say 'pass_vrp_warn_array_bounds':

...
  NEXT_PASS (pass_vrp_warn_array_bounds);
  ...
  NEXT_PASS (pass_vrp);
...

But, AFAIU that requires us to choose a different dump-file name for 
each pass. And choosing vrp1 and vrp2 as new dump-file names still means 
that -fdump-tree-vrp no longer works (which was mentioned as drawback 
here: https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00453.html ).


This patch instead makes pass creation parameterizable. So in the pass 
list, we use:

...
  NEXT_PASS_WITH_ARG (pass_vrp, true /* warn_array_bounds_p */);
  ...
  NEXT_PASS_WITH_ARG (pass_vrp, false /* warn_array_bounds_p */);
...

This approach gives us clarity in the pass list, similar to using a 
duplicate pass 'pass_vrp_warn_array_bounds'.


But it also means -fdump-tree-vrp still works as before.
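
For illustration only, here is a toy C++ sketch of the difference between
the two schemes. None of these names exist in GCC: toy_pass, toy_pass_list,
warns_by_instance, warns_by_arg and make_list are made up for this sketch,
and it does not show how the patch itself implements NEXT_PASS_WITH_ARG.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a pass object.  Hypothetical; this is not GCC's opt_pass.
struct toy_pass
{
  std::string name;          // shared dump-file base name, e.g. "vrp"
  int instance;              // 1-based instance number within the list
  bool warn_array_bounds_p;  // explicit per-instance parameter
};

// Toy pass list that numbers instances per pass name, in the spirit of the
// numbering done by gen-pass-instances.awk.
struct toy_pass_list
{
  std::vector<toy_pass> passes;
  std::map<std::string, int> counts;

  void next_pass_with_arg (const std::string &name, bool arg)
  {
    int num = ++counts[name];
    passes.push_back (toy_pass{name, num, arg});
  }
};

// Implicit scheme: behaviour is derived from the instance number.
bool warns_by_instance (const toy_pass &p) { return p.instance == 1; }

// Explicit scheme: behaviour is whatever the pass list asked for.
bool warns_by_arg (const toy_pass &p) { return p.warn_array_bounds_p; }

// Build the two-instance list from this mail, optionally with one more
// non-warning instance added in front of it.
toy_pass_list make_list (bool extra_instance_first)
{
  toy_pass_list l;
  if (extra_instance_first)
    l.next_pass_with_arg ("vrp", false);
  l.next_pass_with_arg ("vrp", true);
  l.next_pass_with_arg ("vrp", false);
  return l;
}
```

With make_list (true), the instance-number test starts warning in the added
instance and stops warning in the intended one, while the explicit argument
is unaffected. All three instances still share the name "vrp".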

Good idea? Other comments?

Thanks,
- Tom
Remove first_pass_instance from pass_vrp

---
 gcc/gen-pass-instances.awk | 32 ++--
 gcc/pass_manager.h |  2 ++
 gcc/passes.c   | 20 
 gcc/passes.def |  4 ++--
 gcc/tree-pass.h|  3 ++-
 gcc/tree-vrp.c | 22 --
 6 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/gcc/gen-pass-instances.awk b/gcc/gen-pass-instances.awk
index cbfaa86..c77bd64 100644
--- a/gcc/gen-pass-instances.awk
+++ b/gcc/gen-pass-instances.awk
@@ -43,7 +43,7 @@ function handle_line()
 	line = $0;
 
 	# Find call expression.
-	call_starts_at = match(line, /NEXT_PASS \(.+\)/);
+	call_starts_at = match(line, /NEXT_PASS(_WITH_ARG)? \(.+\)/);
 	if (call_starts_at == 0)
 	{
 		print line;
@@ -53,23 +53,28 @@ function handle_line()
 	# Length of the call expression.
 	len_of_call = RLENGTH;
 
-	len_of_start = length("NEXT_PASS (");
 	len_of_open = length("(");
 	len_of_close = length(")");
 
-	# Find pass_name argument
-	len_of_pass_name = len_of_call - (len_of_start + len_of_close);
-	pass_starts_at = call_starts_at + len_of_start;
-	pass_name = substr(line, pass_starts_at, len_of_pass_name);
-
 	# Find call expression prefix (until and including called function)
-	prefix_len = pass_starts_at - 1 - len_of_open;
-	prefix = substr(line, 1, prefix_len);
+	match(line, /NEXT_PASS(_WITH_ARG)? /)
+	len_of_call_name = RLENGTH
+	prefix_len = call_starts_at + len_of_call_name - 1
+	prefix = substr(line, 1, prefix_len)
 
 	# Find call expression postfix
 	postfix_starts_at = call_starts_at + len_of_call;
 	postfix = substr(line, postfix_starts_at);
 
+	args_starts_at = prefix_len + 1 + len_of_open;
+	len_of_args = postfix_starts_at - args_starts_at - len_of_close;
+	args_str = substr(line, args_starts_at, len_of_args);
+	split(args_str, args, ",");
+
+	# Find pass_name argument, an optional with_arg argument
+	pass_name = args[1];
+	with_arg = args[2];
+
 	# Set pass_counts
 	if (pass_name in pass_counts)
 		pass_counts[pass_name]++;
@@ -79,7 +84,14 @@ function handle_line()
 	pass_num = pass_counts[pass_name];
 
 	# Print call expression with extra pass_num argument
-	printf "%s(%s, %s)%s\n", prefix, pass_name, pass_num, postfix;
+	printf "%s(", prefix;
+	printf "%s", pass_name;
+	printf ", %s", pass_num;
+	if (with_arg)
+	{
+		printf ", %s", with_arg;
+	}
+	printf ")%s\n", postfix;
 }
 
 { handle_line() }
diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 7d539e4..a8199e2 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -120,6 +120,7 @@ private:
 #define PUSH_INSERT_PASSES_WITHIN(PASS)
 #define POP_INSERT_PASSES()
 #define NEXT_PASS(PASS, NUM) opt_pass *PASS ## _ ## NUM
+#define NEXT_PASS_WITH_ARG(PASS, NUM, ARG) NEXT_PASS (PASS, NUM)
 #define TERMINATE_PASS_LIST()
 
 #include "pass-instances.def"
@@ -128,6 +129,7 @@ private:
 #undef PUSH_INSERT_PASSES_WITHIN
 #undef POP_INSERT_PASSES
 #undef NEXT_PASS
+#undef NEXT_PASS_WITH_ARG
 #undef TERMINATE_PASS_LIST
 
 }; // class pass_manager
diff --git a/gcc/passes.c b/gcc/passes.c
index dd8d00a..0fd365e 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -81,6 +81,12 @@ opt_pass::clone ()
   internal_error

[Ada] Obscure messages due to freezing of contracts

2015-11-12 Thread Arnaud Charlet

This patch classifies a misplaced constituent as a critical error and stops the
compilation. This ensures that the missing link between a constituent and state
will not cause obscure cascaded errors.


-- Source --


--  pack.ads

package Pack
   with Spark_Mode => On,
Abstract_State => Top_State,
Initializes=> Top_State
is
   procedure Do_Something (Value   : in out Natural;
   Success :out Boolean)
   with Global  => (In_Out => Top_State),
Depends => (Value =>+ Top_State,
Success   => (Value, Top_State),
Top_State =>+ Value);
end Pack;

--  pack.adb

package body Pack
   with SPARK_Mode=> On,
Refined_State => (Top_State => (Count, A_Pack.State))
is
   package A_Pack
  with Abstract_State => State,
   Initializes=> State
   is
  procedure A_Proc (Test : in out Natural)
 with Global   => (In_Out =>  State),
  Depends  => (Test   =>+ State,
   State  =>+ Test);
   end A_Pack;

   package body A_Pack
  with Refined_State => (State => Total)
   is
  Total : Natural := 0;

  procedure A_Proc (Test : in out Natural)
 with Refined_Global  => (In_Out => Total),
  Refined_Depends => ((Test  =>+ Total,
   Total =>+ Test)) is
  begin
 if Total > Natural'Last - Test   then
Total := abs (Total - Test);
 else
Total := Total + Test;
 end if;
 Test := Total;
  end A_Proc;
   end A_Pack;

   Count : Natural := 0;

   procedure Do_Something (Value   : in out Natural;
   Success :out Boolean)
  with Refined_Global  => (In_Out  =>  (Count, A_Pack.State)),
   Refined_Depends => (Value=>+ (Count, A_Pack.State),
   Success  =>  (Value, Count, A_Pack.State),
   Count=>+ null,
   A_Pack.State =>+ (Count, Value)) is
   begin
  Count := Count rem 128;
  if Count <= Value then
 Value := Count + (Value - Count) / 2;
  else
 Value := Value + (Count - Value) / 2;
  end if;
  A_Pack.A_Proc (Value);
  Success := Value /= 0;
   end Do_Something;
end Pack;


-- Compilation and output --


$ gcc -c pack.adb
pack.adb:3:09: body "A_Pack" declared at line 15 freezes the contract of "Pack"
pack.adb:3:09: all constituents must be declared before body at line 15
pack.adb:3:41: "Count" is undefined
compilation abandoned due to previous error

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Hristian Kirtchev  

* sem_prag.adb (Analyze_Constituent): Stop the
analysis after detecting a misplaced constituent as this is a
critical error.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 230236)
+++ sem_prag.adb(working copy)
@@ -25408,6 +25408,14 @@
 SPARK_Msg_N
   ("\all constituents must be declared before body #",
N);
+
+--  A misplaced constituent is a critical error because
+--  pragma Refined_Depends or Refined_Global depends on
+--  the proper link between a state and a constituent.
+--  Stop the compilation, as this leads to a multitude
+--  of misleading cascaded errors.
+
+raise Program_Error;
  end if;
 
   --  The constituent is a valid state or object

Re: State of support for the ISO C++ Transactional Memory TS and remaining work

2015-11-12 Thread Torvald Riegel

On Wed, 2015-11-11 at 15:04 +, Szabolcs Nagy wrote:
> On 10/11/15 18:29, Torvald Riegel wrote:
> > On Tue, 2015-11-10 at 17:26 +, Szabolcs Nagy wrote:
> >> On 09/11/15 00:19, Torvald Riegel wrote:
> >>> I've not yet created tests for the full list of functions specified as
> >>> transaction-safe in the TS, but my understanding is that this list was
> >>> created after someone from the ISO C++ TM study group looked at
> >>> libstdc++'s implementation and investigated which functions might be
> >>> feasible to be declared transaction-safe in it.
> >>>
> >>
> >> is that list available somewhere?
> >
> > See the TM TS, N4514.
> >
> 
> i was looking at an older version,
> things make more sense now.
> 
> i think system() should not be transaction safe..
> 
> i wonder what's the plan for getting libc functions
> instrumented (i assume that is needed unless hw
> support is used).

No specific plans so far.  We'll wait and see, I guess.  TM is still in
a chicken-and-egg situation.

> >> xmalloc
> >> the runtime exits on memory allocation failure,
> >> so it is not possible to use it safely.
> >> (i think it should be possible to roll back the
> >> transaction in case of internal allocation failure
> >> and retry with a strategy that does not need dynamic
> >> allocation).
> >
> > Not sure what you mean by "safely".  Hardening against out-of-memory
> > situations hasn't been considered to be of high priority so far, but I'd
> > accept patches for that that don't increase complexity significantly and
> > don't hamper performance.
> >
> 
> i consider this a library safety issue.
> 
> (a library or runtime is not safe to use if it may terminate
> the process in case of internal failures.)

If it is truly a purely internal failure, then aborting might be the
best thing one can do if there is no sensible way to try to recover from
the error (ie, take a fail-fast approach).
Out-of-memory errors are not purely internal failures.  I agree that it
would be nice to have a fallback, but for some features there simply is
none (eg, the program can't require rollback to be possible and yet not
provide enough memory for this to be achievable).  Given that
transactions have to be usable from C programs too, there's not much
libitm can do except perhaps call user-supplied handlers.

> >> uint64_t GTM::gtm_spin_count_var = 1000;
> >> i guess this was supposed to be tunable.
> >> it seems libitm needs some knobs (strategy, retries,
> >> spin counts), but there is no easy way to adapt these
> >> for a target/runtime environment.
> >
> > Sure, more performance tuning knobs would be nice.
> >
> 
> my problem was with getting the knobs right at runtime.
> 
> (i think this will need a solution to make tm practically
> useful, there are settings that seem to be sensitive to
> the properties of the underlying hw.. this also seems
> to be a problem for glibc lock elision retry policies.)

Yes, that applies to many tuning settings in lots of places.  And
certainly to TM implementations too :)

> >> sys_futex0
> >> i'm not sure why this has arch specific implementations
> >> for some targets but not others. (syscall is not in the
> >> implementation reserved namespace).
> >
> > Are there archs that support libitm but don't have a definition of this
> > one?
> >
> 
> i thought all targets were supported on linux
> (the global lock based strategies should work)
> i can prepare a sys_futex0 for arm and aarch64.

arm and aarch64 should be supported according to configure.tgt.  Also
see the comment in config/linux/futex_bits.h if you want to change
something there.  I haven't tried arm at all so far.

Re: [Patch] Optimize condition reductions where the result is an integer induction variable

2015-11-12 Thread Richard Biener

On Wed, Nov 11, 2015 at 7:54 PM, Alan Hayward  wrote:
>
>
> On 11/11/2015 13:25, "Richard Biener"  wrote:
>
>>On Wed, Nov 11, 2015 at 1:22 PM, Alan Hayward 
>>wrote:
>>> Hi,
>>> I hoped to post this in time for Monday’s cut off date, but
>>> circumstances delayed me until today. Hoping if possible this patch will
>>> still be able to go in.
>>>
>>>
>>> This patch builds upon the change for PR65947, and reduces the amount of
>>> code produced in a vectorized condition reduction where operand 2 of the
>>> COND_EXPR is an assignment of an increasing integer induction variable
>>> that won't wrap.
>>>
>>>
>>> For example (assuming all types are ints), this is a match:
>>>
>>> last = 5;
>>> for (i = 0; i < N; i++)
>>>   if (a[i] < min_v)
>>> last = i;
>>>
>>> Whereas, this is not because the result is based off a memory access:
>>> last = 5;
>>> for (i = 0; i < N; i++)
>>>   if (a[i] < min_v)
>>> last = a[i];
>>>
>>> In the integer induction variable case we can just use a MAX reduction
>>> and skip all the code I added in my vectorized condition reduction patch -
>>> the additional induction variables in vectorizable_reduction () and the
>>> additional checks in vect_create_epilog_for_reduction (). From the patch
>>> diff only, it's not immediately obvious that those parts will be skipped
>>> as there are no code changes in those areas.
>>>
>>> The initial value of the induction variable is forced to zero, as any
>>> other value could affect the result of the induction. At the end of the
>>> loop, if the result is zero, then we restore the original initial value.
>>
>>+static bool
>>+is_integer_induction (gimple *stmt, struct loop *loop)
>>
>>is_nonwrapping_integer_induction?
>>
>>+  tree lhs_max = TYPE_MAX_VALUE (TREE_TYPE (gimple_phi_result (stmt)));
>>
>>don't use TYPE_MAX_VALUE.
>>
>>+  /* Check that the induction increments.  */
>>+  if (tree_int_cst_compare (step, size_zero_node) <= 0)
>>+return false;
>>
>>tree_int_cst_sgn (step) == -1
>>
>>+  /* Check that the max size of the loop will not wrap.  */
>>+
>>+  if (! max_loop_iterations (loop, &ni))
>>+return false;
>>+  /* Convert backedges to iterations.  */
>>+  ni += 1;
>>
>>just use max_stmt_executions (loop, &ni) which properly checks for
>>overflow of the +1.
>>
>>+  max_loop_value = wi::add (wi::to_widest (base),
>>+   wi::mul (wi::to_widest (step), ni));
>>+
>>+  if (wi::gtu_p (max_loop_value, wi::to_widest (lhs_max)))
>>+return false;
>>
>>you miss a check for the wi::add / wi::mul to overflow.  You can use
>>extra args to determine this.
>>
>>Instead of TYPE_MAX_VALUE use wi::max_value (precision, sign).
>>
>>I wonder if you want to skip all the overflow checks for
>>TYPE_OVERFLOW_UNDEFINED IV types?
>>
>
> Ok with all the above.
>
> Tried using max_value () but this gave me a wide_int instead of a
> widest_int.
> Instead I've replaced with min_precision and GET_MODE_BITSIZE.
>
> Added an extra check for when the type is TYPE_OVERFLOW_UNDEFINED.
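
As a scalar illustration of the MAX reduction idea quoted above, the sketch
below compares the original loop with a max-based form. It is not the
vectorizer's code and the function names are made up. One deliberate
difference from the description: the quoted text uses zero as the neutral
value, whereas this sketch assumes -1, so that a match at index 0 stays
distinguishable from "no match".

```cpp
#include <algorithm>
#include <vector>

// Reference form: remember the last index whose element is below min_v.
int last_index_reference (const std::vector<int> &a, int min_v, int initial)
{
  int last = initial;
  for (int i = 0; i < static_cast<int> (a.size ()); i++)
    if (a[i] < min_v)
      last = i;
  return last;
}

// MAX-reduction form: because i only increases, the last matching index is
// also the largest matching index, so a running maximum is enough.
int last_index_max_reduction (const std::vector<int> &a, int min_v,
                              int initial)
{
  const int no_match = -1;  // Assumption of this sketch, not from the patch.
  int best = no_match;
  for (int i = 0; i < static_cast<int> (a.size ()); i++)
    best = std::max (best, a[i] < min_v ? i : no_match);

  // Restore the original initial value when nothing matched.
  return best == no_match ? initial : best;
}
```

Both functions should agree for any input, including an empty array and a
match only at index 0.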

+ /* Set the loop-entry arg of the reduction-phi.  */
+
+ if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
+   == INTEGER_INDUC_COND_REDUCTION)

extra vertical space

+ tree zero = build_int_cst ( TREE_TYPE (vec_init_def_type), 0);
+ tree zero_vec = build_vector_from_val (vec_init_def_type, zero);
+

build_zero_cst (vec_init_def_type);

+ else
+   {
+ add_phi_arg (as_a <gphi *> (phi), vec_init_def,
   loop_preheader_edge (loop), UNKNOWN_LOCATION);
+   }

no {}s around single stmts

+ tree comparez = build2 (EQ_EXPR, boolean_type_node, new_temp, zero);

please no l33t speech

+ tmp = build3 (COND_EXPR, scalar_type, comparez, initial_def,
+   new_temp);
+
+ epilog_stmt = gimple_build_assign (new_scalar_dest, tmp);
+ new_temp = make_ssa_name (new_scalar_dest, epilog_stmt);
+ gimple_assign_set_lhs (epilog_stmt, new_temp);

epilog_stmt = gimple_build_assign (make_ssa_name (new_scalar_dest),
COND_EXPR,
compare, initial_def, new_temp);


+  /* Check that the max size of the loop will not wrap.  */
+
+  if (TYPE_OVERFLOW_UNDEFINED (lhs_type))
+{
+  return (GET_MODE_BITSIZE (TYPE_MODE (lhs_type))
+ >= GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (base))));

this mode check will always be true as lhs_type and base are from the
same PHI node.

+  return (wi::min_precision (max_loop_value, TYPE_SIGN (lhs_type))
+ <= GET_MODE_BITSIZE (TYPE_MODE (lhs_type)));

please use TYPE_PRECISION (lhs_type) instead.
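
To make the intent of the non-wrapping check concrete, here is a rough
sketch on plain 64-bit unsigned integers. It is only an illustration:
induction_fits_p and its parameters are hypothetical names, and the real
code would use widest_int with the overflow arguments of wi::add / wi::mul
plus TYPE_PRECISION, not this hand-written arithmetic.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not GCC code: return true when base + step * i stays
// representable in an unsigned type of PRECISION bits for all i in
// [0, niters].  Since step is unsigned, checking i == niters is sufficient.
bool induction_fits_p (std::uint64_t base, std::uint64_t step,
                       std::uint64_t niters, unsigned precision)
{
  const std::uint64_t max64 = std::numeric_limits<std::uint64_t>::max ();
  if (precision == 0 || precision > 64)
    return false;

  // Largest value representable in PRECISION bits; avoid shifting by 64.
  const std::uint64_t type_max
    = precision == 64 ? max64 : (std::uint64_t{1} << precision) - 1;

  // step * niters must not overflow the 64-bit working type.
  if (step != 0 && niters > max64 / step)
    return false;
  const std::uint64_t span = step * niters;

  // base + span must not overflow the working type either.
  if (span > max64 - base)
    return false;

  return base + span <= type_max;
}
```

For example, with an 8-bit type an induction starting at 0 with step 1 fits
for 255 iterations but not for 256.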

Ok with those changes.

Thanks,
Richard.

>
>
> Alan.
>

Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2015-11-12 Thread Richard Biener

On Wed, Nov 11, 2015 at 9:38 PM, Jeff Law  wrote:
> On 09/04/2015 11:36 AM, Ajit Kumar Agarwal wrote:
>
>>> diff --git a/gcc/passes.def b/gcc/passes.def
>>> index 6b66f8f..20ddf3d 100644
>>> --- a/gcc/passes.def
>>> +++ b/gcc/passes.def
>>> @@ -82,6 +82,7 @@ along with GCC; see the file COPYING3.  If not see
>>>   NEXT_PASS (pass_ccp);
>>>   /* After CCP we rewrite no longer addressed locals into SSA
>>>  form if possible.  */
>>> +  NEXT_PASS (pass_path_split);
>>>   NEXT_PASS (pass_forwprop);
>>>   NEXT_PASS (pass_sra_early);
>>
>> I can't recall if we've discussed the location of the pass at all.  I'm
>> not objecting to this location, but would like to hear why you chose
>> this particular location in the optimization pipeline.
>
> So returning to the question of where this would live in the optimization
> pipeline and how it interacts with if-conversion and vectorization.

Note that adding passes to the early pipeline that do code duplication
is a no-no.  The early pipeline should be exclusively for things making
functions more suitable for inlining.

> The concern with moving it to late in the pipeline was that we'd miss
> VRP/DCE/CSE opportunities.  I'm not sure if you're aware, but we actually
> run those passes more than once.  So it would be possible to run path
> splitting after if-conversion & vectorization, but before the second pass
> of VRP & DOM.  But trying that seems to result in something scrambling the
> loop enough that the path splitting opportunity is missed.  That might be
> worth deeper investigation if we can't come up with some kind of heuristics
> to fire or suppress path splitting.

As I still think it is a transform similar to tracer, just put it next to that.

But IIRC you mentioned it should enable vectorization or so?  In this case
that's obviously too late.

Richard.

> Other random notes as I look over the code:
>
> Call the pass "path-split", not "path_split".  I don't think we have any
> passes with underscores in their names, dump files, etc.
>
> You factored out the code for transform_duplicate.  When you create new
> functions, they should all have a block comment indicating what they do,
> return values, etc.
>
> I asked you to trim down the #includes in tree-ssa-path-split.c  Most were
> ultimately unnecessary.  The trimmed list is just 11 headers.
>
> Various functions in tree-ssa-path-split.c were missing their block
> comments.  There were several places in tree-ssa-path-split that I felt
> deserved a comment.  While you are familiar with the code, it's likely
> someone else will have to look at and modify this code at some point in the
> future.  The comments help make that easier.
>
> In find_trace_loop_latch_same_as_join_blk, we find the immediate dominator
> of the latch and verify it ends in a conditional.  That's fine.  Then we
> look at the predecessors of the latch to see if one is succeeded only by the
> latch and falls through to the latch.  That is the block we'll end up
> redirecting to a copy of the latch.  Also fine.
>
> Note how there is no testing for the relationship between the immediate
> dominator of the latch and the predecessors of the latch.  ISTM that we can
> have a fairly arbitrary region in the THEN/ELSE arms of the conditional.
> Was this intentional?  Would it be advisable to verify that the THEN/ELSE
> arms are single blocks?  Do we want to verify that neither the THEN/ELSE
> arms transfer control other than to the latch?  Do we want to verify the
> predecessors of the latch are immediate successors of the latch's immediate
> dominator?
>
> The is_feasible_trace routine was still checking if the block had a
> conversion and rejecting it.  I removed that check.  It does seem to me that
> we need an upper limit on the number of statements.  I wonder if we should
> factor out the maximum statements to copy code from jump threading and use
> it for both jump threading and path splitting.
>
> Instead of creating loop with multiple latches, what ever happened to the
> idea of duplicating the latch block twice -- once into each path. Remove the
> control statement in each duplicate.  Then remove everything but the control
> statement in the original latch.
>
>
> I added some direct dump support.  Essentially anytime we split the path, we
> output something like this:
>
> Split path in loop: latch block 9, predecessor 7.
>
> That allows tests in the testsuite to look for the "Split path in loop"
> string rather than inferring the information from the SSA graph update's
> replacement table.  It also allows us to do things like count how many paths
> get split if we have more complex tests.
>
> On the topic of tests.  Is the one you provided something where path
> splitting results in a significant improvement?  From looking at the x86_64
> output, I can see the path splitting transformation occur, but not any
> improvement in the final code.
>
> While the existing test is

Re: [RFC] Remove first_pass_instance from pass_vrp

2015-11-12 Thread Richard Biener

On Thu, Nov 12, 2015 at 12:37 PM, Tom de Vries  wrote:
> Hi,
>
> [ See also related discussion at
> https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00452.html ]
>
> this patch removes the usage of first_pass_instance from pass_vrp.
>
> the patch:
> - limits itself to pass_vrp, but my intention is to remove all
>   usage of first_pass_instance
> - lacks an update to gdbhooks.py
>
> Modifying the pass behaviour depending on the instance number, as
> first_pass_instance does, break compositionality of the pass list. In other
> words, adding a pass instance in a pass list may change the behaviour of
> another instance of that pass in the pass list. Which obviously makes it
> harder to understand and change the pass list. [ I've filed this issue as
> PR68247 - Remove pass_first_instance ]
>
> The solution is to make the difference in behaviour explicit in the pass
> list, and no longer change behaviour depending on instance number.
>
> One obvious possible fix is to create a duplicate pass with a different
> name, say 'pass_vrp_warn_array_bounds':
> ...
>   NEXT_PASS (pass_vrp_warn_array_bounds);
>   ...
>   NEXT_PASS (pass_vrp);
> ...
>
> But, AFAIU that requires us to choose a different dump-file name for each
> pass. And choosing vrp1 and vrp2 as new dump-file names still means that
> -fdump-tree-vrp no longer works (which was mentioned as drawback here:
> https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00453.html ).
>
> This patch instead makes pass creation parameterizable. So in the pass list,
> we use:
> ...
>   NEXT_PASS_WITH_ARG (pass_vrp, true /* warn_array_bounds_p */);
>   ...
>   NEXT_PASS_WITH_ARG (pass_vrp, false /* warn_array_bounds_p */);
> ...
>
> This approach gives us clarity in the pass list, similar to using a
> duplicate pass 'pass_vrp_warn_array_bounds'.
>
> But it also means -fdump-tree-vrp still works as before.
>
> Good idea? Other comments?

It's good to get rid of the first_pass_instance hack.

I can't comment on the AWK, leaving that to others.  Syntax-wise I'd hoped
we can just use NEXT_PASS with the extra argument being optional...

I don't see the need for giving clone_with_args a new name, just use an overload
of clone ()?  [ideally C++ would allow us to say that only one overload may be
implemented]

Thanks,
Richard.

> Thanks,
> - Tom

Re: [PATCH, PR tree-optimization/PR68305] Support masked COND_EXPR in SLP

2015-11-12 Thread Richard Biener

On Thu, Nov 12, 2015 at 1:03 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patch fixes the way an operand is chosen by its number for COND_EXPR.
> Bootstrapped and regtested on x86_64-unknown-linux-gnu.  OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2015-11-12  Ilya Enkovich  
>
> PR tree-optimization/68305
> * tree-vect-slp.c (vect_get_constant_vectors): Support
> COND_EXPR with SSA_NAME as a condition.
>
> gcc/testsuite/
>
> 2015-11-12  Ilya Enkovich  
>
> PR tree-optimization/68305
> * gcc.dg/vect/pr68305.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr68305.c 
> b/gcc/testsuite/gcc.dg/vect/pr68305.c
> new file mode 100644
> index 000..fde3db7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr68305.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +/* { dg-additional-options "-mavx2" { target avx_runtime } } */
> +
> +int a, b;
> +
> +void
> +fn1 ()
> +{
> +  int c, d;
> +  for (; b; b++)
> +a = a ^ !c ^ !d;
> +}
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 9d97140..9402474 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -2738,18 +2738,20 @@ vect_get_constant_vectors (tree op, slp_tree slp_node,
>   switch (code)
> {
>   case COND_EXPR:
> -   if (op_num == 0 || op_num == 1)
> - {
> -   tree cond = gimple_assign_rhs1 (stmt);
> +   {
> + tree cond = gimple_assign_rhs1 (stmt);
> + if (TREE_CODE (cond) == SSA_NAME)
> +   op = gimple_op (stmt, op_num + 1);
> + else if (op_num == 0 || op_num == 1)
> op = TREE_OPERAND (cond, op_num);
> - }
> -   else
> - {
> -   if (op_num == 2)
> - op = gimple_assign_rhs2 (stmt);
> -   else
> - op = gimple_assign_rhs3 (stmt);
> - }
> + else
> +   {
> + if (op_num == 2)
> +   op = gimple_assign_rhs2 (stmt);
> + else
> +   op = gimple_assign_rhs3 (stmt);
> +   }
> +   }
> break;
>
>   case CALL_EXPR:

[Ada] Warn when a non-imported constant overlays a constant

2015-11-12 Thread Arnaud Charlet

The compiler warns when a variable overlays a constant because of an address
clause on the former.  This change makes the compiler issue the same warning
when a non-imported constant overlays a constant.

The patch also removes an old pessimization whereby overlaid objects would
be treated as volatile by the compiler in some circumstances, for example
preventing them from being put into read-only memory if they are constant.

The compiler must issue the warning:

consovl3.adb:4:03: warning: constant "C" may be modified via address clause at
line 5

on the following code:

with Q; use Q;

procedure Consovl3 is
  A : constant Natural := 0;
  for A'Address use C'Address;
begin
  null;
end;
package Q is

  C : constant Natural := 1;

end Q;

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Eric Botcazou  

* einfo.ads (Overlays_Constant): Document usage for E_Constant.
* freeze.adb (Warn_Overlay): Small reformatting.
(Check_Address_Clause): Deal specifically with deferred
constants.  For a variable or a non-imported constant
overlaying a constant object and with initialization value,
either remove the initialization or issue a warning.  Fix a
couple of typos.
* sem_util.adb (Note_Possible_Modification): Overhaul the condition for
the warning on modified constants and use Find_Overlaid_Entity instead
of doing it manually.
* sem_ch13.adb (Analyze_Attribute_Definition_Clause): Compute and
set Overlays_Constant once on entry.  Do not treat the overlaid
entity as volatile.  Do not issue the warning on modified
constants here.
* gcc-interface/decl.c (gnat_to_gnu_entity) : Remove
over-restrictive condition for the special treatment of deferred
constants.
: Remove obsolete associated code.

Index: einfo.ads
===
--- einfo.ads   (revision 230223)
+++ einfo.ads   (working copy)
@@ -3638,8 +3638,9 @@
 -- Points to the component in the base type.
 
 --Overlays_Constant (Flag243)
---   Defined in all entities. Set only for a variable for which there is
---   an address clause which causes the variable to overlay a constant.
+--   Defined in all entities. Set only for E_Constant or E_Variable for
+--   which there is an address clause which causes the entity to overlay
+--   a constant object.
 
 --Overridden_Operation (Node26)
 --   Defined in subprograms. For overriding operations, points to the
Index: freeze.adb
===
--- freeze.adb  (revision 230223)
+++ freeze.adb  (working copy)
@@ -207,10 +207,7 @@
--  this to have a Freeze_Node, so ensure it doesn't. Do the same for any
--  Full_View or Corresponding_Record_Type.
 
-   procedure Warn_Overlay
- (Expr : Node_Id;
-  Typ  : Entity_Id;
-  Nam  : Node_Id);
+   procedure Warn_Overlay (Expr : Node_Id; Typ : Entity_Id; Nam : Node_Id);
--  Expr is the expression for an address clause for entity Nam whose type
--  is Typ. If Typ has a default initialization, and there is no explicit
--  initialization in the source declaration, check whether the address
@@ -598,16 +595,25 @@
--
 
procedure Check_Address_Clause (E : Entity_Id) is
-  Addr   : constant Node_Id:= Address_Clause (E);
+  Addr   : constant Node_Id   := Address_Clause (E);
+  Typ: constant Entity_Id := Etype (E);
+  Decl   : Node_Id;
   Expr   : Node_Id;
-  Decl   : constant Node_Id:= Declaration_Node (E);
-  Loc: constant Source_Ptr := Sloc (Decl);
-  Typ: constant Entity_Id  := Etype (E);
+  Init   : Node_Id;
   Lhs: Node_Id;
   Tag_Assign : Node_Id;
 
begin
   if Present (Addr) then
+
+ --  For a deferred constant, the initialization value is on full view
+
+ if Ekind (E) = E_Constant and then Present (Full_View (E)) then
+Decl := Declaration_Node (Full_View (E));
+ else
+Decl := Declaration_Node (E);
+ end if;
+
  Expr := Expression (Addr);
 
  if Needs_Constant_Address (Decl, Typ) then
@@ -656,29 +662,72 @@
 Warn_Overlay (Expr, Typ, Name (Addr));
  end if;
 
- if Present (Expression (Decl)) then
+ Init := Expression (Decl);
 
+ --  If a variable, or a non-imported constant, overlays a constant
+ --  object and has an initialization value, then the initialization
+ --  may end up writing into read-only memory. Detect the cases of
+ --  statically identical values and remove the initialization. In
+ --  the other cases, give a warning. We will give other warnings
+ --  later for the variable if it is assigned.
+
+ if (Ekind (E) =

[Ada] Contract_Cases on entries

2015-11-12 Thread Arnaud Charlet

This patch implements aspect/pragma Contract_Cases on entries.


-- Source --


--  tracker.ads

package Tracker is
   type Check_Kind is
 (Pre,
  Refined_Post,
  Post,
  Conseq_1,
  Conseq_2);

   type Tested_Array is array (Check_Kind) of Boolean;
   --  A value of "True" indicates that a check has been tested

   function Greater_Than
 (Kind : Check_Kind;
  Val  : Natural;
  Exp  : Natural) return Boolean;
   --  Determine whether value Val is greater than expected value Exp. The
   --  routine also updates the history for check of kind Kind. Duplicate
   --  attempts to modify the history are flagged as errors.

   procedure Reset;
   --  Reset the history

   procedure Verify (Exp : Tested_Array);
   --  Verify whether expected tests Exp were indeed checked. Emit an error if
   --  this is not the case.
end Tracker;

--  tracker.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Tracker is
   History : array (Check_Kind) of Boolean := (others => False);
   --  The history of performed checks. A value of "True" indicates that a
   --  check was performed.

   ------------------
   -- Greater_Than --
   ------------------

   function Greater_Than
 (Kind : Check_Kind;
  Val  : Natural;
  Exp  : Natural) return Boolean
   is
   begin
  if History (Kind) then
 Put_Line ("  ERROR: " & Kind'Img & " tested multiple times");
  else
 History (Kind) := True;
  end if;

  return Val > Exp;
   end Greater_Than;

   -----------
   -- Reset --
   -----------

   procedure Reset is
   begin
  History := (others => False);
   end Reset;

   ------------
   -- Verify --
   ------------

   procedure Verify (Exp : Tested_Array) is
   begin
  for Index in Check_Kind'Range loop
 if Exp (Index) and not History (Index) then
Put_Line ("  ERROR: " & Index'Img & " was not tested");
 elsif not Exp (Index) and History (Index) then
Put_Line ("  ERROR: " & Index'Img & " was tested");
 end if;
  end loop;
   end Verify;
end Tracker;

--  sync_contracts.ads

with Tracker; use Tracker;

package Sync_Contracts
  with SPARK_Mode,
       Abstract_State => State
is
   protected type Prot_Typ is
      entry Prot_Entry (Input : Natural; Output : out Natural)
        with Global  => (Input => State),
             Depends => ((Prot_Typ, Output) => (State, Prot_Typ, Input)),
             Pre  => Greater_Than (Pre,  Input,  1),
             Post => Greater_Than (Post, Output, 4),
             Contract_Cases =>
               (Input < 5 => True,
                Input = 5 => Greater_Than (Conseq_1, Output, 6),
                Input = 6 => Greater_Than (Conseq_2, Output, 7),
                Input > 6 => False);

      procedure Prot_Proc (Input : Natural; Output : out Natural)
        with Pre  => Greater_Than (Pre , Input,  1),
             Post => Greater_Than (Post, Output, 4),
             Contract_Cases =>
               (Input < 5 => True,
                Input = 5 => Greater_Than (Conseq_1, Output, 6),
                Input = 6 => Greater_Than (Conseq_2, Output, 7),
                Input > 6 => False);

      function Prot_Func (Input : Natural) return Natural
        with Pre  => Greater_Than (Pre , Input, 1),
             Post => Greater_Than (Post, Prot_Func'Result, 4),
             Contract_Cases =>
               (Input < 5 => True,
                Input = 5 => Greater_Than (Conseq_1, Prot_Func'Result, 6),
                Input = 6 => Greater_Than (Conseq_2, Prot_Func'Result, 7),
                Input > 6 => False);
   end Prot_Typ;

   task type Tsk_Typ is
      entry Tsk_Entry (Input : Natural; Output : out Natural)
        with Pre  => Greater_Than (Pre , Input,  1),
             Post => Greater_Than (Post, Output, 4),
             Contract_Cases =>
               (Input < 5 => True,
                Input = 5 => Greater_Than (Conseq_1, Output, 6),
                Input = 6 => Greater_Than (Conseq_2, Output, 7),
                Input > 6 => False);
   end Tsk_Typ;
end Sync_Contracts;

--  sync_contracts.adb

package body Sync_Contracts
  with SPARK_Mode,
       Refined_State => (State => Var)
is
   Var : Integer := 1;

   protected body Prot_Typ is
      entry Prot_Entry (Input : Natural; Output : out Natural)
        with Refined_Global  => (Input => Var),
             Refined_Depends => ((Prot_Typ, Output) => (Var, Prot_Typ, Input)),
             Refined_Post => Greater_Than (Refined_Post, Output, 3)
        when True
      is
      begin
         Output := Input + 1;
      end Prot_Entry;

      procedure Prot_Proc (Input : Natural; Output : out Natural)
        with Refined_Post => Greater_Than (Refined_Post, Output, 3)
      is
      begin
         Output := Input + 1;
      end Prot_Proc;

      function Prot_Func (Input : Natural) return Natural
        with Refined_Post => Greater_Than (Refined_Post, Prot_Func'Result, 3)
      is
      begin
         return Input +

Re: [hsa 4/12] OpenMP lowering/expansion changes (gridification)

2015-11-12 Thread Jakub Jelinek

On Thu, Nov 05, 2015 at 10:57:33PM +0100, Martin Jambor wrote:
> the patch in this email contains the changes to make our OpenMP
> lowering and expansion machinery produce GPU kernels for a certain
> limited class of loops.  The plan is to make that class quite a bit
> bigger, but only the following is ready for submission now.
> 
> Basically, whenever the compiler configured for HSAIL generation
> encounters the following pattern:
> 
>   #pragma omp target
>   #pragma omp teams thread_limit(workgroup_size) // thread_limit is optional
>   #pragma omp distribute parallel for firstprivate(n) private(i) 
> other_sharing_clauses()
> for (i = 0; i < n; i++)
>   some_loop_body

Do you support only lb 0 or any constant?  Only step 1?  Can the
b be constant, or just a variable?  If you need the number of iterations
computed before GOMP_target_ext, supposedly you also need to check that
n can't change in between target and the distribute (e.g. if it is
addressable or global var) and there are some statements in between.

What about schedule or dist_schedule clauses?  Only schedule(auto) or
missing schedule guarantees you can distribute the work among the
threads any way the compiler wants.
dist_schedule is always static, but could have different chunk_size.

The current int num_threads, int thread_limit GOMP_target_ext arguments
perhaps could be changed to something like int num_args, long *args,
where args[0] would be the current num_threads and args[1] current
thread_limit, and if any offloading target that might benefit from knowing
the number of iterations of distribute parallel for that is the only
important statement inside, you could perhaps pass it as args[2] and pass
3 instead of 2 to num_args.  That could be something kind of generic
rather than HSA specific, and extensible.  But, looking at your
kernel_launch structure, you want something like multiple dimensions and
compute each dimension separately rather than combine (collapse) all
dimensions together, which is what OpenMP expansion does right now.
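
A minimal sketch of the extensible argument array described above, in C.
The enum names and the helper target_arg_or_default are hypothetical and
exist only for illustration; they are not part of libgomp or of the
posted patch.  The idea is that a plugin reads a slot only when the
caller actually passed it, so older callers passing 2 entries keep
working:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot indices into the args array (illustrative names,
   not part of libgomp).  */
enum target_arg_index
{
  TARGET_ARG_NUM_THREADS = 0,
  TARGET_ARG_THREAD_LIMIT = 1,
  TARGET_ARG_NUM_ITERATIONS = 2
};

/* Return args[idx] if the caller supplied that slot, otherwise DFLT.
   NUM_ARGS is the number of valid entries in ARGS.  */
static long
target_arg_or_default (int num_args, const long *args, int idx, long dflt)
{
  assert (num_args == 0 || args != NULL);

  if (idx >= 0 && idx < num_args)
    return args[idx];

  return dflt;
}
```

A caller that knows the iteration count would pass 3 entries; a target
that does not care about it simply never reads slot 2.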

> While we have also been experimenting quite a bit with dynamic
> parallelism, we have only been able to achieve any good performance
> via this process of gridification.  The user can be notified whether a
> particular target construct was gridified or not via our process of
> dumping notes, which however only appear in the detailed dump.  I am
> seriously considering emitting some kind of warning, when HSA-enabled
> compiler is about to produce a non-gridified target code.

But then it would warn pretty much on all of libgomp testsuite with target
constructs in them...

> @@ -547,13 +548,13 @@ DEF_FUNCTION_TYPE_7 
> (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,

> --- a/gcc/fortran/types.def
> +++ b/gcc/fortran/types.def
> @@ -145,6 +145,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, 
> BT_VOLATILE_PTR, BT_I2, BT
>  DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, 
> BT_I4, BT_INT)
>  DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, 
> BT_I8, BT_INT)
>  DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, 
> BT_I16, BT_INT)
> +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_PTR, BT_VOID, BT_PTR, BT_INT, BT_PTR)
>  
>  DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT,
>   BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT)
> @@ -215,9 +216,9 @@ DEF_FUNCTION_TYPE_7 
> (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,
>  DEF_FUNCTION_TYPE_8 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_LONG_UINT,
>BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
>BT_LONG, BT_LONG, BT_LONG, BT_LONG, BT_UINT)
> -DEF_FUNCTION_TYPE_8 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR,
> +DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
>BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
> -  BT_PTR, BT_PTR, BT_UINT, BT_PTR)
> +  BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)

You'd need to move it if you add arguments (but as I said on the other
patch, this won't really apply on top of the trunk anyway).

> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -153,6 +153,7 @@ enum gf_mask {
>  GF_OMP_FOR_KIND_TASKLOOP = 2,
>  GF_OMP_FOR_KIND_CILKFOR = 3,
>  GF_OMP_FOR_KIND_OACC_LOOP= 4,
> +GF_OMP_FOR_KIND_KERNEL_BODY = 5,
>  /* Flag for SIMD variants of OMP_FOR kinds.  */
>  GF_OMP_FOR_SIMD  = 1 << 3,
>  GF_OMP_FOR_KIND_SIMD = GF_OMP_FOR_SIMD | 0,
> @@ -621,8 +622,24 @@ struct GTY((tag("GSS_OMP_FOR")))
>/* [ WORD 11 ]
>   Pre-body evaluated before the loop body begins.  */
>gimple_seq pre_body;
> +
> +  /* [ WORD 12 ]
> + If set, this statement is part of a gridified kernel, its clauses need 
> to
> + be scanned and lowered but the statement should be discarded after
> + lowering.  */
> +  bool kernel_phony;

A bool flag is better put as a GF_OMP_* flag, there are

Re: [PATCH 04/N] Fix big memory leak in ix86_valid_target_attribute_p

2015-11-12 Thread Richard Biener

On Thu, Nov 12, 2015 at 11:03 AM, Martin Liška  wrote:
> Hello.
>
> Following patch was a bit negotiated with Jakub and can save a huge amount of 
> memory in cases
> where target attributes are heavily utilized.
>
> Can bootstrap and survives regression tests on x86_64-linux-pc.
>
> Ready for trunk?

+static bool opts_obstack_initialized = false;
+
+/* Initialize opts_obstack if not initialized.  */
+
+void
+init_opts_obstack (void)
+{
+  if (!opts_obstack_initialized)
+{
+  opts_obstack_initialized = true;
+  gcc_obstack_init (&opts_obstack);

you can move the static global to function scope.

Ok with that change.
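
For illustration, a self-contained C sketch of the pattern being
suggested: the guard flag becomes a function-scope static instead of a
file-scope one.  The names init_once, do_init and init_calls are
stand-ins invented for this sketch; the real patch guards
gcc_obstack_init.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the real one-time setup; counts how often it runs.  */
static int init_calls = 0;

static void
do_init (void)
{
  /* The guard in init_once must ensure this runs at most once.  */
  assert (init_calls == 0);
  init_calls++;
}

/* Perform the setup on the first call only.  The flag is visible only
   inside this function, yet keeps its value across calls.  */
void
init_once (void)
{
  static bool initialized = false;

  if (!initialized)
    {
      initialized = true;
      do_init ();
    }
}
```

The behaviour is the same as with a file-scope flag; the only change is
that nothing else in the translation unit can touch the flag.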

Btw, don't other targets need a similar adjustment to their hook?
Grepping shows arm and nios2.

Thanks,
Richard.


> Thanks,
> Martin

[Ada] Missing detection of elaboration dependency

2015-11-12 Thread Arnaud Charlet

This patch modifies the elaboration circuitry to detect an issue in SPARK
where an object in package P of a private type in package T subject to
pragma Default_Initial_Condition is default initialized and package P
lacks Elaborate_All (T).


-- Source --


--  pack.ads

package Pack with SPARK_Mode is
   type Elab_Typ is private
     with Default_Initial_Condition => Get_Val (Elab_Typ) = Expect_Val;

   type False_Typ is private
     with Default_Initial_Condition => False;

   type True_Typ is private
     with Default_Initial_Condition => True;

   function Expect_Val return Integer;
   function Get_Val (Obj : Elab_Typ) return Integer;

private
   type Elab_Typ is record
      Comp : Integer;
   end record;

   type False_Typ is null record;
   type True_Typ is null record;
end Pack;

--  pack.adb

package body Pack with SPARK_Mode is
   function Expect_Val return Integer is
   begin
      return 1234;
   end Expect_Val;

   function Get_Val (Obj : Elab_Typ) return Integer is
   begin
      return Obj.Comp;
   end Get_Val;
end Pack;

--  main_pack.ads

with Pack; use Pack;

package Main_Pack with SPARK_Mode is
   Obj_1 : Elab_Typ;
   Obj_2 : False_Typ;
   Obj_3 : True_Typ;
end Main_Pack;


-- Compilation and output --


$ gcc -c -gnata main_pack.ads
main_pack.ads:4:04: call to Default_Initial_Condition during elaboration in
  SPARK
main_pack.ads:4:04: Elaborate_All pragma required for "Pack"

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Hristian Kirtchev  

* sem_elab.adb (Check_A_Call): Add new variable
Is_DIC_Proc. Report elaboration issue in SPARK concerning calls
to source subprograms or nontrivial Default_Initial_Condition
procedures. Add specialized error message to avoid outputting
the internal name of the Default_Initial_Condition procedure.
* sem_util.ads, sem_util.adb
(Is_Non_Trivial_Default_Init_Cond_Procedure): New routine.

Index: sem_util.adb
===
--- sem_util.adb(revision 230235)
+++ sem_util.adb(working copy)
@@ -12362,12 +12362,50 @@
   end if;
end Is_Local_Variable_Reference;
 
+   
+   -- Is_Non_Trivial_Default_Init_Cond_Procedure --
+   
+
+   function Is_Non_Trivial_Default_Init_Cond_Procedure
+ (Id : Entity_Id) return Boolean
+   is
+  Body_Decl : Node_Id;
+  Stmt : Node_Id;
+
+   begin
+  if Ekind (Id) = E_Procedure
+and then Is_Default_Init_Cond_Procedure (Id)
+  then
+ Body_Decl :=
+   Unit_Declaration_Node
+ (Corresponding_Body (Unit_Declaration_Node (Id)));
+
+ --  The body of the Default_Initial_Condition procedure must contain
+ --  at least one statement, otherwise the generation of the subprogram
+ --  body failed.
+
+ pragma Assert (Present (Handled_Statement_Sequence (Body_Decl)));
+
+ --  To qualify as non-trivial, the first statement of the procedure
+ --  must be a check in the form of an if statement. If the original
+ --  Default_Initial_Condition expression was folded, then the first
+ --  statement is not a check.
+
+ Stmt := First (Statements (Handled_Statement_Sequence (Body_Decl)));
+
+ return
+   Nkind (Stmt) = N_If_Statement
+ and then Nkind (Original_Node (Stmt)) = N_Pragma;
+  end if;
+
+  return False;
+   end Is_Non_Trivial_Default_Init_Cond_Procedure;
+
-
-- Is_Object_Reference --
-
 
function Is_Object_Reference (N : Node_Id) return Boolean is
-
   function Is_Internally_Generated_Renaming (N : Node_Id) return Boolean;
   --  Determine whether N is the name of an internally-generated renaming
 
Index: sem_util.ads
===
--- sem_util.ads(revision 230223)
+++ sem_util.ads(working copy)
@@ -1433,6 +1433,12 @@
--  parameter of the current enclosing subprogram.
--  Why are OUT parameters not considered here ???
 
+   function Is_Non_Trivial_Default_Init_Cond_Procedure
+ (Id : Entity_Id) return Boolean;
+   --  Determine whether entity Id denotes the procedure which verifies the
+   --  assertion expression of pragma Default_Initial_Condition and if it does,
+   --  the encapsulated expression is non-trivial.
+
function Is_Object_Reference (N : Node_Id) return Boolean;
--  Determines if the tree referenced by N represents an object. Both
--  variable and constant objects return True (compare Is_Variable).
Index: sem_elab.adb
===
--- sem_elab.adb(revision 230223)
+++ sem_elab.adb(working copy)
@@

[PATCH][AArch64] Documentation fix for -fpic

2015-11-12 Thread Szabolcs Nagy


The documentation for -fpic and -fPIC explicitly mentions some targets
where the difference matters, but not AArch64.  Specifying the GOT size
limit is not entirely correct as it can depend on the -mcmodel setting,
but probably better than leaving the impression that -fpic vs -fPIC does
not matter on AArch64.

ChangeLog:

2015-11-12  Szabolcs Nagy  

* doc/invoke.texi (-fpic): Add the AArch64 limit.
(-fPIC): Add AArch64.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0121832..f925fe0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -23951,7 +23951,7 @@ loader is not part of GCC; it is part of the operating system).  If
 the GOT size for the linked executable exceeds a machine-specific
 maximum size, you get an error message from the linker indicating that
 @option{-fpic} does not work; in that case, recompile with @option{-fPIC}
-instead.  (These maximums are 8k on the SPARC and 32k
+instead.  (These maximums are 8k on the SPARC, 28k on AArch64 and 32k
 on the m68k and RS/6000.  The x86 has no such limit.)
 
 Position-independent code requires special support, and therefore works
@@ -23966,7 +23966,7 @@ are defined to 1.
 @opindex fPIC
 If supported for the target machine, emit position-independent code,
 suitable for dynamic linking and avoiding any limit on the size of the
-global offset table.  This option makes a difference on the m68k,
+global offset table.  This option makes a difference on the AArch64, m68k,
 PowerPC and SPARC@.
 
 Position-independent code requires special support, and therefore works

[Ada] Crash on illegal selected component in synchronized body.

2015-11-12 Thread Arnaud Charlet

The prefix of a selected component in a synchronized body cannot denote
a component of the synchronized type unless the prefix is an entity name.
This was not properly rejected before.

Compiling bakery.adb must yield:

bakery.adb:44:35: invalid reference to internal operation of some object
   of type "Bakery_Instance_Task"

---
procedure Bakery is

   N: Natural := 10; -- Number of Processes [Customers]

   type Integer_Array is array (1 .. N) of Integer;

   type Ticket_And_Queue_Number is record
      R : Natural; -- Ticket Number [Lamport 'Number']
      A : Natural; -- Queue Number  [Lamport 'Choosing']
   end record;

   task type Bakery_Instance_Task is
      entry Initialize(ID : Natural);
   end Bakery_Instance_Task;

   Bakery_Array : array (1 .. N) of Bakery_Instance_Task;

   task body Bakery_Instance_Task is

      R   : Natural; -- This task's current ticket number [Lamport 'Number']
      A   : Integer_Array := (1 .. N => 0);
      ID0 : Natural;

      TQN : Ticket_And_Queue_Number;

      function Read_TQN(J : in Natural) return Ticket_And_Queue_Number is
         TQN : Ticket_And_Queue_Number;
      begin
         TQN := (R => R,
                 A => A(J));
         return TQN;
      end Read_TQN;
   begin
      accept Initialize(ID : Natural) do
         R := 0;
         A := (1 .. N => 0);
         ID0 := ID;
      end Initialize;
      -- Start
      R := 1;
      A(ID0) := 1;
      for J in 1 .. N loop
         if J /= ID0 then
            TQN := Bakery_Array(J).Read_TQN(J => J);
         end if;
      end loop;
   end Bakery_Instance_Task;

begin
   for I in 1 .. N loop
      Bakery_Array(I).Initialize(ID => I);
   end loop;
end Bakery;

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-11-12  Ed Schonberg  

* sem_ch4.adb (Analyze_Selected_Component): Diagnose an attempt
to reference an internal entity from a synchronized type from
within the body of that type, when the prefix of the selected
component is not the current instance.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 230241)
+++ sem_ch4.adb (working copy)
@@ -4655,6 +4655,23 @@
Comp = First_Private_Entity (Base_Type (Prefix_Type));
  end loop;
 
+ --  If the scope is a current instance, the prefix cannot be an
+ --  expression of the same type (that would represent an attempt
+ --  to reach an internal operation of another synchronized object).
+ --  This is legal if prefix is an access to such type and there is
+ --  a dereference.
+
+ if In_Scope
+   and then not Is_Entity_Name (Name)
+   and then Nkind (Name) /= N_Explicit_Dereference
+ then
+Error_Msg_NE ("invalid reference to internal operation "
+   & "of some object of type&", N, Type_To_Use);
+Set_Entity (Sel, Any_Id);
+Set_Etype (Sel, Any_Type);
+return;
+ end if;
+
  --  If there is no visible entity with the given name or none of the
  --  visible entities are plausible interpretations, check whether
  --  there is some other primitive operation with that name.

Re: [patch] Fix PR target/67265

2015-11-12 Thread Eric Botcazou

> Ok if it passes testing.

Thanks, it did so I installed the fix yesterday but further testing then 
revealed an oversight: the following assertion in ix86_adjust_stack_and_probe

  gcc_assert (cfun->machine->fs.cfa_reg != stack_pointer_rtx);

will now evidently trigger (simple testcase attached).

I can shed some light on it here since I wrote the code: the initial version 
of ix86_adjust_stack_and_probe didn't bother generating CFI because it only 
manipulates the stack pointer and the CFA register was guaranteed to be the 
frame pointer until yesterday, so I put the assertion to check this guarantee.
Then Richard H. enhanced the CFI machinery to always track stack adjustments 
(IIRC this was a prerequisite for your implementation of shrink-wrapping) so I 
added code to generate CFI:

  /* Even if the stack pointer isn't the CFA register, we need to correctly
 describe the adjustments made to it, in particular differentiate the
 frame-related ones from the frame-unrelated ones.  */
  if (size > 0)

To sum up, I think that the assertion is obsolete and can be removed without 
further ado; once done, the compiler generates correct CFI for the testcase.
So I installed the following one-liner as obvious after testing on x86-64.


2015-11-12  Eric Botcazou  

PR target/67265
* config/i386/i386.c (ix86_adjust_stack_and_probe): Remove obsolete
assertion on the CFA register.


2015-11-12  Eric Botcazou  

* gcc.target/i386/pr67265-2.c: New test.

-- 
Eric Botcazou

Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 230204)
+++ config/i386/i386.c	(working copy)
@@ -12245,8 +12245,6 @@ ix86_adjust_stack_and_probe (const HOST_
   release_scratch_register_on_entry ();
 }
 
-  gcc_assert (cfun->machine->fs.cfa_reg != stack_pointer_rtx);
-
   /* Even if the stack pointer isn't the CFA register, we need to correctly
  describe the adjustments made to it, in particular differentiate the
  frame-related ones from the frame-unrelated ones.  */
/* { dg-do compile } */
/* { dg-options "-O -fstack-check" } */

void foo (int n)
{
  volatile char arr[64 * 1024];

  arr[n] = 1;
}

Re: [PATCH][AArch64][v2] Improve comparison with complex immediates followed by branch/cset

2015-11-12 Thread James Greenhalgh

On Tue, Nov 03, 2015 at 03:43:24PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> Bootstrapped and tested on aarch64.
> 
> Ok for trunk?

Comments in-line.

> 
> Thanks,
> Kyrill
> 
> 
> 2015-11-03  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64.md (*condjump): Rename to...
> (condjump): ... This.
> (*compare_condjump): New define_insn_and_split.
> (*compare_cstore_insn): Likewise.
> (*cstore_insn): Rename to...
> (aarch64_cstore): ... This.
> * config/aarch64/iterators.md (CMP): Handle ne code.
> * config/aarch64/predicates.md (aarch64_imm24): New predicate.
> 
> 2015-11-03  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/cmpimm_branch_1.c: New test.
> * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.

> commit 7df013a391532f39932b80c902e3b4bbd841710f
> Author: Kyrylo Tkachov 
> Date:   Mon Sep 21 10:56:47 2015 +0100
> 
> [AArch64] Improve comparison with complex immediates
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 126c9c2..1bfc870 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -369,7 +369,7 @@ (define_expand "mod3"
>}
>  )
>  
> -(define_insn "*condjump"
> +(define_insn "condjump"
>[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
>   [(match_operand 1 "cc_register" "") (const_int 0)])
>  (label_ref (match_operand 2 "" ""))
> @@ -394,6 +394,40 @@ (define_insn "*condjump"
> (const_int 1)))]
>  )
>  
> +;; For a 24-bit immediate CST we can optimize the compare for equality
> +;; and branch sequence from:
> +;; mov   x0, #imm1
> +;; movk  x0, #imm2, lsl 16 /* x0 contains CST.  */
> +;; cmp   x1, x0
> +;; b .Label

This would be easier on the eyes if you were to indent the code sequence.

+;; and branch sequence from:
+;;     mov   x0, #imm1
+;;     movk  x0, #imm2, lsl 16 /* x0 contains CST.  */
+;;     cmp   x1, x0
+;;     b .Label
+;; into the shorter:
+;;     sub   x0, #(CST & 0xfff000)

> +;; into the shorter:
> +;; sub   x0, #(CST & 0xfff000)
> +;; subs  x0, #(CST & 0x000fff)

These instructions are not valid (2 operand sub/subs?) can you write them
out fully for this comment so I can see the data flow?
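
To make the intended data flow concrete, here is a small C model of the
transformation.  It is only a sketch: equals_imm24 is a name invented
here, and the three-operand forms in the comments are an assumption
about what the final instructions look like, not text from the patch.
It shows why subtracting the high and low 12-bit pieces in turn and
testing for zero is equivalent to comparing against the full 24-bit
constant:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if X equals the 24-bit constant CST, using two
   subtractions of 12-bit pieces instead of materializing CST.  */
static bool
equals_imm24 (uint64_t x, uint64_t cst)
{
  /* Precondition: CST fits in 24 bits.  */
  assert ((cst >> 24) == 0);

  uint64_t hi_imm = cst & 0xfff000;
  uint64_t lo_imm = cst & 0x000fff;

  uint64_t tmp = x - hi_imm;   /* roughly: sub  tmp, x, #hi_imm    */
  tmp = tmp - lo_imm;          /* roughly: subs tmp, tmp, #lo_imm  */

  /* The flag-setting subtraction yields zero exactly when
     x == hi_imm + lo_imm == cst (arithmetic is modulo 2^64).  */
  return tmp == 0;
}
```

Only the equality result is meaningful here; the intermediate value is
not an ordered comparison against CST, which is consistent with the
patterns being restricted to eq/ne.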

> +;; b .Label
> +(define_insn_and_split "*compare_condjump"
> +  [(set (pc) (if_then_else (EQL
> +   (match_operand:GPI 0 "register_operand" "r")
> +   (match_operand:GPI 1 "aarch64_imm24" "n"))
> +(label_ref:P (match_operand 2 "" ""))
> +(pc)))]
> +  "!aarch64_move_imm (INTVAL (operands[1]), mode)
> +   && !aarch64_plus_operand (operands[1], mode)"
> +  "#"
> +  "&& true"
> +  [(const_int 0)]
> +  {
> +HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
> +HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
> +rtx tmp = gen_reg_rtx (mode);

Can you guarantee we can always create this pseudo? What if we're a
post-register-allocation split?

> +emit_insn (gen_add3 (tmp, operands[0], GEN_INT (-hi_imm)));
> +emit_insn (gen_add3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
> +rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
> +rtx cmp_rtx = gen_rtx_fmt_ee (, mode, cc_reg, const0_rtx);
> +emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
> +DONE;
> +  }
> +)
> +
>  (define_expand "casesi"
>[(match_operand:SI 0 "register_operand" ""); Index
> (match_operand:SI 1 "const_int_operand" "")   ; Lower bound
> @@ -2898,7 +2932,7 @@ (define_expand "cstore4"
>"
>  )
>  
> -(define_insn "*cstore_insn"
> +(define_insn "aarch64_cstore"
>[(set (match_operand:ALLI 0 "register_operand" "=r")
>   (match_operator:ALLI 1 "aarch64_comparison_operator"
>[(match_operand 2 "cc_register" "") (const_int 0)]))]
> @@ -2907,6 +2941,39 @@ (define_insn "*cstore_insn"
>[(set_attr "type" "csel")]
>  )
>  
> +;; For a 24-bit immediate CST we can optimize the compare for equality
> +;; and branch sequence from:
> +;; mov   x0, #imm1
> +;; movk  x0, #imm2, lsl 16 /* x0 contains CST.  */
> +;; cmp   x1, x0
> +;; cset  x2, 
> +;; into the shorter:
> +;; sub   x0, #(CST & 0xfff000)
> +;; subs  x0, #(CST & 0x000fff)
> +;; cset x1, .

Same comments as above regarding formatting and making this a valid set
of instructions.

> +(define_insn_and_split "*compare_cstore_insn"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +  (EQL:GPI (match_operand:GPI 1 "register_operand" "r")
> +   (match_operand:GPI 2 "aarch64_imm24" "n")))]
> +  "!aarch64_move_imm (INTVAL (operands[2]), mode)
> +   && !aarch64_plus_operand (operands[2], mode)"
> +  "#"
> +  "&& true"
> +  [(const_int 0)]
> +  {
> +HOST_WIDE_INT lo_imm = UINTVAL (operands[2]) & 0xfff;
> +HOST_WIDE_INT

[PATCH] Fix PR68306

2015-11-12 Thread Richard Biener


The following fixes PR68306, an ordering issue with my last BB
vectorization patch.  Fixed by removing that ordering requirement.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2015-11-12  Richard Biener  

PR tree-optimization/68306
* tree-vect-data-refs.c (verify_data_ref_alignment): Remove
relevant and vectorizable checks here.
(vect_verify_datarefs_alignment): Add relevant check here.

* gcc.dg/pr68306.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 230216)
--- gcc/tree-vect-data-refs.c   (working copy)
*** verify_data_ref_alignment (data_referenc
*** 909,922 
gimple *stmt = DR_STMT (dr);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
  
!   if (!STMT_VINFO_RELEVANT_P (stmt_info))
! return true;
! 
!   /* For interleaving, only the alignment of the first access matters. 
!  Skip statements marked as not vectorizable.  */
!   if ((STMT_VINFO_GROUPED_ACCESS (stmt_info)
!&& GROUP_FIRST_ELEMENT (stmt_info) != stmt)
!   || !STMT_VINFO_VECTORIZABLE (stmt_info))
  return true;
  
/* Strided accesses perform only component accesses, alignment is
--- 889,897 
gimple *stmt = DR_STMT (dr);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
  
!   /* For interleaving, only the alignment of the first access matters.   */
!   if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
!   && GROUP_FIRST_ELEMENT (stmt_info) != stmt)
  return true;
  
/* Strided accesses perform only component accesses, alignment is
*** vect_verify_datarefs_alignment (loop_vec
*** 965,972 
unsigned int i;
  
FOR_EACH_VEC_ELT (datarefs, i, dr)
! if (! verify_data_ref_alignment (dr))
!   return false;
  
return true;
  }
--- 940,954 
unsigned int i;
  
FOR_EACH_VEC_ELT (datarefs, i, dr)
! {
!   gimple *stmt = DR_STMT (dr);
!   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
! 
!   if (!STMT_VINFO_RELEVANT_P (stmt_info))
!   continue;
!   if (! verify_data_ref_alignment (dr))
!   return false;
! }
  
return true;
  }
Index: gcc/testsuite/gcc.dg/pr68306.c
===
*** gcc/testsuite/gcc.dg/pr68306.c  (revision 0)
--- gcc/testsuite/gcc.dg/pr68306.c  (working copy)
***
*** 0 
--- 1,10 
+ /* { dg-do compile } */
+ /* { dg-options "-O3" } */
+ 
+ enum powerpc_pmc_type { PPC_PMC_IBM };
+ struct {
+ unsigned num_pmcs;
+ enum powerpc_pmc_type pmc_type;
+ } a;
+ enum powerpc_pmc_type b;
+ void fn1() { a.num_pmcs = a.pmc_type = b; }

Re: [PATCH 0/2] Levenshtein-based suggestions (v3)

2015-11-12 Thread David Malcolm

On Sun, 2015-11-01 at 23:44 -0700, Jeff Law wrote:
> On 10/30/2015 06:47 AM, David Malcolm wrote:
> 
> > The typename suggestion seems to be at least somewhat controversial,
> > whereas (I hope) the misspelled field names suggestion is more
> > acceptable.
> >
> > Hence I'm focusing on the field name lookup for now; other uses of the
> > algorithm (e.g. the typename lookup) could be done in followup patches,
> > but I'm deferring them for now in the hope of getting the simplest case
> > into trunk as a first step.  Similarly, for simplicity, I didn't
> > implement any attempt at error-recovery using the hint.
> >
> > The following patch kit is in two parts (for ease of review; they would
> > be applied together):
> >
> >patch 1: Implement Levenshtein distance
> >patch 2: C FE: suggest corrections for misspelled field names
> >
> > I didn't implement a limiter, on the grounds that this only fires
> > once per "has no member named" error, and so is unlikely to slow
> > things down noticeably.
> >
> > Successfully bootstrapped the combination of these two
> > on x86_64-pc-linux-gnu (adds 11 new PASS results to gcc.sum)
> >
> > OK for trunk?
> >
> >   gcc/Makefile.in  |   1 +
> >   gcc/c/c-typeck.c |  70 +++-
> >   gcc/spellcheck.c | 136 
> > +++
> >   gcc/spellcheck.h |  32 ++
> >   gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c |   9 ++
> >   gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c |  64 +++
> >   gcc/testsuite/gcc.dg/plugin/plugin.exp   |   1 +
> >   gcc/testsuite/gcc.dg/spellcheck-fields.c |  63 +++
> >   8 files changed, 375 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/spellcheck.c
> >   create mode 100644 gcc/spellcheck.h
> >   create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c
> >   create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c
> >   create mode 100644 gcc/testsuite/gcc.dg/spellcheck-fields.c
> I'm going to assume you got levenshtein's algorithm reasonably correct.
> 
> This is OK for the trunk.  

Thanks.

FWIW I applied some fixes for the nits identified by Mikael in:
  https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00046.html
renaming params "m" and "n" to "len_s" and "len_t", and fixing the
comment - under the "obvious" rule.
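
For readers unfamiliar with the algorithm, the following is a generic
textbook sketch of Levenshtein distance in C using two rolling rows.
It is not the code committed in spellcheck.c; the function name
edit_distance is made up for this example, and only the parameter
naming (len_s, len_t) follows the renaming mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the Levenshtein distance between S and T (the minimum number
   of single-character insertions, deletions and substitutions needed
   to turn S into T), or -1 if memory allocation fails.  */
static int
edit_distance (const char *s, const char *t)
{
  assert (s != NULL && t != NULL);

  size_t len_s = strlen (s);
  size_t len_t = strlen (t);

  int *prev = malloc ((len_t + 1) * sizeof (int));
  int *curr = malloc ((len_t + 1) * sizeof (int));
  if (prev == NULL || curr == NULL)
    {
      free (prev);
      free (curr);
      return -1;
    }

  /* Distance from the empty prefix of S to each prefix of T.  */
  for (size_t j = 0; j <= len_t; j++)
    prev[j] = (int) j;

  for (size_t i = 1; i <= len_s; i++)
    {
      /* Distance from the first I characters of S to the empty string.  */
      curr[0] = (int) i;

      for (size_t j = 1; j <= len_t; j++)
        {
          int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
          int deletion = prev[j] + 1;
          int insertion = curr[j - 1] + 1;
          int substitution = prev[j - 1] + cost;

          int best = deletion < insertion ? deletion : insertion;
          curr[j] = best < substitution ? best : substitution;
        }

      /* The row just computed becomes the previous row.  */
      int *tmp = prev;
      prev = curr;
      curr = tmp;
    }

  int result = prev[len_t];
  free (prev);
  free (curr);
  return result;
}
```

A field-name suggestion can then be chosen as the candidate with the
smallest distance to the misspelled name, subject to whatever cutoff
the caller considers reasonable.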

I've committed the combination of the two patches (with the nit fixes)
as r230284; attached is what I committed (for reference).

> Obviously I'd like to see it extend into the 
> other front-ends (C++ in particular).  Then I'd like to see it extend 
> beyond just misspelled field names.

(nods)
From 7d22e0182f7d21f2b18a64530e7f94dd36cec7b0 Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Thu, 29 Oct 2015 15:29:26 -0400
Subject: [PATCH] Implement Levenshtein distance; use in C FE for
 misspelled field names

This is the combination of:
  [PATCH 1/2] Implement Levenshtein distance
  [PATCH 2/2] C FE: suggest corrections for misspelled field names
plus some nit fixes to spellcheck.c.

gcc/ChangeLog:
	* Makefile.in (OBJS): Add spellcheck.o.
	* spellcheck.c: New file.
	* spellcheck.h: New file.

gcc/c/ChangeLog:
	* c-typeck.c: Include spellcheck.h.
	(lookup_field_fuzzy_find_candidates): New function.
	(lookup_field_fuzzy): New function.
	(build_component_ref): If the field was not found, try using
	lookup_field_fuzzy and potentially offer a suggestion.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/levenshtein-test-1.c: New file.
	* gcc.dg/plugin/levenshtein_plugin.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	levenshtein_plugin.c.
	* gcc.dg/spellcheck-fields.c: New file.
---
 gcc/Makefile.in  |   1 +
 gcc/c/c-typeck.c |  74 +++-
 gcc/spellcheck.c | 136 +++
 gcc/spellcheck.h |  32 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c |   9 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c |  64 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp   |   1 +
 gcc/testsuite/gcc.dg/spellcheck-fields.c |  63 +++
 8 files changed, 379 insertions(+), 1 deletion(-)
 create mode 100644 gcc/spellcheck.c
 create mode 100644 gcc/spellcheck.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-fields.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 34d2356..f17234d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1403,6 +1403,7 @@ OBJS = \
 	shrink-wrap.o \
 	simplify-rtx.o \
 	sparseset.o \
+	spellcheck.o \
 	sreal.o \
 	stack-ptr-mod.o \
 	statistics.o \
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 4335a87..eb4e1fc 100644
--- a/gcc/c/c-typeck.c
+++
