date:20160311

Re: [RFA][PATCH][PR tree-optimization/64058] Improve and stabilize sorting of coalesce pairs

2016-03-11 Thread Richard Biener

On Fri, Mar 11, 2016 at 4:01 AM, Jeff Law  wrote:
>
> As discussed in the BZ, we have multiple problems with how we sort the
> coalesce list during out-of-ssa coalescing.
>
> First, the sort is not stable.  If the cost of two coalesce pairs is the
> same, we break the tie by looking at the underlying SSA_NAME_VERSION of the
> first, then the second elements in the coalesce pairs.
>
> As a result, changes in SSA_NAME_VERSIONs in the IL can result in different
> coalescing during out-of-ssa.  That in turn can cause changes in what
> objects are coalesced, which in turn causes random performance changes.
>
> This patch addresses that problem by recording an index for each coalescing
> pair discovered and using that index as the final tiebreaker rather than
> looking at SSA_NAME_VERSIONs.  That brings stability to the coalescing
> process and avoids a lot of unnecessary differences in the code we generate
> when SSA_NAME_VERSIONs change.
>
> The second problem is our costing heuristic only looks at edge frequencies &
> flags.  It's actually a pretty good heuristic and captures the main goal of
> coalescing -- reducing the most commonly executed copies.  However, in the
> case where the edge frequencies/flags result in the same cost we can do
> better.
>
> When we coalesce two SSA_NAMEs, we have to build the union of the conflicts
> of each of the SSA_NAMEs -- which means the resulting union object is less
> likely to be able to participate in further coalescing.
>
> So given two coalescing pairs with the same primary cost, preferring the
> coalescing pair with the smaller resulting conflict set gives us a better
> chance that the resulting object will be able to participate in further
> coalescing.
>
> That heuristic broadly mirrors one aspect of how iterated conservative
> coalescing works.  The other interesting heuristic (that I did not
> implement) was to favor coalescing of the pair which had a higher degree of
> common conflicts between the two nodes -- which broadly falls into the same
> category as what we're doing with this patch.  The key being that the
> conflict sets are an important thing to consider when coalescing.
>
> Using the conflict sizes as a tie-breaker eliminates the regression in 64058
> and AFAICT also eliminates the regression in 68654 (the latter doesn't
> include a testcase or as in-depth analysis as 64058, but my testing
> indicates this patch should generate the desired code for both cases).
>
> The patch has (of course) bootstrapped and regression tested on
> x86_64-linux-gnu.
>
> I'd be curious for thoughts on how to build a testcase for this.  I could
> emit the conflict sizes along with the coalescing cost in the dumps, but
> that won't positively verify that we've done the preferred set of
> coalescings.
>
> I might be able to look at the .expand dumps and perhaps look for copies on
> edges.  However, unless the only copies are the ones that were causing the
> regression, I suspect such a test would end up being rather fragile.
>
> Other thoughts on how to get this under regression testing?  And of course,
> thoughts on the patch itself?

Can you please split out the 'index' introduction as a separate patch
and apply that?  I think it is quite obviously a good idea and might
make regression hunting easier later if needed.
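
As an illustration of the 'index' tiebreaker, here is a minimal sketch. It is not the patch itself: CoalescePair, better_pair and sorted_indices are hypothetical names, and the fields only stand in for what a coalesce pair might record. The point is that ordering on cost and then on discovery order never consults the version numbers, so renumbering the names cannot change the result.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a coalesce pair; field names are illustrative.
struct CoalescePair {
  int first;   // version-like id of the first name (ignored when sorting)
  int second;  // version-like id of the second name (ignored when sorting)
  int cost;    // primary cost, higher means more important
  int index;   // order in which the pair was discovered
};

// Highest cost first; equal costs are resolved by discovery order only.
inline bool better_pair(const CoalescePair &a, const CoalescePair &b) {
  if (a.cost != b.cost)
    return a.cost > b.cost;
  return a.index < b.index;
}

// Return the discovery indices in the order the pairs would be processed.
inline std::vector<int> sorted_indices(std::vector<CoalescePair> pairs) {
  std::sort(pairs.begin(), pairs.end(), better_pair);
  std::vector<int> order;
  for (const CoalescePair &p : pairs)
    order.push_back(p.index);
  return order;
}
```

Because every pair has a distinct index, the comparison is a strict total order and the outcome does not depend on whether the underlying sort is stable.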

For the other part I noticed a few things:
 1) having a bitmap_count_ior_bits () would be an improvement
 2) you might end up with redundant work(?) as you are iterating
    over SSA name coalesce candidates but look at partition conflicts
 3) having this extra heuristic might be best guarded by
    flag_expensive_optimizations, as it is a quite expensive "tie breaker".
    It might even improve things to first sort by cost and then only do
    the tie breaking when necessary, re-sorting the sub-sequence with the
    same original cost.  It may also be enough to only perform this for
    "important" candidates, say within the first 100 of the function or
    so, or with cost > X.
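
Points 1) and 3) can be sketched together as follows. This is only an illustration under assumptions: ConflictSet is a plain vector of 64-bit words rather than GCC's bitmap type, and count_ior_bits, Candidate and sort_candidates are hypothetical names. The sketch sorts on cost first and computes the union size only for runs of candidates that share a cost.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed representation: one bit per conflicting object, 64 per word.
using ConflictSet = std::vector<std::uint64_t>;

// Population count of (a | b) without materializing the union.
inline std::size_t count_ior_bits(const ConflictSet &a, const ConflictSet &b) {
  std::size_t n = std::max(a.size(), b.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    std::uint64_t wa = i < a.size() ? a[i] : 0;
    std::uint64_t wb = i < b.size() ? b[i] : 0;
    count += std::bitset<64>(wa | wb).count();
  }
  return count;
}

struct Candidate {
  int cost;                   // primary cost, higher first
  int index;                  // discovery order, final tiebreaker
  const ConflictSet *lhs;     // conflicts of the first element
  const ConflictSet *rhs;     // conflicts of the second element
  std::size_t conflict_size;  // filled in only when a tie needs breaking
};

inline void sort_candidates(std::vector<Candidate> &v) {
  // Phase 1: cheap sort on cost, then discovery index.
  std::sort(v.begin(), v.end(), [](const Candidate &a, const Candidate &b) {
    return a.cost != b.cost ? a.cost > b.cost : a.index < b.index;
  });
  // Phase 2: for each run of equal cost, compute sizes and re-sort the run.
  for (std::size_t i = 0; i < v.size();) {
    std::size_t j = i + 1;
    while (j < v.size() && v[j].cost == v[i].cost)
      ++j;
    if (j - i > 1) {
      for (std::size_t k = i; k < j; ++k)
        v[k].conflict_size = count_ior_bits(*v[k].lhs, *v[k].rhs);
      std::sort(v.begin() + i, v.begin() + j,
                [](const Candidate &a, const Candidate &b) {
                  return a.conflict_size != b.conflict_size
                             ? a.conflict_size < b.conflict_size
                             : a.index < b.index;
                });
    }
    i = j;
  }
}
```

Candidates whose cost is unique never pay for the union count, which is where the compile-time saving would come from.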

And finally - if we really think that looking at the conflict size
increase is the way to go, it would maybe be better to use a fibheap,
updating keys in attempt_coalesce when we merge the conflicts.  That
would also mean working on a list (fibheap) of coalesces of partitions
rather than SSA names.
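
To make the "updating keys" idea concrete, here is a small sketch of a priority queue whose keys can change after insertion. It deliberately uses std::set as a stand-in for a Fibonacci heap, so the complexity differs from a real fibheap, and UpdatableQueue and its members are hypothetical names. The key is assumed to be a "smaller is better" value, such as a conflict-set size.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Priority queue with updatable keys, built on an ordered set.
class UpdatableQueue {
 public:
  // Queue candidate ID with the given key; ID must not already be queued.
  void insert(std::size_t id, long key) {
    if (id >= keys_.size()) {
      keys_.resize(id + 1, 0);
      queued_.resize(id + 1, false);
    }
    assert(!queued_[id]);
    keys_[id] = key;
    queued_[id] = true;
    ordered_.insert(std::make_pair(key, id));
  }

  // Change the key of a queued candidate, e.g. after a merge grew its
  // conflict set.
  void update_key(std::size_t id, long key) {
    assert(id < queued_.size() && queued_[id]);
    ordered_.erase(std::make_pair(keys_[id], id));
    keys_[id] = key;
    ordered_.insert(std::make_pair(key, id));
  }

  // Remove and return the candidate with the smallest key.
  std::size_t pop_min() {
    assert(!ordered_.empty());
    std::size_t id = ordered_.begin()->second;
    ordered_.erase(ordered_.begin());
    queued_[id] = false;
    return id;
  }

  bool empty() const { return ordered_.empty(); }

 private:
  std::set<std::pair<long, std::size_t> > ordered_;  // (key, id), ascending
  std::vector<long> keys_;                           // current key per id
  std::vector<bool> queued_;                         // is the id queued?
};
```

After each successful coalesce, the keys of the candidates touching the merged partition would be refreshed with update_key before the next pop_min.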

I think the patch is reasonable enough for GCC 6 if we can bring compile-time
cost down a bit (it can be quadratic in the number of SSA names if we have
a lot of coalesce candidates and nearly full conflict bitmaps - of course that's
not a case we handle very well right now but still).  I would have hoped the
index part of the patch fixed the regression (by luck)...

As far as a testcase goes we want to scan the dumps for the actual coalesces
being done.  Might be a bit fragile though...

Thanks,
Richard.

> Thanks,
> Jeff
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index cc91e84..f28baa2 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2016-03-10  Jeff Law  
> +
> +   PR tree-

[PATCH] Fix PR70013

2016-03-11 Thread Alan Lawrence

In this PR, a packed structure containing bitfields loses part of its
constant-pool initialization in SRA.

A fuller explanation is on the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013#c11. In short, we need to
treat constant-pool entries like function parameters, as both come
'pre-initialized' before the function starts.
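
The rule can be modelled with a toy predicate. This is a sketch, not the tree-sra.c code: BaseKind, AccessRoot and has_unscalarized_data are hypothetical names that only mirror the shape of the condition in the patch.

```cpp
#include <cassert>

// Hypothetical classification of the object an access tree is rooted in.
enum class BaseKind { LocalVar, Parameter, ConstantPool };

// Reduced model of an access-tree root; field names are illustrative.
struct AccessRoot {
  bool hole;                 // some part is not covered by scalar accesses
  bool total_scalarization;  // the whole object is being scalarized
  bool written;              // the function itself writes to the object
  BaseKind base;
};

// True when the aggregate may hold data that no scalar replacement carries.
inline bool has_unscalarized_data(const AccessRoot &r) {
  if (!r.hole || r.total_scalarization)
    return false;  // everything is covered by scalar replacements
  // Parameters and constant-pool entries already hold data on entry,
  // even if the function never writes to them.
  return r.written || r.base == BaseKind::Parameter ||
         r.base == BaseKind::ConstantPool;
}
```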

Bootstrapped + regtest (gcc, g++) on AArch64; same with addition of Ada, on ARM
and x86.

OK for trunk?

gcc/ChangeLog:

PR tree-optimization/70013
* tree-sra.c (analyze_access_subtree): Also set grp_unscalarized_data
for constant-pool entries.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/sra-20.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/sra-20.c | 20 
 gcc/tree-sra.c |  3 ++-
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-20.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
new file mode 100644
index 000..5002c24
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -Wall" } */
+/* PR/70013, SRA of constant-pool loads removes initialization of part of d.  */
+#pragma pack (1)
+struct S0 {
+  unsigned f0 : 17;
+};
+
+int c;
+
+int
+main (int argc, char **argv)
+{
+  struct S0 d[] = { { 1 }, { 2 } };
+  struct S0 e = d[1];
+
+  c = d[0].f0;
+  __builtin_printf ("%x\n", e.f0);
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 72157ed..24eac6a 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2427,7 +2427,8 @@ analyze_access_subtree (struct access *root, struct access *parent,
 
   if (!hole || root->grp_total_scalarization)
 root->grp_covered = 1;
-  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
+  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL
+  || constant_decl_p (root->base))
 root->grp_unscalarized_data = 1; /* not covered and written to */
   return sth_created;
 }
-- 
1.9.1

Re: [PATCH 1/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-03-11 Thread Alan Lawrence


On 04/03/16 17:24, Alan Lawrence wrote:

On 26/02/16 14:52, James Greenhalgh wrote:


gcc/ChangeLog:

* gcc/config/aarch64/aarch64.c (aarch64_function_arg_alignment):
Rewrite, looking one level down for records and arrays.
---
  gcc/config/aarch64/aarch64.c | 31 ---
  1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9142ac0..b084f83 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1925,22 +1925,23 @@ aarch64_vfp_is_call_candidate (cumulative_args_t pcum_v, machine_mode mode,
  static unsigned int
  aarch64_function_arg_alignment (machine_mode mode, const_tree type)
  {
-  unsigned int alignment;
+  if (!type)
+return GET_MODE_ALIGNMENT (mode);
+  if (integer_zerop (TYPE_SIZE (type)))
+return 0;

-  if (type)
-{
-  if (!integer_zerop (TYPE_SIZE (type)))
-{
-  if (TYPE_MODE (type) == mode)
-alignment = TYPE_ALIGN (type);
-  else
-alignment = GET_MODE_ALIGNMENT (mode);
-}
-  else
-alignment = 0;
-}
-  else
-alignment = GET_MODE_ALIGNMENT (mode);
+  gcc_assert (TYPE_MODE (type) == mode);
+
+  if (!AGGREGATE_TYPE_P (type))
+return TYPE_ALIGN (TYPE_MAIN_VARIANT (type));
+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+return TYPE_ALIGN (TREE_TYPE (type));
+
+  unsigned int alignment = 0;
+
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+alignment = std::max (alignment, DECL_ALIGN (field));

return alignment;
  }




Ping.


[snip]



I'm not proposing to backport these AArch64 changes, hence:

Ping^2.

(For gcc 7 ?)

Also tests https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01073.html .


Ping^3.

Cheers, Alan

[PATCH, PR target/70160] Support uninitialized register uses in STV pass

2016-03-11 Thread Ilya Enkovich

Hi,

This patch is for PR70160.  The problem is that when we build the
instruction chain for conversion in the STV pass, we don't include
instructions that use an uninitialized register value, but we don't skip
them when we convert the register.  This patch fixes that by skipping
such register uses.  Bootstrapped and tested on
x86_64-pc-linux-gnu {-m32}.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-11  Ilya Enkovich  
Jakub Jelinek  

PR target/70160
* config/i386/i386.c (scalar_chain::convert_reg): Skip uses
of uninitialized values.

gcc/testsuite/

2016-03-11  Ilya Enkovich  

PR target/70160
* gcc.target/i386/pr70160.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fa7d3ff..3d8dbc4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3372,8 +3372,11 @@ scalar_chain::convert_reg (unsigned regno)
bitmap_clear_bit (conv, DF_REF_INSN_UID (ref));
  }
   }
-else if (NONDEBUG_INSN_P (DF_REF_INSN (ref)))
+/* Skip debug insns and uninitialized uses.  */
+else if (DF_REF_CHAIN (ref)
+&& NONDEBUG_INSN_P (DF_REF_INSN (ref)))
   {
+   gcc_assert (scopy);
replace_rtx (DF_REF_INSN (ref), reg, scopy);
df_insn_rescan (DF_REF_INSN (ref));
   }
diff --git a/gcc/testsuite/gcc.target/i386/pr70160.c b/gcc/testsuite/gcc.target/i386/pr70160.c
new file mode 100644
index 000..725e955
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70160.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -Wno-uninitialized -Wno-maybe-uninitialized" } */
+
+long long a;
+void fn1();
+void fn2(unsigned t, int a_int, unsigned p)
+{
+  long long x;
+  int i, j = 1;
+  t = i;
+  for (; j;) {
+a = x;
+x = 1 + t;
+j += a_int;
+fn1();
+if (x == 1)
+  return;
+  }
+}

Re: [PATCH, PR target/70160] Support uninitialized register uses in STV pass

2016-03-11 Thread Uros Bizjak

On Fri, Mar 11, 2016 at 12:13 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patch is for PR70160.  The problem is that when we build the
> instruction chain for conversion in the STV pass, we don't include
> instructions that use an uninitialized register value, but we don't skip
> them when we convert the register.  This patch fixes that by skipping
> such register uses.  Bootstrapped and tested on
> x86_64-pc-linux-gnu {-m32}.  OK for trunk?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2016-03-11  Ilya Enkovich  
> Jakub Jelinek  
>
> PR target/70160
> * config/i386/i386.c (scalar_chain::convert_reg): Skip uses
> of uninitialized values.
>
> gcc/testsuite/
>
> 2016-03-11  Ilya Enkovich  
>
> PR target/70160
> * gcc.target/i386/pr70160.c: New test.

OK.

Thanks,
Uros.

>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index fa7d3ff..3d8dbc4 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3372,8 +3372,11 @@ scalar_chain::convert_reg (unsigned regno)
> bitmap_clear_bit (conv, DF_REF_INSN_UID (ref));
>   }
>}
> -else if (NONDEBUG_INSN_P (DF_REF_INSN (ref)))
> +/* Skip debug insns and uninitialized uses.  */
> +else if (DF_REF_CHAIN (ref)
> +&& NONDEBUG_INSN_P (DF_REF_INSN (ref)))
>{
> +   gcc_assert (scopy);
> replace_rtx (DF_REF_INSN (ref), reg, scopy);
> df_insn_rescan (DF_REF_INSN (ref));
>}
> diff --git a/gcc/testsuite/gcc.target/i386/pr70160.c b/gcc/testsuite/gcc.target/i386/pr70160.c
> new file mode 100644
> index 000..725e955
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr70160.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -msse2 -Wno-uninitialized -Wno-maybe-uninitialized" } */
> +
> +long long a;
> +void fn1();
> +void fn2(unsigned t, int a_int, unsigned p)
> +{
> +  long long x;
> +  int i, j = 1;
> +  t = i;
> +  for (; j;) {
> +a = x;
> +x = 1 + t;
> +j += a_int;
> +fn1();
> +if (x == 1)
> +  return;
> +  }
> +}

[PATCH] Fix ICE in gen_lsm_tmp_name (PR tree-optimization/70169)

2016-03-11 Thread Jakub Jelinek

Hi!

As the testcase shows, gen_lsm_tmp_name can be handed a FUNCTION_DECL or
LABEL_DECL (on questionable code).  The patch also removes the default
gcc_unreachable (): this is just a debugging aid function, so nothing bad
happens if we ignore some other tree code.
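
The shape of that change, reduced to a toy: unknown kinds fall through to a case that contributes nothing, instead of aborting. RefKind and append_name_part are hypothetical names for this sketch and do not correspond to GCC's tree codes or helpers.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced set of reference kinds; not GCC's tree codes.
enum class RefKind {
  SsaName, VarDecl, ParmDecl, FunctionDecl, LabelDecl, IntegerCst, Other
};

// Append a name fragment for one reference, tolerating unknown kinds.
inline void append_name_part(std::string &buf, RefKind kind,
                             const char *name) {
  switch (kind) {
    case RefKind::SsaName:
    case RefKind::VarDecl:
    case RefKind::ParmDecl:
    case RefKind::FunctionDecl:
    case RefKind::LabelDecl:
      buf += name ? name : "D";  // unnamed objects get a placeholder
      break;
    case RefKind::IntegerCst:
    default:
      break;  // contributes nothing to the generated name
  }
}
```

Since the generated name is only a readability aid in this model, a missing fragment costs nothing, whereas an abort would turn questionable input into a compiler crash.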

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-11  Jakub Jelinek  

PR tree-optimization/70169
* tree-ssa-loop.c (gen_lsm_tmp_name): Handle FUNCTION_DECL and
LABEL_DECL like VAR_DECL.  Emit nothing instead of gcc_unreachable
for unknown codes.

* gcc.dg/pr70169.c: New test.

--- gcc/tree-ssa-loop.c.jj  2016-02-23 20:43:49.0 +0100
+++ gcc/tree-ssa-loop.c 2016-03-11 10:00:18.792578125 +0100
@@ -769,6 +769,8 @@ gen_lsm_tmp_name (tree ref)
 case SSA_NAME:
 case VAR_DECL:
 case PARM_DECL:
+case FUNCTION_DECL:
+case LABEL_DECL:
   name = get_name (ref);
   if (!name)
name = "D";
@@ -784,11 +786,9 @@ gen_lsm_tmp_name (tree ref)
   break;
 
 case INTEGER_CST:
+default:
   /* Nothing.  */
   break;
-
-default:
-  gcc_unreachable ();
 }
 }
 
--- gcc/testsuite/gcc.dg/pr70169.c.jj   2016-03-11 09:51:59.139304653 +0100
+++ gcc/testsuite/gcc.dg/pr70169.c  2016-03-11 10:02:12.956050092 +0100
@@ -0,0 +1,40 @@
+/* PR tree-optimization/70169 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-strict-aliasing -fno-tree-dce" } */
+
+int printf (const char *, ...); 
+
+void
+foo ()
+{
+  unsigned char *p = (unsigned char *) &printf;
+  for (;;)
+(*p)++;
+}
+
+void
+bar (int x)
+{
+  unsigned char *p = (unsigned char *) &printf;
+  int i;
+  for (i = 0; i < x; i++)
+(*p)++;
+}
+
+void
+baz (int x, int y)
+{
+  unsigned char *p = (unsigned char *) &&lab;
+  int i;
+  if (y)
+{
+  for (i = 0; i < x; i++)
+   (*p)++;
+}
+  else
+{
+ lab:
+  asm volatile ("");
+  foo ();
+}
+}

Jakub

Re: [PATCH] Fix ICE in gen_lsm_tmp_name (PR tree-optimization/70169)

2016-03-11 Thread Richard Biener

On Fri, 11 Mar 2016, Jakub Jelinek wrote:

> Hi!
> 
> As the testcase shows, gen_lsm_tmp_name can be handed a FUNCTION_DECL or
> LABEL_DECL (on questionable code).  The patch also removes the default
> gcc_unreachable (): this is just a debugging aid function, so nothing bad
> happens if we ignore some other tree code.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2016-03-11  Jakub Jelinek  
> 
>   PR tree-optimization/70169
>   * tree-ssa-loop.c (gen_lsm_tmp_name): Handle FUNCTION_DECL and
>   LABEL_DECL like VAR_DECL.  Emit nothing instead of gcc_unreachable
>   for unknown codes.
> 
>   * gcc.dg/pr70169.c: New test.
> 
> --- gcc/tree-ssa-loop.c.jj2016-02-23 20:43:49.0 +0100
> +++ gcc/tree-ssa-loop.c   2016-03-11 10:00:18.792578125 +0100
> @@ -769,6 +769,8 @@ gen_lsm_tmp_name (tree ref)
>  case SSA_NAME:
>  case VAR_DECL:
>  case PARM_DECL:
> +case FUNCTION_DECL:
> +case LABEL_DECL:
>name = get_name (ref);
>if (!name)
>   name = "D";
> @@ -784,11 +786,9 @@ gen_lsm_tmp_name (tree ref)
>break;
>  
>  case INTEGER_CST:
> +default:
>/* Nothing.  */
>break;
> -
> -default:
> -  gcc_unreachable ();
>  }
>  }
>  
> --- gcc/testsuite/gcc.dg/pr70169.c.jj 2016-03-11 09:51:59.139304653 +0100
> +++ gcc/testsuite/gcc.dg/pr70169.c2016-03-11 10:02:12.956050092 +0100
> @@ -0,0 +1,40 @@
> +/* PR tree-optimization/70169 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-strict-aliasing -fno-tree-dce" } */
> +
> +int printf (const char *, ...); 
> +
> +void
> +foo ()
> +{
> +  unsigned char *p = (unsigned char *) &printf;
> +  for (;;)
> +(*p)++;
> +}
> +
> +void
> +bar (int x)
> +{
> +  unsigned char *p = (unsigned char *) &printf;
> +  int i;
> +  for (i = 0; i < x; i++)
> +(*p)++;
> +}
> +
> +void
> +baz (int x, int y)
> +{
> +  unsigned char *p = (unsigned char *) &&lab;
> +  int i;
> +  if (y)
> +{
> +  for (i = 0; i < x; i++)
> + (*p)++;
> +}
> +  else
> +{
> + lab:
> +  asm volatile ("");
> +  foo ();
> +}
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[PATCH] Fix ICE during insv expansion (PR rtl-optimization/70174)

2016-03-11 Thread Jakub Jelinek

Hi!

The following testcase ICEs on i?86/x86_64 and aarch64, because gen_lowpart
(pointer to gen_lowpart_general at that spot) doesn't want to handle
SUBREG of SYMBOL_REF.  Fixed by using a variant that doesn't ICE and forcing
the operand into a register if it can't be optimized without generating
instructions.  This is similar to the case a few lines above, where we also
force_reg if simplify_subreg failed.
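
The control flow described above, "try the variant that cannot fail hard, and only then force the operand into a register", can be sketched with a toy model. Operand, lowpart_if_possible, force_into_register and narrow are hypothetical stand-ins and do not reproduce the RTL helpers' real signatures or semantics.

```cpp
#include <cassert>

// Toy operand model; Kind and the fields are illustrative, not RTL.
struct Operand {
  enum Kind { Register, Symbol } kind;
  int bits;
};

// Cheap path: succeeds only when narrowing needs no new instructions.
inline bool lowpart_if_possible(const Operand &in, int bits, Operand *out) {
  if (in.kind != Operand::Register || bits > in.bits)
    return false;
  out->kind = Operand::Register;
  out->bits = bits;
  return true;
}

// Expensive path: copy the operand into a fresh register and count the copy.
inline Operand force_into_register(const Operand &in, int *copies) {
  ++*copies;
  Operand reg = {Operand::Register, in.bits};
  return reg;
}

// Narrow IN to BITS, emitting a copy only when the cheap path refuses.
inline Operand narrow(const Operand &in, int bits, int *copies) {
  Operand out = {Operand::Register, 0};
  if (lowpart_if_possible(in, bits, &out))
    return out;
  Operand reg = force_into_register(in, copies);
  bool ok = lowpart_if_possible(reg, bits, &out);
  assert(ok);  // a register operand of sufficient width can be narrowed
  (void) ok;
  return out;
}
```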

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-11  Jakub Jelinek  

PR rtl-optimization/70174
* expmed.c (store_bit_field_using_insv): Use gen_lowpart_if_possible
followed by gen_lowpart on force_reg instead of just gen_lowpart.

* gcc.dg/pr70174.c: New test.

--- gcc/expmed.c.jj 2016-02-11 20:28:51.0 +0100
+++ gcc/expmed.c2016-03-11 10:40:27.258719168 +0100
@@ -658,24 +658,28 @@ store_bit_field_using_insv (const extrac
 {
   if (GET_MODE_BITSIZE (GET_MODE (value)) >= bitsize)
{
+ rtx tmp;
  /* Optimization: Don't bother really extending VALUE
 if it has all the bits we will actually use.  However,
 if we must narrow it, be sure we do it correctly.  */
 
  if (GET_MODE_SIZE (GET_MODE (value)) < GET_MODE_SIZE (op_mode))
{
- rtx tmp;
-
  tmp = simplify_subreg (op_mode, value1, GET_MODE (value), 0);
  if (! tmp)
tmp = simplify_gen_subreg (op_mode,
   force_reg (GET_MODE (value),
  value1),
   GET_MODE (value), 0);
- value1 = tmp;
}
  else
-   value1 = gen_lowpart (op_mode, value1);
+   {
+ tmp = gen_lowpart_if_possible (op_mode, value1);
+ if (! tmp)
+   tmp = gen_lowpart (op_mode, force_reg (GET_MODE (value),
+  value1));
+   }
+ value1 = tmp;
}
   else if (CONST_INT_P (value))
value1 = gen_int_mode (INTVAL (value), op_mode);
--- gcc/testsuite/gcc.dg/pr70174.c.jj   2016-03-11 10:42:45.914894656 +0100
+++ gcc/testsuite/gcc.dg/pr70174.c  2016-03-11 10:42:31.0 +0100
@@ -0,0 +1,11 @@
+/* PR rtl-optimization/70174 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct S { int f : 4; } a;
+  
+void
+foo (void)
+{ 
+  a.f = foo;   /* { dg-warning "assignment makes integer from pointer without a cast" } */
+}

Jakub

Re: [PATCH] Fix PR70013

2016-03-11 Thread Richard Biener

On Fri, Mar 11, 2016 at 11:14 AM, Alan Lawrence  wrote:
> In this PR, a packed structure containing bitfields loses part of its
> constant-pool initialization in SRA.
>
> A fuller explanation is on the PR:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013#c11. In short, we need to
> treat constant-pool entries like function parameters, as both come
> 'pre-initialized' before the function starts.
>
> Bootstrapped + regtest (gcc, g++) on AArch64; same with addition of Ada, on
> ARM and x86.
>
> OK for trunk?

Ok.

Richard.

> gcc/ChangeLog:
>
> PR tree-optimization/70013
> * tree-sra.c (analyze_access_subtree): Also set grp_unscalarized_data
> for constant-pool entries.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/sra-20.c: New.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/sra-20.c | 20 
>  gcc/tree-sra.c |  3 ++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
> new file mode 100644
> index 000..5002c24
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -Wall" } */
> +/* PR/70013, SRA of constant-pool loads removes initialization of part of d.  */
> +#pragma pack (1)
> +struct S0 {
> +  unsigned f0 : 17;
> +};
> +
> +int c;
> +
> +int
> +main (int argc, char **argv)
> +{
> +  struct S0 d[] = { { 1 }, { 2 } };
> +  struct S0 e = d[1];
> +
> +  c = d[0].f0;
> +  __builtin_printf ("%x\n", e.f0);
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 72157ed..24eac6a 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2427,7 +2427,8 @@ analyze_access_subtree (struct access *root, struct access *parent,
>
>if (!hole || root->grp_total_scalarization)
>  root->grp_covered = 1;
> -  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
> +  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL
> +  || constant_decl_p (root->base))
>  root->grp_unscalarized_data = 1; /* not covered and written to */
>return sth_created;
>  }
> --
> 1.9.1
>

Re: [PATCH] Fix ICE during insv expansion (PR rtl-optimization/70174)

2016-03-11 Thread Richard Biener

On Fri, 11 Mar 2016, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on i?86/x86_64 and aarch64, because gen_lowpart
> (pointer to gen_lowpart_general at that spot) doesn't want to handle
> SUBREG of SYMBOL_REF.  Fixed by using a variant that doesn't ICE and forcing
> the operand into a register if it can't be optimized without generating
> instructions.  This is similar to the case a few lines above, where we also
> force_reg if simplify_subreg failed.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2016-03-11  Jakub Jelinek  
> 
>   PR rtl-optimization/70174
>   * expmed.c (store_bit_field_using_insv): Use gen_lowpart_if_possible
>   followed by gen_lowpart on force_reg instead of just gen_lowpart.
> 
>   * gcc.dg/pr70174.c: New test.
> 
> --- gcc/expmed.c.jj   2016-02-11 20:28:51.0 +0100
> +++ gcc/expmed.c  2016-03-11 10:40:27.258719168 +0100
> @@ -658,24 +658,28 @@ store_bit_field_using_insv (const extrac
>  {
>if (GET_MODE_BITSIZE (GET_MODE (value)) >= bitsize)
>   {
> +   rtx tmp;
> /* Optimization: Don't bother really extending VALUE
>if it has all the bits we will actually use.  However,
>if we must narrow it, be sure we do it correctly.  */
>  
> if (GET_MODE_SIZE (GET_MODE (value)) < GET_MODE_SIZE (op_mode))
>   {
> -   rtx tmp;
> -
> tmp = simplify_subreg (op_mode, value1, GET_MODE (value), 0);
> if (! tmp)
>   tmp = simplify_gen_subreg (op_mode,
>  force_reg (GET_MODE (value),
> value1),
>  GET_MODE (value), 0);
> -   value1 = tmp;
>   }
> else
> - value1 = gen_lowpart (op_mode, value1);
> + {
> +   tmp = gen_lowpart_if_possible (op_mode, value1);
> +   if (! tmp)
> + tmp = gen_lowpart (op_mode, force_reg (GET_MODE (value),
> +value1));
> + }
> +   value1 = tmp;
>   }
>else if (CONST_INT_P (value))
>   value1 = gen_int_mode (INTVAL (value), op_mode);
> --- gcc/testsuite/gcc.dg/pr70174.c.jj 2016-03-11 10:42:45.914894656 +0100
> +++ gcc/testsuite/gcc.dg/pr70174.c2016-03-11 10:42:31.0 +0100
> @@ -0,0 +1,11 @@
> +/* PR rtl-optimization/70174 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +struct S { int f : 4; } a;
> +  
> +void
> +foo (void)
> +{ 
> +  a.f = foo; /* { dg-warning "assignment makes integer from pointer without a cast" } */
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[PATCH] Fix ICE during niter computation (PR tree-optimization/70177)

2016-03-11 Thread Jakub Jelinek

Hi!

On the following testcase we ICE, because we call the 2-operand
extract_ops_from_tree on a COND_EXPR, and that inline wrapper asserts that
there is no third operand.  derive_constant_upper_bound_ops has a big
switch on various tree codes, but doesn't handle any 3 argument ones right
now, so there is no need to pass the extra argument there, but we just
shouldn't ICE on it.

While at it, I've renamed extract_ops_from_tree_1 to
extract_ops_from_tree so that we can use C++ function overloading on
something where it makes lots of sense.
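
A reduced sketch of the overloading pattern follows. Expr, extract_ops and operand_count are hypothetical names on a toy expression type, not the gimple-expr.h declarations. The 3-operand overload does the work, the 2-operand overload wraps it and asserts that no third operand exists, and a caller that may see ternary expressions calls the 3-operand form and ignores what it does not need.

```cpp
#include <cassert>

// Toy expression node with up to three operands; unused slots are null.
struct Expr {
  int code;
  const Expr *ops[3];
};

// General form: hands back all three operand slots.
inline void extract_ops(const Expr &e, int *code, const Expr **op0,
                        const Expr **op1, const Expr **op2) {
  *code = e.code;
  *op0 = e.ops[0];
  *op1 = e.ops[1];
  *op2 = e.ops[2];
}

// Overload for callers that expect at most two operands.
inline void extract_ops(const Expr &e, int *code, const Expr **op0,
                        const Expr **op1) {
  const Expr *op2;
  extract_ops(e, code, op0, op1, &op2);
  assert(op2 == nullptr);  // a ternary expression here is a caller bug
}

// A caller that tolerates ternary expressions uses the general form.
inline int operand_count(const Expr &e) {
  int code;
  const Expr *a, *b, *c;
  extract_ops(e, &code, &a, &b, &c);
  return (a != nullptr) + (b != nullptr) + (c != nullptr);
}
```

Overload resolution picks the form from the number of arguments, so the two entry points can share one name without ambiguity.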

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-11  Jakub Jelinek  

PR tree-optimization/70177
* gimple-expr.h (extract_ops_from_tree_1): Renamed to ...
(extract_ops_from_tree): ... this.  In the 2 argument
overload remove _1 suffix.
* gimple-expr.c (extract_ops_from_tree_1): Renamed to ...
(extract_ops_from_tree): ... this.
* gimple.c (gimple_build_assign, gimple_assign_set_rhs_from_tree):
Adjust callers.
* tree-ssa-loop-niter.c (derive_constant_upper_bound): Likewise.
* tree-ssa-forwprop.c (defcodefor_name): Call 3 operand
extract_ops_from_tree instead of 2 operand one.

* gcc.dg/pr70177.c: New test.

--- gcc/gimple-expr.h.jj2016-01-04 14:55:50.0 +0100
+++ gcc/gimple-expr.h   2016-03-11 10:53:22.867513568 +0100
@@ -35,8 +35,8 @@ extern tree create_tmp_reg (tree, const
 extern tree create_tmp_reg_fn (struct function *, tree, const char *);
 
 
-extern void extract_ops_from_tree_1 (tree, enum tree_code *, tree *, tree *,
-tree *);
+extern void extract_ops_from_tree (tree, enum tree_code *, tree *, tree *,
+  tree *);
 extern void gimple_cond_get_ops_from_tree (tree, enum tree_code *, tree *,
   tree *);
 extern bool is_gimple_lvalue (tree);
@@ -146,15 +146,15 @@ is_gimple_constant (const_tree t)
 }
 }
 
-/* A wrapper around extract_ops_from_tree_1, for callers which expect
-   to see only a maximum of two operands.  */
+/* A wrapper around extract_ops_from_tree with 3 ops, for callers which
+   expect to see only a maximum of two operands.  */
 
 static inline void
 extract_ops_from_tree (tree expr, enum tree_code *code, tree *op0,
   tree *op1)
 {
   tree op2;
-  extract_ops_from_tree_1 (expr, code, op0, op1, &op2);
+  extract_ops_from_tree (expr, code, op0, op1, &op2);
   gcc_assert (op2 == NULL_TREE);
 }
 
--- gcc/gimple-expr.c.jj2016-01-07 09:45:20.0 +0100
+++ gcc/gimple-expr.c   2016-03-11 10:53:38.965302034 +0100
@@ -519,8 +519,8 @@ create_tmp_reg_fn (struct function *fn,
*OP1_P, *OP2_P and *OP3_P respectively.  */
 
 void
-extract_ops_from_tree_1 (tree expr, enum tree_code *subcode_p, tree *op1_p,
-tree *op2_p, tree *op3_p)
+extract_ops_from_tree (tree expr, enum tree_code *subcode_p, tree *op1_p,
+  tree *op2_p, tree *op3_p)
 {
   enum gimple_rhs_class grhs_class;
 
--- gcc/gimple.c.jj 2016-01-19 13:31:09.0 +0100
+++ gcc/gimple.c2016-03-11 10:52:36.124128366 +0100
@@ -387,7 +387,7 @@ gimple_build_assign (tree lhs, tree rhs
   enum tree_code subcode;
   tree op1, op2, op3;
 
-  extract_ops_from_tree_1 (rhs, &subcode, &op1, &op2, &op3);
+  extract_ops_from_tree (rhs, &subcode, &op1, &op2, &op3);
   return gimple_build_assign (lhs, subcode, op1, op2, op3 PASS_MEM_STAT);
 }
 
@@ -1578,7 +1578,7 @@ gimple_assign_set_rhs_from_tree (gimple_
   enum tree_code subcode;
   tree op1, op2, op3;
 
-  extract_ops_from_tree_1 (expr, &subcode, &op1, &op2, &op3);
+  extract_ops_from_tree (expr, &subcode, &op1, &op2, &op3);
   gimple_assign_set_rhs_with_ops (gsi, subcode, op1, op2, op3);
 }
 
--- gcc/tree-ssa-forwprop.c.jj  2016-01-04 14:55:52.0 +0100
+++ gcc/tree-ssa-forwprop.c 2016-03-11 10:53:54.961091841 +0100
@@ -1477,7 +1477,7 @@ defcodefor_name (tree name, enum tree_co
   || GIMPLE_BINARY_RHS
   || GIMPLE_UNARY_RHS
   || GIMPLE_SINGLE_RHS)
-extract_ops_from_tree_1 (name, &code1, &arg11, &arg21, &arg31);
+extract_ops_from_tree (name, &code1, &arg11, &arg21, &arg31);
 
   *code = code1;
   *arg1 = arg11;
--- gcc/tree-ssa-loop-niter.c.jj2016-02-24 14:52:16.0 +0100
+++ gcc/tree-ssa-loop-niter.c   2016-03-11 10:54:38.463520194 +0100
@@ -2742,9 +2742,9 @@ static widest_int
 derive_constant_upper_bound (tree val)
 {
   enum tree_code code;
-  tree op0, op1;
+  tree op0, op1, op2;
 
-  extract_ops_from_tree (val, &code, &op0, &op1);
+  extract_ops_from_tree (val, &code, &op0, &op1, &op2);
   return derive_constant_upper_bound_ops (TREE_TYPE (val), op0, code, op1);
 }
 
--- gcc/testsuite/gcc.dg/pr70177.c.jj   2016-03-11 10:58:49.211225229 +0100
+++ gcc/testsuite/gcc.dg/pr70177.c  2016-03-11 10:58:40.0 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/70177 */
+/* { dg-do compile } */
+/* {

Re: [PATCH] Add -funconstrained-commons to work around PR/69368 (and others) in SPEC2006

2016-03-11 Thread Alan Lawrence

On 10/03/16 16:18, Dominique d'Humières wrote:

> The test gfortran.dg/unconstrained_commons.f fails in the 32 bit mode. It
> needs some regexp

Indeed, confirmed on ARM, sorry for not spotting this earlier.

I believe the variable, if there is one, should always be called 'j', as it is
in the source. So how about this, tested on ARM, AArch64, x86_64?
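
The effect of widening the pattern with an optional 'j' can be checked outside the testsuite. The sketch below uses C++ std::regex (ECMAScript syntax) as a stand-in for the Tcl regexp in the dg-final directive, so the escaping differs from the test file; matches_old and matches_new are hypothetical helper names, and the sample dump lines in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Pattern before the change: the index must be an anonymous SSA name.
inline bool matches_old(const std::string &dump_line) {
  static const std::regex re(R"(  mycommon\.x\[_[0-9]+\] = _[0-9]+;)");
  return std::regex_search(dump_line, re);
}

// Pattern after the change: the index may also be a named SSA name j_N.
inline bool matches_new(const std::string &dump_line) {
  static const std::regex re(R"(  mycommon\.x\[j?_[0-9]+\] = _[0-9]+;)");
  return std::regex_search(dump_line, re);
}
```

With these helpers, a line such as "  mycommon.x[_5] = _7;" is accepted by both patterns, while "  mycommon.x[j_12] = _7;" is accepted only by the widened one.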

gcc/testsuite/ChangeLog:

* gfortran.dg/unconstrained_commons.f: Widen regexp to match j_
---
 gcc/testsuite/gfortran.dg/unconstrained_commons.f | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gfortran.dg/unconstrained_commons.f b/gcc/testsuite/gfortran.dg/unconstrained_commons.f
index f9fc471..bee67ab 100644
--- a/gcc/testsuite/gfortran.dg/unconstrained_commons.f
+++ b/gcc/testsuite/gfortran.dg/unconstrained_commons.f
@@ -17,4 +17,4 @@
 ! { dg-final { scan-tree-dump-not "FIND" "dom2" } }
 ! We should retain both a read and write of mycommon.x.
 ! { dg-final { scan-tree-dump-times "  _\[0-9\]+ = mycommon\\.x\\\[_\[0-9\]+\\\];" 1 "dom2" } }
-! { dg-final { scan-tree-dump-times "  mycommon\\.x\\\[_\[0-9\]+\\\] = _\[0-9\]+;" 1 "dom2" } }
+! { dg-final { scan-tree-dump-times "  mycommon\\.x\\\[j?_\[0-9\]+\\\] = _\[0-9\]+;" 1 "dom2" } }
-- 
1.9.1

Re: [PATCH] Add -funconstrained-commons to work around PR/69368 (and others) in SPEC2006

2016-03-11 Thread Jakub Jelinek

On Fri, Mar 11, 2016 at 12:11:42PM +, Alan Lawrence wrote:
> On 10/03/16 16:18, Dominique d'Humières wrote:
> 
> > The test gfortran.dg/unconstrained_commons.f fails in the 32 bit mode. It
> > needs some regexp
> 
> Indeed, confirmed on ARM, sorry for not spotting this earlier.
> 
> I believe the variable, if there is one, should always be called 'j', as it is
> in the source. So how about this, tested on ARM, AArch64, x86_64?
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/unconstrained_commons.f: Widen regexp to match j_

Ok.

>  gcc/testsuite/gfortran.dg/unconstrained_commons.f | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gfortran.dg/unconstrained_commons.f b/gcc/testsuite/gfortran.dg/unconstrained_commons.f
> index f9fc471..bee67ab 100644
> --- a/gcc/testsuite/gfortran.dg/unconstrained_commons.f
> +++ b/gcc/testsuite/gfortran.dg/unconstrained_commons.f
> @@ -17,4 +17,4 @@
>  ! { dg-final { scan-tree-dump-not "FIND" "dom2" } }
>  ! We should retain both a read and write of mycommon.x.
>  ! { dg-final { scan-tree-dump-times "  _\[0-9\]+ = mycommon\\.x\\\[_\[0-9\]+\\\];" 1 "dom2" } }
> -! { dg-final { scan-tree-dump-times "  mycommon\\.x\\\[_\[0-9\]+\\\] = _\[0-9\]+;" 1 "dom2" } }
> +! { dg-final { scan-tree-dump-times "  mycommon\\.x\\\[j?_\[0-9\]+\\\] = _\[0-9\]+;" 1 "dom2" } }

Jakub

Re: [PATCH] Fix ICE during niter computation (PR tree-optimization/70177)

2016-03-11 Thread Richard Biener

On Fri, 11 Mar 2016, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase we ICE, because we call the 2-operand
> extract_ops_from_tree on a COND_EXPR, and that inline wrapper asserts that
> there is no third operand.  derive_constant_upper_bound_ops has a big
> switch on various tree codes, but doesn't handle any 3 argument ones right
> now, so there is no need to pass the extra argument there, but we just
> shouldn't ICE on it.
> 
> While at it, I've renamed extract_ops_from_tree_1 to
> extract_ops_from_tree so that we can use C++ function overloading on
> something where it makes lots of sense.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2016-03-11  Jakub Jelinek  
> 
>   PR tree-optimization/70177
>   * gimple-expr.h (extract_ops_from_tree_1): Renamed to ...
>   (extract_ops_from_tree): ... this.  In the 2 argument
>   overload remove _1 suffix.
>   * gimple-expr.c (extract_ops_from_tree_1): Renamed to ...
>   (extract_ops_from_tree): ... this.
>   * gimple.c (gimple_build_assign, gimple_assign_set_rhs_from_tree):
>   Adjust callers.
>   * tree-ssa-loop-niter.c (derive_constant_upper_bound): Likewise.
>   * tree-ssa-forwprop.c (defcodefor_name): Call 3 operand
>   extract_ops_from_tree instead of 2 operand one.
> 
>   * gcc.dg/pr70177.c: New test.
> 
> --- gcc/gimple-expr.h.jj  2016-01-04 14:55:50.0 +0100
> +++ gcc/gimple-expr.h 2016-03-11 10:53:22.867513568 +0100
> @@ -35,8 +35,8 @@ extern tree create_tmp_reg (tree, const
>  extern tree create_tmp_reg_fn (struct function *, tree, const char *);
>  
>  
> -extern void extract_ops_from_tree_1 (tree, enum tree_code *, tree *, tree *,
> -  tree *);
> +extern void extract_ops_from_tree (tree, enum tree_code *, tree *, tree *,
> +tree *);
>  extern void gimple_cond_get_ops_from_tree (tree, enum tree_code *, tree *,
>  tree *);
>  extern bool is_gimple_lvalue (tree);
> @@ -146,15 +146,15 @@ is_gimple_constant (const_tree t)
>  }
>  }
>  
> -/* A wrapper around extract_ops_from_tree_1, for callers which expect
> -   to see only a maximum of two operands.  */
> +/* A wrapper around extract_ops_from_tree with 3 ops, for callers which
> +   expect to see only a maximum of two operands.  */
>  
>  static inline void
>  extract_ops_from_tree (tree expr, enum tree_code *code, tree *op0,
>  tree *op1)
>  {
>tree op2;
> -  extract_ops_from_tree_1 (expr, code, op0, op1, &op2);
> +  extract_ops_from_tree (expr, code, op0, op1, &op2);
>gcc_assert (op2 == NULL_TREE);
>  }
>  
> --- gcc/gimple-expr.c.jj  2016-01-07 09:45:20.0 +0100
> +++ gcc/gimple-expr.c 2016-03-11 10:53:38.965302034 +0100
> @@ -519,8 +519,8 @@ create_tmp_reg_fn (struct function *fn,
> *OP1_P, *OP2_P and *OP3_P respectively.  */
>  
>  void
> -extract_ops_from_tree_1 (tree expr, enum tree_code *subcode_p, tree *op1_p,
> -  tree *op2_p, tree *op3_p)
> +extract_ops_from_tree (tree expr, enum tree_code *subcode_p, tree *op1_p,
> +tree *op2_p, tree *op3_p)
>  {
>enum gimple_rhs_class grhs_class;
>  
> --- gcc/gimple.c.jj   2016-01-19 13:31:09.0 +0100
> +++ gcc/gimple.c  2016-03-11 10:52:36.124128366 +0100
> @@ -387,7 +387,7 @@ gimple_build_assign (tree lhs, tree rhs
>enum tree_code subcode;
>tree op1, op2, op3;
>  
> -  extract_ops_from_tree_1 (rhs, &subcode, &op1, &op2, &op3);
> +  extract_ops_from_tree (rhs, &subcode, &op1, &op2, &op3);
>return gimple_build_assign (lhs, subcode, op1, op2, op3 PASS_MEM_STAT);
>  }
>  
> @@ -1578,7 +1578,7 @@ gimple_assign_set_rhs_from_tree (gimple_
>enum tree_code subcode;
>tree op1, op2, op3;
>  
> -  extract_ops_from_tree_1 (expr, &subcode, &op1, &op2, &op3);
> +  extract_ops_from_tree (expr, &subcode, &op1, &op2, &op3);
>gimple_assign_set_rhs_with_ops (gsi, subcode, op1, op2, op3);
>  }
>  
> --- gcc/tree-ssa-forwprop.c.jj2016-01-04 14:55:52.0 +0100
> +++ gcc/tree-ssa-forwprop.c   2016-03-11 10:53:54.961091841 +0100
> @@ -1477,7 +1477,7 @@ defcodefor_name (tree name, enum tree_co
>  || GIMPLE_BINARY_RHS
>  || GIMPLE_UNARY_RHS
>  || GIMPLE_SINGLE_RHS)
> -extract_ops_from_tree_1 (name, &code1, &arg11, &arg21, &arg31);
> +extract_ops_from_tree (name, &code1, &arg11, &arg21, &arg31);
>  
>*code = code1;
>*arg1 = arg11;
> --- gcc/tree-ssa-loop-niter.c.jj  2016-02-24 14:52:16.0 +0100
> +++ gcc/tree-ssa-loop-niter.c 2016-03-11 10:54:38.463520194 +0100
> @@ -2742,9 +2742,9 @@ static widest_int
>  derive_constant_upper_bound (tree val)
>  {
>enum tree_code code;
> -  tree op0, op1;
> +  tree op0, op1, op2;
>  
> -  extract_ops_from_tree (val, &code, &op0, &op1);
> +  extract_ops_from_tree (val, &code, &op0, &op1, &op2);
>return derive_constant_upper_bound_ops (TREE_TYPE (val), op0, code, op1);
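
The overloading pattern described in the patch above can be sketched in isolation. This is a minimal illustration, not GCC code: `Expr`, `extract_ops` and `operand_count` are hypothetical stand-ins for `tree`, `extract_ops_from_tree` and a caller such as `derive_constant_upper_bound`. The two-operand overload asserts that no third operand exists, so a caller that may see a three-operand expression has to use the three-operand overload.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's tree expression; not the real type.
struct Expr {
  int code;
  const Expr *ops[3];   // unused operand slots are null
};

// Three-operand form: always fills all three outputs.
void extract_ops (const Expr &e, int *code, const Expr **op0,
                  const Expr **op1, const Expr **op2)
{
  *code = e.code;
  *op0 = e.ops[0];
  *op1 = e.ops[1];
  *op2 = e.ops[2];
}

// Two-operand overload with the same name: it forwards to the
// three-operand form and asserts that the third operand is absent.
void extract_ops (const Expr &e, int *code, const Expr **op0,
                  const Expr **op1)
{
  const Expr *op2;
  extract_ops (e, code, op0, op1, &op2);
  assert (op2 == nullptr);
}

// A caller that may be handed a ternary expression uses the
// three-operand form, so the assertion above is never reached.
int operand_count (const Expr &e)
{
  int code;
  const Expr *op0, *op1, *op2;
  extract_ops (e, &code, &op0, &op1, &op2);
  return (op0 != nullptr) + (op1 != nullptr) + (op2 != nullptr);
}
```

Calling the two-operand overload on an expression with three operands would trip the assertion, which is the failure mode the patch description attributes to COND_EXPR.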

[PATCH][GCC 7] Fix PR70171

2016-03-11 Thread Richard Biener


The following teaches phiprop to handle the case of aggregate copies
where the aggregate has non-BLKmode which means it is very likely
expanded as reg-reg moves (any better test for that apart from
checking for non-BLKmode?).  This improves code for the testcase
from

_Z14struct_ternary1SS_b:
.LFB2:
.cfi_startproc
leaq-40(%rsp), %rcx
leaq-24(%rsp), %rax
testb   %dl, %dl
movl%edi, -24(%rsp)
movl%esi, -40(%rsp)
cmove   %rcx, %rax
movl(%rax), %eax
ret

to

_Z14struct_ternary1SS_b:
.LFB2:
.cfi_startproc
testb   %dl, %dl
movl%edi, %eax
cmove   %esi, %eax
ret

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
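
For readers without the testcase at hand, the mangled name `_Z14struct_ternary1SS_b` demangles to `struct_ternary(S, S, bool)`. A source of roughly the following shape would match that signature and the assembly above. This is a hypothetical reconstruction: the real g++.dg/tree-ssa/pr70171.C is not shown in this message, and the member and parameter names are assumptions.

```cpp
// Hypothetical reconstruction of the testcase shape. Only the signature
// struct_ternary(S, S, bool) is implied by the mangled name.
struct S { int i; };   // assumed to be small enough to live in a register

S struct_ternary (S a, S b, bool select)
{
  // An aggregate copy chosen by a condition: the value returned is
  // a when select is true and b otherwise.
  return select ? a : b;
}
```

The second assembly listing corresponds to this reading: `%edi` (the first argument) is moved to `%eax`, and `cmove` replaces it with `%esi` (the second argument) when `%dl` is zero.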

Richard.

2016-03-11  Richard Biener  

PR tree-optimization/70171
* tree-ssa-phiprop.c: Include stor-layout.h.
(phiprop_insert_phi): Handle the aggregate copy case.
(propagate_with_phi): Likewise.

* g++.dg/tree-ssa/pr70171.C: New testcase.

Index: gcc/tree-ssa-phiprop.c
===
*** gcc/tree-ssa-phiprop.c  (revision 234134)
--- gcc/tree-ssa-phiprop.c  (working copy)
*** along with GCC; see the file COPYING3.
*** 31,36 
--- 31,37 
  #include "tree-eh.h"
  #include "gimplify.h"
  #include "gimple-iterator.h"
+ #include "stor-layout.h"
  
  /* This pass propagates indirect loads through the PHI node for its
 address to make the load source possibly non-addressable and to
*** phiprop_insert_phi (basic_block bb, gphi
*** 132,138 
struct phiprop_d *phivn, size_t n)
  {
tree res;
!   gphi *new_phi;
edge_iterator ei;
edge e;
  
--- 133,139 
struct phiprop_d *phivn, size_t n)
  {
tree res;
!   gphi *new_phi = NULL;
edge_iterator ei;
edge e;
  
*** phiprop_insert_phi (basic_block bb, gphi
*** 142,148 
/* Build a new PHI node to replace the definition of
   the indirect reference lhs.  */
res = gimple_assign_lhs (use_stmt);
!   new_phi = create_phi_node (res, bb);
  
if (dump_file && (dump_flags & TDF_DETAILS))
  {
--- 143,150 
/* Build a new PHI node to replace the definition of
   the indirect reference lhs.  */
res = gimple_assign_lhs (use_stmt);
!   if (TREE_CODE (res) == SSA_NAME)
! new_phi = create_phi_node (res, bb);
  
if (dump_file && (dump_flags & TDF_DETAILS))
  {
*** phiprop_insert_phi (basic_block bb, gphi
*** 187,193 
{
  tree rhs = gimple_assign_rhs1 (use_stmt);
  gcc_assert (TREE_CODE (old_arg) == ADDR_EXPR);
! new_var = make_ssa_name (TREE_TYPE (rhs));
  if (!is_gimple_min_invariant (old_arg))
old_arg = PHI_ARG_DEF_FROM_EDGE (phi, e);
  else
--- 189,198 
{
  tree rhs = gimple_assign_rhs1 (use_stmt);
  gcc_assert (TREE_CODE (old_arg) == ADDR_EXPR);
! if (TREE_CODE (res) == SSA_NAME)
!   new_var = make_ssa_name (TREE_TYPE (rhs));
! else
!   new_var = unshare_expr (res);
  if (!is_gimple_min_invariant (old_arg))
old_arg = PHI_ARG_DEF_FROM_EDGE (phi, e);
  else
*** phiprop_insert_phi (basic_block bb, gphi
*** 210,222 
}
}
  
!   add_phi_arg (new_phi, new_var, e, locus);
  }
  
!   update_stmt (new_phi);
  
!   if (dump_file && (dump_flags & TDF_DETAILS))
! print_gimple_stmt (dump_file, new_phi, 0, 0);
  
return res;
  }
--- 215,231 
}
}
  
!   if (new_phi)
!   add_phi_arg (new_phi, new_var, e, locus);
  }
  
!   if (new_phi)
! {
!   update_stmt (new_phi);
  
!   if (dump_file && (dump_flags & TDF_DETAILS))
!   print_gimple_stmt (dump_file, new_phi, 0, 0);
! }
  
return res;
  }
*** propagate_with_phi (basic_block bb, gphi
*** 250,256 
tree type = NULL_TREE;
  
if (!POINTER_TYPE_P (TREE_TYPE (ptr))
!   || !is_gimple_reg_type (TREE_TYPE (TREE_TYPE (ptr))))
  return false;
  
/* Check if we can "cheaply" dereference all phi arguments.  */
--- 259,266 
tree type = NULL_TREE;
  
if (!POINTER_TYPE_P (TREE_TYPE (ptr))
!   || (!is_gimple_reg_type (TREE_TYPE (TREE_TYPE (ptr)))
! && TYPE_MODE (TREE_TYPE (TREE_TYPE (ptr))) == BLKmode))
  return false;
  
/* Check if we can "cheaply" dereference all phi arguments.  */
*** propagate_with_phi (basic_block bb, gphi
*** 306,312 
   
/* Check whether this is a load of *ptr.  */
if (!(is_gimple_assign (use_stmt)
-   && TREE_CODE (gimple_assign_lhs (use_stmt)) == SSA_NAME
&& gimple_assign_rhs_code (use_stmt) == MEM_REF
&& TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) == ptr
&& integer_zerop (TREE_OPERAND (gimple_assign_

Re: [C++ PATCH] Reuse certain cxx_eval_constant_expression results in cxx_eval_vec_init_1 (PR c++/70001)

2016-03-11 Thread Jason Merrill


On 03/10/2016 01:39 PM, Jakub Jelinek wrote:

+  /* Don't reuse the result of cxx_eval_constant_expression
+call if it isn't a constant initializer or if it requires
+relocations.  */


Let's phrase this positively ("Reuse the result if...").


+ if (new_ctx.ctor != ctx->ctor)
+   eltinit = new_ctx.ctor;
+ for (i = 1; i < max; ++i)
+   {
+ idx = build_int_cst (size_type_node, i);
+ CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
+   }


This is going to use the same CONSTRUCTOR for all the elements, which 
will lead to problems if we then store into a subobject of one of the 
elements and see that reflected in all the others as well.  We need to 
unshare_expr when reusing the initializer.


Jason

Re: [PATCH] Fix PR c++/70106 (type of parenthesized qualified-id has wrong cv-qualifiers)

2016-03-11 Thread Jason Merrill


OK.

Jason

Re: [C++ PATCH] Reuse certain cxx_eval_constant_expression results in cxx_eval_vec_init_1 (PR c++/70001)

2016-03-11 Thread Jakub Jelinek

On Fri, Mar 11, 2016 at 09:27:54AM -0500, Jason Merrill wrote:
> On 03/10/2016 01:39 PM, Jakub Jelinek wrote:
> >+  /* Don't reuse the result of cxx_eval_constant_expression
> >+ call if it isn't a constant initializer or if it requires
> >+ relocations.  */
> 
> Let's phrase this positively ("Reuse the result if...").
> 
> >+  if (new_ctx.ctor != ctx->ctor)
> >+eltinit = new_ctx.ctor;
> >+  for (i = 1; i < max; ++i)
> >+{
> >+  idx = build_int_cst (size_type_node, i);
> >+  CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
> >+}
> 
> This is going to use the same CONSTRUCTOR for all the elements, which will
> lead to problems if we then store into a subobject of one of the elements
> and see that reflected in all the others as well.  We need to unshare_expr
> when reusing the initializer.

Well, but then even what has been already committed is unsafe,
initializer_constant_valid_p can return null_pointer_node even on
CONSTRUCTOR, or CONSTRUCTOR holding CONSTRUCTORs etc.
So, either we need to unshare_expr it in every case, so
CONSTRUCTOR_APPEND_ELT (*p, idx, unshare_expr (eltinit));
or alternatively we could use some flag on CONSTRUCTOR to mark (possibly)
shared ctors and unshare them upon constexpr store to them, or
unshare whenever we store.

What would be a testcase for the unsharing?
Following still works:

// PR c++/70001
// { dg-do compile { target c++14 } }

struct B
{
  int a;
  constexpr B () : a (0) { }
  constexpr B (int x) : a (x) { }
};
struct C
{
  B c;
  constexpr C () : c (0) { }
};
struct A
{
  B b[1 << 4];
};
struct D
{
  C d[1 << 4];
};

constexpr int
foo (int a, int b)
{
  A c;
  c.b[a].a += b;
  c.b[b].a += a;
  return c.b[0].a + c.b[a].a + c.b[b].a;
}

constexpr int d = foo (1, 2);
constexpr int e = foo (0, 3);
constexpr int f = foo (2, 4);
static_assert (d == 3 && e == 6 && f == 6, "");

constexpr int
bar (int a, int b)
{
  D c;
  c.d[a].c.a += b;
  c.d[b].c.a += a;
  return c.d[0].c.a + c.d[a].c.a + c.d[b].c.a;
}

constexpr int g = bar (1, 2);
constexpr int h = bar (0, 3);
constexpr int i = bar (2, 4);
static_assert (g == 3 && h == 6 && i == 6, "");

Jakub
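
The sharing concern raised in this thread can be illustrated outside the compiler. The sketch below is an analogy only: `Ctor`, `fill_shared` and `fill_unshared` are hypothetical names, and `std::shared_ptr` stands in for GCC's tree pointers. It shows why appending the same initializer node for every element differs from appending a fresh copy (the role `unshare_expr` plays in the discussion).

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a CONSTRUCTOR node; not GCC's tree type.
struct Ctor { int value; };
using CtorPtr = std::shared_ptr<Ctor>;

// Every element refers to the same node, so a store through one
// element is visible through all of them.
std::vector<CtorPtr> fill_shared (const CtorPtr &init, std::size_t n)
{
  return std::vector<CtorPtr> (n, init);
}

// Every element gets its own copy of the initializer, analogous to
// unsharing the initializer before appending it.
std::vector<CtorPtr> fill_unshared (const CtorPtr &init, std::size_t n)
{
  std::vector<CtorPtr> elts;
  elts.reserve (n);
  for (std::size_t i = 0; i < n; ++i)
    elts.push_back (std::make_shared<Ctor> (*init));
  return elts;
}
```

With `fill_shared`, incrementing `value` through element 1 also changes what element 0 reports; with `fill_unshared` the elements stay independent. Whether the constexpr evaluator actually stores into a shared node is exactly the open question in the thread, so this sketch does not claim to reproduce a GCC bug.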

[PATCH, PR70045] Unshare create_empty_if_region_on_edge argument

2016-03-11 Thread Tom de Vries


Hi,

this patch fixes PR70045, a graphite 6 regression.

The problem is as follows: in graphite_create_new_loop_guard, a 
condition cond_expr is constructed using an upper bound expression *ub.


During the call:
...
exit_edge = create_empty_if_region_on_edge (entry_edge, cond_expr);
...
the cond_expr is modified in place, which has as side-effect that *ub is 
modified.


The patch fixes this by unsharing the cond_expr before passing it as 
argument to create_empty_if_region_on_edge.
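
The failure mode described above, a callee that rewrites its argument in place while part of that argument is still referenced elsewhere, can be sketched generically. The names `Node`, `unshare` and `rewrite_in_place` are hypothetical and only model the situation; they are not the graphite or tree APIs.

```cpp
#include <memory>

// Hypothetical expression node; operand may be shared with other
// expressions, as *ub is shared with cond_expr in the description.
struct Node {
  int value;
  std::shared_ptr<Node> operand;
};

// Deep copy of an expression, playing the role of unshare_expr.
std::shared_ptr<Node> unshare (const std::shared_ptr<Node> &n)
{
  if (!n)
    return nullptr;
  auto copy = std::make_shared<Node> ();
  copy->value = n->value;
  copy->operand = unshare (n->operand);
  return copy;
}

// A callee that modifies every node of its argument in place.
void rewrite_in_place (const std::shared_ptr<Node> &n)
{
  for (Node *p = n.get (); p; p = p->operand.get ())
    p->value = -p->value;
}
```

If `cond` has `ub` as its operand, `rewrite_in_place (unshare (cond))` leaves `ub` untouched, whereas `rewrite_in_place (cond)` changes `ub` as a side effect.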


Bootstrapped and reg-tested on x86_64.

OK for stage4 trunk?

Thanks,
- Tom
Unshare create_empty_if_region_on_edge argument

2016-03-11  Tom de Vries  

	PR tree-optimization/70045
	* graphite-isl-ast-to-gimple.c (graphite_create_new_loop_guard): Unshare
	create_empty_if_region_on_edge argument.

	* gcc.dg/graphite/pr70045.c: New test.

---
 gcc/graphite-isl-ast-to-gimple.c|  3 ++-
 gcc/testsuite/gcc.dg/graphite/pr70045.c | 28 
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 89a4118..8dd5dc8 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -821,7 +821,8 @@ graphite_create_new_loop_guard (edge entry_edge,
   if (integer_onep (cond_expr))
 exit_edge = entry_edge;
   else
-exit_edge = create_empty_if_region_on_edge (entry_edge, cond_expr);
+exit_edge = create_empty_if_region_on_edge (entry_edge,
+		unshare_expr (cond_expr));
 
   return exit_edge;
 }
diff --git a/gcc/testsuite/gcc.dg/graphite/pr70045.c b/gcc/testsuite/gcc.dg/graphite/pr70045.c
new file mode 100644
index 000..9f98b1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr70045.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -floop-interchange" } */
+
+int a, b, d, e, f;
+int c[9];
+void
+fn1 ()
+{
+  e = 1;
+  for (; e >= 0; e--)
+{
+  d = 1;
+  for (; d >= 0; d--)
+	{
+	  f = 0;
+	  for (; f <= 1; f++)
+	{
+	  a = 0;
+	  for (; a < 9; a++)
+		{
+		  b = 0;
+		  for (; b < 2; b++)
+		c[a + b] = 3;
+		}
+	}
+	}
+}
+}

[PATCH][AArch64] Fix gcc.target/aarch64/vect-reduc-or_1.c for -mcpu=cortex-a57

2016-03-11 Thread Kyrill Tkachov


Hi all,

I've been seeing this test FAIL for a toolchain configured with
--with-cpu=cortex-a57 in the scan vectoriser dump check because the cost
model for -mtune=cortex-a57 decides not to vectorise.

This patch disables the vectoriser cost model and makes this test pass on all
configurations. I think this test is supposed to test the capabilities of the
vectoriser rather than the cost model decisions (we'd have to add a specific
-mtune option otherwise)
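
For context, an OR reduction of the kind such a test exercises looks roughly like the loop below. This is a hypothetical sketch, not the contents of vect-reduc-or_1.c, and `or_reduce` is an assumed name. Whether the vectoriser transforms it depends on the cost model unless -fno-vect-cost-model is given.

```cpp
// Hypothetical OR-reduction loop: every element is folded into a
// single accumulator with bitwise OR.
unsigned char or_reduce (const unsigned char *in, int n)
{
  unsigned char acc = 0;
  for (int i = 0; i < n; ++i)
    acc |= in[i];
  return acc;
}
```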

Ok for trunk?

Thanks,
Kyrill

2016-03-11  Kyrylo Tkachov  

* gcc.target/aarch64/vect-reduc-or_1.c: Add -fno-vect-cost-model to
dg-options.
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c b/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
index cfb1231d69bbabac3da931cee2fd3fd786a29305..6261e9d1ea6fa8949d392543e08b880477a1ed5d 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 /* Write a reduction loop to be reduced using whole vector right shift.  */
 
 extern void abort (void);

Re: [PATCH][AArch64] PR target/70002: Make aarch64_set_current_function play nice with pragma resetting

2016-03-11 Thread Kyrill Tkachov



On 10/03/16 14:51, James Greenhalgh wrote:

On Thu, Mar 03, 2016 at 11:38:11AM +, Kyrill Tkachov wrote:

Hi all,

This patch fixes the ICE that was introduced by my earlier patch to
aarch64_set_current_function:
FAIL: gcc.dg/torture/pr52429.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)

And it also fixes a bug that I was working on separately relating to popping
pragmas.
The patch rewrites the aarch64_set_current_function implementation to be the
same as the one in the arm port that Christian wrote and which is simpler than
the existing implementation and has been tested for some time without problems.
I've thought this was the way to go but was hoping to do it for GCC 7 instead,
but I think given the ICE we'd rather have consistent implementations of this
hook between arm and aarch64 (and ideally this should be moved into the midend
for all targets, I don't see much target-specific information in the
implementation of this across the targets, but not at this stage).

Similar to that implementation the setting and restoring of the target globals
is factored into a separate function that is used in
aarch64_set_current_function and the pragma handling function to tell the
midend when to reinitialise its structures.

This patch fixes the ICE, the testcase attached, and passes bootstrap and
regression testing on aarch64-none-linux-gnu.

Sorry for missing the ICE originally.
Ok for trunk?

OK with the typos below fixed.


Thanks, I've committed the attached as r234141.

Kyrill

2016-03-11  Kyrylo Tkachov  

PR target/70002
* config/aarch64/aarch64-protos.h
(aarch64_save_restore_target_globals): New prototype.
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
Call the above when popping pragma.
* config/aarch64/aarch64.c (aarch64_save_restore_target_globals):
New function.
(aarch64_set_current_function): Rewrite using the above.

2016-03-11  Kyrylo Tkachov  

PR target/70002
PR target/69245
* gcc.target/aarch64/pr69245_2.c: New test.

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 3590ae0daa5d80050b0f81cd6ab9a7779f463516..e057daaec24c0add673d0b2c776d4c4c43d1f0ea 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -178,6 +178,12 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
  
cpp_opts->warn_unused_macros = saved_warn_unused_macros;
  
+  /* If we're popping or reseting make sure to update the globals so that

+ the optab availability predicates get recomputed.  */
+  if (pop_target)
+aarch64_save_restore_target_globals (pop_target);
+
+

Extra newline.


/* Initialize SIMD builtins if we haven't already.
   Set current_target_pragma to NULL for the duration so that
   the builtin initialization code doesn't try to tag the functions
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e4e49fc9ccc3d568c84b35c1a0c0733475017cca..c40d2b0c78494b50508c1b5135b8ee7676a61631 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -361,6 +361,7 @@ void aarch64_emit_call_insn (rtx);
  void aarch64_register_pragmas (void);
  void aarch64_relayout_simd_types (void);
  void aarch64_reset_previous_fndecl (void);
+void aarch64_save_restore_target_globals (tree);
  void aarch64_emit_approx_rsqrt (rtx, rtx);
  
  /* Initialize builtins for SIMD intrinsics.  */

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1e10d9798ddc5f5d2aac4255d3a8fe4ecaf1402a..a05160e08d0474ed9c1e2afa1d00375839417034 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8570,6 +8570,21 @@ aarch64_reset_previous_fndecl (void)
aarch64_previous_fndecl = NULL;
  }
  
+/* Restore or save the TREE_TARGET_GLOBALS from or to NEW_TREE.

+   Used by aarch64_set_current_function and aarch64_pragma_target_parse to
+   make sure optab availability predicates are recomputed when necessary.  */
+
+void
+aarch64_save_restore_target_globals (tree new_tree)
+{
+  if (TREE_TARGET_GLOBALS (new_tree))
+restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+  else if (new_tree == target_option_default_node)
+restore_target_globals (&default_target_globals);
+  else
+TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
+}
+
  /* Implement TARGET_SET_CURRENT_FUNCTION.  Unpack the codegen decisions
 like tuning and ISA features from the DECL_FUNCTION_SPECIFIC_TARGET
 of the function, if such exists.  This function may be called multiple
@@ -8579,63 +8594,32 @@ aarch64_reset_previous_fndecl (void)
  static void
  aarch64_set_current_function (tree fndecl)
  {
+  if (!fndecl || fndecl == aarch64_previous_fndecl)
+return;
+
tree old_tree = (aarch64_previous_fndecl
   ? DECL_FUNCTION_SPECIFIC_TARGET (aarch64_previous_fndecl)
   : NULL_TREE);
  
-  tree ne

[PATCH][ARM] PR driver/70132: Avoid double fclose in driver-arm.c

2016-03-11 Thread Kyrill Tkachov


Hi all,

As reported in the PR we can end up calling fclose twice on a file, causing an
error.
This patch fixes that by reorganising the logic a bit to ensure we return after
closing the file the first time.
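
As a general illustration of the hazard, one way to make a second fclose impossible is to tie the close to the lifetime of an owning object. The sketch below uses that RAII technique, which is a different approach from the control-flow reorganisation in this patch; `FileCloser` and `first_line_with_prefix` are hypothetical names and this is not the code in driver-arm.c.

```cpp
#include <cstdio>
#include <cstring>
#include <memory>
#include <string>

// Deleter that closes the stream exactly once, when the owner goes away.
struct FileCloser {
  void operator() (std::FILE *f) const { std::fclose (f); }
};

// Returns the first line of PATH that starts with PREFIX, or an empty
// string if the file cannot be opened or no line matches.
std::string first_line_with_prefix (const char *path, const char *prefix)
{
  std::unique_ptr<std::FILE, FileCloser> f (std::fopen (path, "r"));
  if (!f)
    return "";   // nothing was opened, so the deleter is not invoked

  char buf[256];
  while (std::fgets (buf, sizeof buf, f.get ()))
    if (std::strncmp (buf, prefix, std::strlen (prefix)) == 0)
      return buf;   // the stream is closed once on this path

  return "";        // and once on this path
}
```

Every return path after a successful fopen closes the stream exactly once, and the early return for a failed fopen closes nothing.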

Bootstrapped and tested on arm-none-linux-gnueabihf

Ok for trunk?

Thanks,
Kyrill

2016-03-11  Kyrylo Tkachov  

PR driver/70132
* config/arm/driver-arm.c (host_detect_local_cpu): Set file pointer
to NULL after closing file.
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index 466743b9d47f4144b6aade23e3f311405736ffa2..95dc9d53b6c179946d62f45b2b0d4a21960405b8 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -128,12 +128,11 @@ host_detect_local_cpu (int argc, const char **argv)
 	}
 }
 
-  fclose (f);
-
-  if (val == NULL)
-goto not_found;
-
-  return concat ("-m", argv[0], "=", val, NULL);
+  if (val)
+{
+  fclose (f);
+  return concat ("-m", argv[0], "=", val, NULL);
+ }
 
 not_found:
   {

Re: [PATCH][AArch64] Fix gcc.target/aarch64/vect-reduc-or_1.c for -mcpu=cortex-a57

2016-03-11 Thread James Greenhalgh

On Fri, Mar 11, 2016 at 03:19:54PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> I've been seeing this test FAIL for a toolchain configured with
> --with-cpu=cortex-a57 in the scan vectoriser dump check because the cost
> model for -mtune=cortex-a57 decides not to vectorise.
> 
> This patch disables the vectoriser cost model and makes this test pass on all
> configurations. I think this test is supposed to test the capabilities of the
> vectoriser rather than the cost model decisions (we'd have to add a specific
> -mtune option otherwise)
> 
> Ok for trunk?

Looks like the obvious fix to the problem you describe.

OK.

Thanks,
James

> 2016-03-11  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/vect-reduc-or_1.c: Add -fno-vect-cost-model to
> dg-options.

Re: Add C++ special math functions to C++17

2016-03-11 Thread Jonathan Wakely

The change approved in Jacksonville was to only add the special
functions to  and not

RFA [Patch] PR 45076 - [OOP] gfortran.dg/dynamic_dispatch_6.f03 ICEs with -fprofile-use

2016-03-11 Thread Dominique d'Humières

AFAICT pr45076 is fixed on the gcc-4.9, gcc-5 branches, and trunk. I have
borrowed the machinery in g++.dg/tree-prof/tree-prof.exp for the attached patch
and tested it on the three branches. Is it OK as such or is there a better way
to do the testing?

TIA

Dominique



patch-45076
Description: Binary data

Re: RFA [Patch] PR 45076 - [OOP] gfortran.dg/dynamic_dispatch_6.f03 ICEs with -fprofile-use

2016-03-11 Thread Mike Stump

On Mar 11, 2016, at 7:57 AM, Dominique d'Humières  wrote:
> AFAICT pr45076 is fixed on the gcc-4.9, gcc-5 branches, and trunk. I have 
> borrowed the machinery in g++.dg/tree-prof/tree-prof.exp for the attached 
> patch and tested it on the three branches. Is it OK as such or is there a 
> better way to do the testing?

Ok.  [ As always, anyone should feel free to chime in with suggestions, 
objections and improvements, if they know of any. ]

Re: Add C++ special math functions to C++17

2016-03-11 Thread Ed Smith-Rowland


On 03/11/2016 10:55 AM, Jonathan Wakely wrote:

The change approved in Jacksonville was to only add the special
functions to  and not 


That's easy.
OK, since they changed that and the macro and made it nonconditional I 
should also drop the old-style macros __WANT_MATH_CANNEVERREMEMBER__ and 
the old-style version macro.
So, in other words, we're *not* actually supporting TR29124 (that's fine 
by me).

We can tweak the web pages.
People on C++11, C++14 can use tr1.

Did they keep the Cisms like:

  float
  foobarf(float x);

  double
  foobar(double x);

  long double
  foobarl(long double x);

I am keeping an eye out for the words of the draft github.
Maybe I'll ping Axel and Walter.

Ed

Re: Add C++ special math functions to C++17

2016-03-11 Thread Jonathan Wakely

On 11 March 2016 at 16:31, Ed Smith-Rowland <3dw...@verizon.net> wrote:
> On 03/11/2016 10:55 AM, Jonathan Wakely wrote:
>>
>> The change approved in Jacksonville was to only add the special
>> functions to  and not 
>>
> That's easy.
> OK, since they changed that and the macro and made it nonconditional I
> should also drop the old-style macros __WANT_MATH_CANNEVERREMEMBER__ and the
> old-style version macro.
> So, in other words, we're *not* actually supporting TR29124 (that's fine by
> me).
> We can tweak the web pages.
> People on C++11, C++14 can use tr1.

I think we can keep 29124 support for pre-C++17, and for that case
 will include the functions (when the ICANNEVERREMEMBEREITHER
macro is defined).

> Did they keep the Cisms like:
>
>   float
>   foobarf(float x);
>
>   double
>   foobar(double x);
>
>   long double
>   foobarl(long double x);

Yes, I think so.

Let's do this after gcc6 though.

Re: [PATCH] Turn some compile-time tests into run-time tests

2016-03-11 Thread Patrick Palka

On Thu, Mar 10, 2016 at 6:38 PM, Patrick Palka  wrote:
> I ran the command
>
>   git grep -l "dg-do compile" | xargs grep -l __builtin_abort | xargs grep -lw main
>
> to find tests marked as compile-time tests that likely ought to instead
> be marked as run-time tests, by the rationale that they use
> __builtin_abort and they also define main().  (I also then confirmed that they
> compile, link and run cleanly on my machine.)
>
> After this patch, the remaining test files reported by the above command
> are:
>
>   These do not define all the functions they use:
> gcc/testsuite/g++.dg/ipa/devirt-41.C
> gcc/testsuite/g++.dg/ipa/devirt-44.C
> gcc/testsuite/g++.dg/ipa/devirt-45.C
> gcc/testsuite/gcc.target/i386/pr55672.c
>
>   These are non-x86 tests so I can't confirm that they run cleanly:
> gcc/testsuite/gcc.target/arm/pr58041.c
> gcc/testsuite/gcc.target/powerpc/pr35907.c
> gcc/testsuite/gcc.target/s390/dwarfregtable-1.c
> gcc/testsuite/gcc.target/s390/dwarfregtable-2.c
> gcc/testsuite/gcc.target/s390/dwarfregtable-3.c
>
>   These use dg-error:
> libstdc++-v3/testsuite/20_util/forward/c_neg.cc
> libstdc++-v3/testsuite/20_util/forward/f_neg.cc
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK to
> commit?  Does anyone have another heuristic one can use to help find
> these kinds of typos?
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp0x/constexpr-aggr2.C: Make it a run-time test.
> * g++.dg/cpp0x/nullptr32.C: Likewise.
> * g++.dg/cpp1y/digit-sep-cxx11-neg.C: Likewise.
> * g++.dg/cpp1y/digit-sep.C: Likewise.
> * g++.dg/ext/flexary13.C: Likewise.
> * gcc.dg/alias-14.c: Likewise.
> * gcc.dg/ipa/PR65282.c: Likewise.
> * gcc.dg/pr69644.c: Likewise.
> * gcc.dg/tree-ssa/pr38533.c: Likewise.
> * gcc.dg/tree-ssa/pr61385.c: Likewise.

Here's another I found:

diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-return1.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-return1.C
index 4b353b6..ea7ae6f 100644
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-return1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-return1.C
@@ -1,5 +1,5 @@
 // PR c++/57437
-// { dg-require-effective-target c++11 }
+// { dg-do run { target c++11 } }

 struct A {
   int i;

[PATCH] PR driver/70192: Properly set flag_pie and flag_pic

2016-03-11 Thread H.J. Lu

We can't set flag_pie to the default when flag_pic == 0, which may be
set by -fno-pic or -fno-PIC, since the default value of flag_pie is
non-zero when GCC is configured with --enable-default-pie.  We need
to initialize flag_pic to -1 so that we can tell if -fpic, -fPIC,
-fno-pic or -fno-PIC is used.
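
The resolution logic can be sketched as a small standalone function that mirrors the opts.c hunk in this patch. The names `PicPie` and `resolve_pic_pie` are hypothetical, and `default_pie` is an assumed parameter standing in for DEFAULT_FLAG_PIE; a value of -1 means "not given on the command line".

```cpp
// Hypothetical model of the flag resolution; not the code in opts.c.
struct PicPie { int pic; int pie; };

PicPie resolve_pic_pie (int pic, int pie, int default_pie)
{
  if (pie == -1)
    {
      // Only fall back to the configured default when no -fpic, -fPIC,
      // -fno-pic or -fno-PIC was seen, that is when pic is still -1.
      pie = (pic == -1) ? default_pie : 0;
    }

  if (pie)
    pic = pie;        // PIE implies PIC at the same level
  else if (pic == -1)
    pic = 0;          // nothing requested: neither PIC nor PIE

  return {pic, pie};
}
```

With `default_pie` set to 2 (as with --enable-default-pie), passing `pic == 0` for -fno-pic yields no PIC and no PIE, whereas testing `pic == 0` before initialising it to -1 could not distinguish that case from "no option given".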

OK for trunk?


H.J.
---
gcc/

PR driver/70192
* opts.c (finish_options): Don't set flag_pie to the default if
-fpic, -fPIC, -fno-pic or -fno-PIC is used.  Set flag_pic to 0
if it is -1.

gcc/testsuite/

PR driver/70192
* gcc.dg/pic-1.c: New test.
* gcc.dg/pic-2.c: Likewise.
* gcc.dg/pic-3.c: Likewise.
* gcc.dg/pic-4.c: Likewise.
* gcc.dg/pie-1.c: Likewise.
* gcc.dg/pie-2.c: Likewise.
* gcc.dg/pie-3.c: Likewise.
* gcc.dg/pie-4.c: Likewise.
* gcc.dg/pie-5.c: Likewise.
* gcc.dg/pie-6.c: Likewise.
---
 gcc/common.opt   |  4 ++--
 gcc/opts.c   |  7 ++-
 gcc/testsuite/gcc.dg/pic-1.c | 10 ++
 gcc/testsuite/gcc.dg/pic-2.c | 10 ++
 gcc/testsuite/gcc.dg/pic-3.c | 10 ++
 gcc/testsuite/gcc.dg/pic-4.c | 10 ++
 gcc/testsuite/gcc.dg/pie-1.c | 10 ++
 gcc/testsuite/gcc.dg/pie-2.c | 10 ++
 gcc/testsuite/gcc.dg/pie-3.c | 10 ++
 gcc/testsuite/gcc.dg/pie-4.c | 10 ++
 gcc/testsuite/gcc.dg/pie-5.c | 10 ++
 gcc/testsuite/gcc.dg/pie-6.c | 10 ++
 12 files changed, 108 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pic-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pic-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pic-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pic-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-6.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 1c8cc8e..67048db 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1840,7 +1840,7 @@ Common Report Var(flag_peephole2) Optimization
 Enable an RTL peephole pass before sched2.
 
 fPIC
-Common Report Var(flag_pic,2) Negative(fPIE)
+Common Report Var(flag_pic,2) Negative(fPIE) Init(-1)
 Generate position-independent code if possible (large mode).
 
 fPIE
@@ -1848,7 +1848,7 @@ Common Report Var(flag_pie,2) Negative(fpic) Init(-1)
 Generate position-independent code for executables if possible (large mode).
 
 fpic
-Common Report Var(flag_pic,1) Negative(fpie)
+Common Report Var(flag_pic,1) Negative(fpie) Init(-1)
 Generate position-independent code if possible (small mode).
 
 fpie
diff --git a/gcc/opts.c b/gcc/opts.c
index 2f45312..0f9431a 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -766,13 +766,18 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
 default value.  */
   if (opts->x_flag_pie == -1)
{
- if (opts->x_flag_pic == 0)
+ /* We initialize opts->x_flag_pic to -1 so that we can tell if
+-fpic, -fPIC, -fno-pic or -fno-PIC is used.  */
+ if (opts->x_flag_pic == -1)
opts->x_flag_pie = DEFAULT_FLAG_PIE;
  else
opts->x_flag_pie = 0;
}
+  /* If -fPIE or -fpie is used, turn on PIC.  */
   if (opts->x_flag_pie)
opts->x_flag_pic = opts->x_flag_pie;
+  else if (opts->x_flag_pic == -1)
+   opts->x_flag_pic = 0;
   if (opts->x_flag_pic && !opts->x_flag_pie)
opts->x_flag_shlib = 1;
   opts->x_flag_opts_finished = true;
diff --git a/gcc/testsuite/gcc.dg/pic-1.c b/gcc/testsuite/gcc.dg/pic-1.c
new file mode 100644
index 000..7eb0765
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pic-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fpic" } */
+
+#if __PIC__ != 1
+# error __PIC__ is not 1!
+#endif
+
+#ifdef __PIE__
+# error __PIE__ is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/pic-2.c b/gcc/testsuite/gcc.dg/pic-2.c
new file mode 100644
index 000..2c742e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pic-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fPIC" } */
+
+#if __PIC__ != 2
+# error __PIC__ is not 2!
+#endif
+
+#ifdef __PIE__
+# error __PIE__ is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/pic-3.c b/gcc/testsuite/gcc.dg/pic-3.c
new file mode 100644
index 000..d7d861b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pic-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-pic" } */
+
+#ifdef __PIC__
+# error __PIC__ is defined!
+#endif
+
+#ifdef __PIE__
+# error __PIE__ is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/pic-4.c b/gcc/testsuite/gcc.dg/pic-4.c
new file mode 100644
index 000..732f61f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pic-4.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-PIC" } */
+
+#ifdef __PIC__
+# error __PIC__ is defined!

[gomp4] various backports from trunk

2016-03-11 Thread Cesar Philippidis

I've applied this patch which backports my recent trunk changes to
gomp-4_0-branch. Specifically, this patch contains

 * nvptx vector state propagation fix, which includes the updated test
   fix for pr70009

 * combined loop clauses fix

Cesar
2016-03-11  Cesar Philippidis  

	gcc/c/
	* c-parser.c (c_parser_oacc_loop): Update cclauses and clauses
	when calling c_finish_omp_clauses.

	gcc/
	* config/nvptx/nvptx.c (nvptx_gen_shuffle): Add support for QImode
	and HImode registers.

	gcc/cp/
	* parser.c (cp_parser_oacc_loop): Update cclauses and clauses
	when calling c_finish_omp_clauses.

	gcc/testsuite/
	* c-c++-common/goacc/combined-directives-2.c: New test.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/vprop.c: New test.


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index bbbe26b..5e5f60d 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -13960,9 +13960,9 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 {
   clauses = c_oacc_split_loop_clauses (clauses, cclauses);
   if (*cclauses)
-	c_finish_omp_clauses (*cclauses, true, false);
+	*cclauses = c_finish_omp_clauses (*cclauses, true, false);
   if (clauses)
-	c_finish_omp_clauses (clauses, true, false);
+	clauses = c_finish_omp_clauses (clauses, true, false);
 }
 
   tree block = c_begin_compound_stmt (true);
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 492ebd1..5f10a65 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1301,6 +1301,20 @@ nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, nvptx_shuffle_kind kind)
 	end_sequence ();
   }
   break;
+case QImode:
+case HImode:
+  {
+	rtx tmp = gen_reg_rtx (SImode);
+
+	start_sequence ();
+	emit_insn (gen_rtx_SET (tmp, gen_rtx_fmt_e (ZERO_EXTEND, SImode, src)));
+	emit_insn (nvptx_gen_shuffle (tmp, tmp, idx, kind));
+	emit_insn (gen_rtx_SET (dst, gen_rtx_fmt_e (TRUNCATE, GET_MODE (dst),
+		tmp)));
+	res = get_insns ();
+	end_sequence ();
+  }
+  break;
   
 default:
   gcc_unreachable ();
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c90e270..9d70ff7 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -35482,9 +35482,9 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 {
   clauses = c_oacc_split_loop_clauses (clauses, cclauses);
   if (*cclauses)
-	finish_omp_clauses (*cclauses, true, true);
+	*cclauses = finish_omp_clauses (*cclauses, true, true);
   if (clauses)
-	finish_omp_clauses (clauses, true, true);
+	clauses = finish_omp_clauses (clauses, true, true);
 }
 
   tree block = begin_omp_structured_block ();
diff --git a/gcc/testsuite/c-c++-common/goacc/combined-directives-2.c b/gcc/testsuite/c-c++-common/goacc/combined-directives-2.c
new file mode 100644
index 000..c51e2f9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/combined-directives-2.c
@@ -0,0 +1,14 @@
+/* Ensure that bogus clauses aren't propagated in combined loop
+   constructs.  */
+
+int
+main ()
+{
+  int a, i;
+
+#pragma acc parallel loop vector copy(a[0:100]) reduction(+:a) /* { dg-error "'a' does not have pointer or array type" } */
+  for (i = 0; i < 100; i++)
+a++;
+
+  return a;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c
new file mode 100644
index 000..17b9568
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c
@@ -0,0 +1,34 @@
+#include <assert.h>
+
+#define test(type)\
+void		\
+test_##type ()	\
+{		\
+  signed type b[100];\
+  signed type i, j, x = -1, y = -1;		\
+		\
+  _Pragma("acc parallel loop copyout (b)")	\
+  for (j = 0; j > -5; j--)			\
+{		\
+  type c = x+y; \
+  _Pragma("acc loop vector")		\
+  for (i = 0; i < 20; i++)			\
+	b[-j*20 + i] = c;			\
+  b[5-j] = c;   \
+}		\
+		\
+  for (i = 0; i < 100; i++)			\
+assert (b[i] == -2);			\
+}
+
+test(char)
+test(short)
+
+int
+main ()
+{
+  test_char ();
+  test_short ();
+
+  return 0;
+}

Re: [C++ PATCH] Reuse certain cxx_eval_constant_expression results in cxx_eval_vec_init_1 (PR c++/70001)

2016-03-11 Thread Jakub Jelinek

On Fri, Mar 11, 2016 at 09:27:54AM -0500, Jason Merrill wrote:
> On 03/10/2016 01:39 PM, Jakub Jelinek wrote:
> >+  /* Don't reuse the result of cxx_eval_constant_expression
> >+ call if it isn't a constant initializer or if it requires
> >+ relocations.  */
> 
> Let's phrase this positively ("Reuse the result if...").
> 
> >+  if (new_ctx.ctor != ctx->ctor)
> >+eltinit = new_ctx.ctor;
> >+  for (i = 1; i < max; ++i)
> >+{
> >+  idx = build_int_cst (size_type_node, i);
> >+  CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
> >+}
> 
> This is going to use the same CONSTRUCTOR for all the elements, which will
> lead to problems if we then store into a subobject of one of the elements
> and see that reflected in all the others as well.  We need to unshare_expr
> when reusing the initializer.

Ok, here is a new version with unshare_expr and reworded comment.
I haven't been successful with writing a testcase where the unshare_expr
would matter, but have included what I've tried in the patch anyway.
Compile-time wise the unshare_expr is not very costly, and e.g. for the
other testcase in this patch gives a nice compile time speedup.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
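
The sharing problem Jason describes can be shown outside the compiler. This is only an illustrative sketch: `struct node`, `make_node`, `unshare` and `fill` are hypothetical names, with `unshare` standing in for the role unshare_expr plays in the patch. Nothing here is GCC API, and the nodes are deliberately never freed to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature "initializer" node: a value plus an optional
   sub-object.  */
struct node
{
  int value;
  struct node *child;
};

static struct node *
make_node (int value, struct node *child)
{
  struct node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->value = value;
  n->child = child;
  return n;
}

/* Deep copy of N and everything reachable from it.  */
static struct node *
unshare (const struct node *n)
{
  if (n == NULL)
    return NULL;
  return make_node (n->value, unshare (n->child));
}

/* Fill ELTS[0..MAX) from INIT.  Slot 0 takes INIT itself; the remaining
   slots take either the same pointer (SHARE nonzero) or a private deep
   copy (SHARE zero).  */
static void
fill (struct node **elts, size_t max, struct node *init, int share)
{
  assert (max >= 1);
  elts[0] = init;
  for (size_t i = 1; i < max; i++)
    elts[i] = share ? init : unshare (init);
}
```

With sharing, a store into the sub-object of one element shows up in every other element; with the deep copy each element stays independent.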

2016-03-11  Jakub Jelinek  

PR c++/70001
* constexpr.c (cxx_eval_vec_init_1): Reuse CONSTRUCTOR initializers
for 1..max even for multi-dimensional arrays.  Call unshare_expr
on it.

* g++.dg/cpp0x/constexpr-70001-4.C: New test.
* g++.dg/cpp1y/pr70001.C: New test.

--- gcc/cp/constexpr.c.jj   2016-03-10 12:52:04.0 +0100
+++ gcc/cp/constexpr.c  2016-03-11 19:24:28.435537864 +0100
@@ -2340,7 +2340,6 @@ cxx_eval_vec_init_1 (const constexpr_ctx
   vec<constructor_elt, va_gc> **p = &CONSTRUCTOR_ELTS (ctx->ctor);
   vec_alloc (*p, max + 1);
   bool pre_init = false;
-  tree pre_init_elt = NULL_TREE;
   unsigned HOST_WIDE_INT i;
 
   /* For the default constructor, build up a call to the default
@@ -2370,6 +2369,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx
 {
   tree idx = build_int_cst (size_type_node, i);
   tree eltinit;
+  bool reuse = false;
   constexpr_ctx new_ctx;
   init_subob_ctx (ctx, new_ctx, idx, pre_init ? init : elttype);
   if (new_ctx.ctor != ctx->ctor)
@@ -2378,7 +2378,10 @@ cxx_eval_vec_init_1 (const constexpr_ctx
{
  /* A multidimensional array; recurse.  */
  if (value_init || init == NULL_TREE)
-   eltinit = NULL_TREE;
+   {
+ eltinit = NULL_TREE;
+ reuse = i == 0;
+   }
  else
eltinit = cp_build_array_ref (input_location, init, idx,
  tf_warning_or_error);
@@ -2390,18 +2393,9 @@ cxx_eval_vec_init_1 (const constexpr_ctx
{
  /* Initializing an element using value or default initialization
 we just pre-built above.  */
- if (pre_init_elt == NULL_TREE)
-   pre_init_elt
- = cxx_eval_constant_expression (&new_ctx, init, lval,
- non_constant_p, overflow_p);
- eltinit = pre_init_elt;
- /* Don't reuse the result of cxx_eval_constant_expression
-call if it isn't a constant initializer or if it requires
-relocations.  */
- if (initializer_constant_valid_p (pre_init_elt,
-   TREE_TYPE (pre_init_elt))
- != null_pointer_node)
-   pre_init_elt = NULL_TREE;
+ eltinit = cxx_eval_constant_expression (&new_ctx, init, lval,
+ non_constant_p, overflow_p);
+ reuse = i == 0;
}
   else
{
@@ -2427,6 +2421,23 @@ cxx_eval_vec_init_1 (const constexpr_ctx
}
   else
CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
+  /* Reuse the result of cxx_eval_constant_expression call
+ from the first iteration to all others if it is a constant
+ initializer that doesn't require relocations.  */
+  if (reuse
+ && max > 1
+ && (initializer_constant_valid_p (eltinit, TREE_TYPE (eltinit))
+ == null_pointer_node))
+   {
+ if (new_ctx.ctor != ctx->ctor)
+   eltinit = new_ctx.ctor;
+ for (i = 1; i < max; ++i)
+   {
+ idx = build_int_cst (size_type_node, i);
+ CONSTRUCTOR_APPEND_ELT (*p, idx, unshare_expr (eltinit));
+   }
+ break;
+   }
 }
 
   if (!*non_constant_p)
--- gcc/testsuite/g++.dg/cpp0x/constexpr-70001-4.C.jj   2016-03-10 19:28:13.386481311 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-70001-4.C  2016-03-10 19:28:43.295074924 +0100
@@ -0,0 +1,13 @@
+// PR c++/70001
+// { dg-do compile { target c++11 } }
+
+struct B
+{
+  int a;
+  constexpr B () : a (0) { }
+};
+
+struct A
+{
+  B b[1 << 19][1][1][1];
+} c;
--- gcc/testsuite/g++.dg/cpp1y/pr70001.C.j

[committed 1/2] Wmisleading-indentation: add reproducer for PR c/70085

2016-03-11 Thread David Malcolm

PR c/70085 reported a false-positive from -Wmisleading-indentation.

The warning was fixed by the fix for PR c/68187 (r233972), but it seems
worth capturing the reproducer for PR c/70085 as an additional test case,
as it's slightly different to those seen in PR c/68187.

Committed to trunk (as "obvious") as r234145.

gcc/testsuite/ChangeLog:
PR c/70085
* c-c++-common/Wmisleading-indentation.c (pr70085): New test case.
---
 gcc/testsuite/c-c++-common/Wmisleading-indentation.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 7b499d4..38c8aec 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -1054,3 +1054,19 @@ fn_42_c (int locked, int i)
 return 0;
 #undef engine_ref_debug
 }
+
+/* We shouldn't complain about the following function.  */
+#define ENABLE_FEATURE
+int pr70085 (int x, int y)
+{
+  if (x > y)
+return x - y;
+
+  #ifdef ENABLE_FEATURE
+if (x == y)
+  return 0;
+  #endif
+
+  return -1;
+}
+#undef ENABLE_FEATURE
-- 
1.8.5.3

[committed 2/2] Wmisleading-indentation.c: add more test cases for PR c/68187

2016-03-11 Thread David Malcolm

I posted a series of tests here:
  https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00271.html
as part of the discussion around PR c/68187.

I've cleaned them into DejaGnu form and added them to the existing
test file; they add 16 PASS results to gcc.sum and 48 PASS results
to g++.sum.

Committed to trunk as r234146 (as "obvious").

gcc/testsuite/ChangeLog:
PR c/68187
* c-c++-common/Wmisleading-indentation.c (test43_a): New test
case.
(test43_b): Likewise.
(test43_c): Likewise.
(test43_d): Likewise.
(test43_e): Likewise.
(test43_f): Likewise.
(test43_g): Likewise.
(test44_a): Likewise.
(test44_b): Likewise.
(test44_c): Likewise.
(test44_d): Likewise.
(test44_e): Likewise.
---
 .../c-c++-common/Wmisleading-indentation.c | 168 +
 1 file changed, 168 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 38c8aec..ba512e7 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -1070,3 +1070,171 @@ int pr70085 (int x, int y)
   return -1;
 }
 #undef ENABLE_FEATURE
+
+/* Additional test coverage for PR c/68187, with various locations for a
+   pair of aligned statements ("foo (2);" and "foo (3);") that may or may
+   not be misleadingly indented.  */
+
+/* Before the "}".
+
+   The two statements aren't visually "within" the above line, so we
+   shouldn't warn.  */
+
+void
+test43_a (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB)
+ foo (2);
+ foo (3);
+}
+
+/* Aligned with the "}".
+
+   Again, the two statements aren't visually "within" the above line, so we
+   shouldn't warn.  */
+
+void
+test43_b (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB)
+  foo (2);
+  foo (3);
+}
+
+/* Indented between the "}" and the "else".
+
+   The two statements are indented "within" the line above, so appear that
+   they would be guarded together.  We should warn about this.  */
+
+void
+test43_c (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB) /* { dg-message "...this .if. clause" } */
+   foo (2);
+   foo (3); /* { dg-warning "statement is indented" } */
+}
+
+/* Aligned with the "else".  Likewise, we should warn.  */
+
+void
+test43_d (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB) /* { dg-message "...this .if. clause" } */
+foo (2);
+foo (3); /* { dg-warning "statement is indented" } */
+}
+
+/* Indented between the "else" and the "if".  Likewise, we should warn.  */
+
+void
+test43_e (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB) /* { dg-message "...this .if. clause" } */
+  foo (2);
+  foo (3); /* { dg-warning "statement is indented" } */
+}
+
+/* Aligned with the "if".  Likewise, we should warn.  */
+
+void
+test43_f (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB) /* { dg-message "...this .else. clause" } */
+ foo (2);
+ foo (3); /* { dg-warning "statement is indented" } */
+}
+
+/* Indented more than the "if".  Likewise, we should warn.  */
+
+void
+test43_g (void)
+{
+  if (flagA) {
+foo (1);
+  } else if (flagB) /* { dg-message "...this .if. clause" } */
+foo (2);
+foo (3); /* { dg-warning "statement is indented" } */
+}
+
+/* Again, but without the 2nd "if".  */
+
+/* Before the "}".
+
+   As before, the two statements aren't visually "within" the above line,
+   so we shouldn't warn.  */
+
+void
+test44_a (void)
+{
+  if (flagA) {
+foo (1);
+  } else
+ foo (2);
+ foo (3);
+}
+
+/* Aligned with the "}".
+
+   As before, the two statements aren't visually "within" the above line,
+   so we shouldn't warn.  */
+
+void
+test44_b (void)
+{
+  if (flagA) {
+foo (1);
+  } else
+  foo (2);
+  foo (3);
+}
+
+/* Indented between the "}" and the "else".
+
+   The two statements are indented "within" the line above, so appear that
+   they would be guarded together.  We should warn about this.  */
+
+void
+test44_c (void)
+{
+  if (flagA) {
+foo (1);
+  } else  /* { dg-message "...this .else. clause" } */
+   foo (2);
+   foo (3);  /* { dg-warning "statement is indented" } */
+}
+
+/* Aligned with the "else".  Likewise, we should warn.  */
+
+void
+test44_d (void)
+{
+  if (flagA) {
+foo (1);
+  } else  /* { dg-message "...this .else. clause" } */
+foo (2);
+foo (3);  /* { dg-warning "statement is indented" } */
+}
+
+/* Indented more than the "else".  Likewise, we should warn.  */
+
+void
+test44_e (void)
+{
+  if (flagA) {
+foo (1);
+  } else  /* { dg-message "...this .else. clause" } */
+foo (2);
+foo (3);  /* { dg-warning "statement is indented" } */
+}
-- 
1.8.5.3

Re: [PATCH] PR69195, Reload confused by invalid reg equivs

2016-03-11 Thread Andreas Schwab

I'm getting this crash on ia64 for gcc.c-torture/compile/pr58164.c:

Program received signal SIGSEGV, Segmentation fault.
0x412286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
3865  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
(gdb) bt
#0  0x412286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
#1  0x41239570 in ira (f=0x0) at ../../gcc/ira.c:5188
#2  0x4123a6d0 in (anonymous namespace)::pass_ira::execute (
this=0x602e2f50) at ../../gcc/ira.c:5539
Python Exception  'tuple' object has no 
attribute 'major': 
#3  0x41628ab0 in execute_one_pass (pass=) at ../../gcc/passes.c:2336
Python Exception  'tuple' object has no 
attribute 'major': 
#4  0x41629aa0 in execute_pass_list_1 (pass=)
at ../../gcc/passes.c:2410
Python Exception  'tuple' object has no 
attribute 'major': 
#5  0x41629bb0 in execute_pass_list_1 (pass=)
at ../../gcc/passes.c:2411
Python Exception  'tuple' object has no 
attribute 'major': 
#6  0x41629cf0 in execute_pass_list (fn=0x2104f0d8, pass=)
at ../../gcc/passes.c:2421
Python Exception  'tuple' object has no 
attribute 'major': 
#7  0x40917ee0 in cgraph_node::expand (this=)
at ../../gcc/cgraphunit.c:1980
#8  0x409196c0 in output_in_order (no_reorder=false)
at ../../gcc/cgraphunit.c:2218
#9  0x40923790 in symbol_table::compile (this=0x20fb)
at ../../gcc/cgraphunit.c:2466
#10 0x40926ae0 in symbol_table::finalize_compilation_unit (
this=0x20fb) at ../../gcc/cgraphunit.c:2562
#11 0x41a3ea80 in compile_file () at ../../gcc/toplev.c:490
#12 0x41a44870 in do_compile () at ../../gcc/toplev.c:1988
#13 0x41a45120 in toplev::main (this=0x600ef090, argc=2, 
argv=0x600ef348) at ../../gcc/toplev.c:2096
#14 0x4318cad0 in main (argc=2, argv=0x600ef348)
at ../../gcc/main.c:39

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [PATCH][GCC 7] Fix PR70171

2016-03-11 Thread Eric Botcazou

> The following teaches phiprop to handle the case of aggregate copies
> where the aggregate has non-BLKmode which means it is very likely
> expanded as reg-reg moves (any better test for that apart from
> checking for non-BLKmode?).  

!aggregate_value_p comes to mind, but non-BLKmode is the definitive test to 
distinguish the register from the non-register case at the RTL level.

-- 
Eric Botcazou

Re: [RFA][PATCH][PR tree-optimization/64058] Improve and stabilize sorting of coalesce pairs

2016-03-11 Thread Jeff Law


On 03/11/2016 03:02 AM, Richard Biener wrote:
[Big snip]



> Can you please split out the 'index' introduction as a separate patch
> and apply that?
> I think it is quite obviously a good idea and might make regression
> hunting easier later if needed.

Done.  The patch as installed is attached for archival purposes.  I'll
address your comments on the rest of the patch independently.


Jeff

commit 0171eb6090691291571a8db83a5789ecac118e57
Author: law 
Date:   Fri Mar 11 21:07:31 2016 +

PR tree-optimization/64058
* tree-ssa-coalesce.c (struct coalesce_pair): Add new field INDEX.
(num_coalesce_pairs): Move up earlier in file.
(find_coalesce_pair): Initialize the INDEX field for each pair
discovered.
(compare_pairs): No longer sort on the elements in each pair.
Instead break ties with the index of the coalesce pair.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234149 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c69c753..f3a7351 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-03-11  Jeff Law  
+
+   PR tree-optimization/64058
+   * tree-ssa-coalesce.c (struct coalesce_pair): Add new field INDEX.
+   (num_coalesce_pairs): Move up earlier in file.
+   (find_coalesce_pair): Initialize the INDEX field for each pair
+   discovered.
+   (compare_pairs): No longer sort on the elements in each pair.
+   Instead break ties with the index of the coalesce pair.
+
 2016-03-11  Kyrylo Tkachov  
 
PR target/70002
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 6624e7e..e49876e 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -50,6 +50,11 @@ struct coalesce_pair
   int first_element;
   int second_element;
   int cost;
+
+  /* The order in which coalescing pairs are discovered is recorded in this
+ field, which is used as the final tie breaker when sorting coalesce
+ pairs.  */
+  int index;
 };
 
 /* Coalesce pair hashtable helpers.  */
@@ -254,6 +259,13 @@ delete_coalesce_list (coalesce_list *cl)
   free (cl);
 }
 
+/* Return the number of unique coalesce pairs in CL.  */
+
+static inline int
+num_coalesce_pairs (coalesce_list *cl)
+{
+  return cl->list->elements ();
+}
 
 /* Find a matching coalesce pair object in CL for the pair P1 and P2.  If
one isn't found, return NULL if CREATE is false, otherwise create a new
@@ -290,6 +302,7 @@ find_coalesce_pair (coalesce_list *cl, int p1, int p2, bool create)
   pair->first_element = p.first_element;
   pair->second_element = p.second_element;
   pair->cost = 0;
+  pair->index = num_coalesce_pairs (cl);
   *slot = pair;
 }
 
@@ -343,29 +356,14 @@ compare_pairs (const void *p1, const void *p2)
   int result;
 
   result = (* pp1)->cost - (* pp2)->cost;
-  /* Since qsort does not guarantee stability we use the elements
- as a secondary key.  This provides us with independence from
- the host's implementation of the sorting algorithm.  */
+  /* And if everything else is equal, then sort based on which
+ coalesce pair was found first.  */
   if (result == 0)
-{
-  result = (* pp2)->first_element - (* pp1)->first_element;
-  if (result == 0)
-   result = (* pp2)->second_element - (* pp1)->second_element;
-}
+result = (*pp2)->index - (*pp1)->index;
 
   return result;
 }
 
-
-/* Return the number of unique coalesce pairs in CL.  */
-
-static inline int
-num_coalesce_pairs (coalesce_list *cl)
-{
-  return cl->list->elements ();
-}
-
-
 /* Iterate over CL using ITER, returning values in PAIR.  */
 
 #define FOR_EACH_PARTITION_PAIR(PAIR, ITER, CL)\
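
The tie-breaking scheme in the patch above can be tried in isolation. This is a minimal sketch under stated assumptions: `struct pair` is a hypothetical stand-in for coalesce_pair, `sort_pairs` is an invented helper, and the array holds structs directly where the real code sorts an array of pointers. The comparison order (ascending cost, then descending discovery index) mirrors the patched compare_pairs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for coalesce_pair.  */
struct pair
{
  int first_element;
  int second_element;
  int cost;
  int index;	/* Discovery order, assigned once when the pair is created.  */
};

/* Sort by cost; on equal cost fall back to discovery order, so the
   result never depends on the element numbers or on how the host's
   qsort treats equal keys.  */
static int
compare_pairs (const void *p1, const void *p2)
{
  const struct pair *a = p1;
  const struct pair *b = p2;

  if (a->cost != b->cost)
    return (a->cost > b->cost) - (a->cost < b->cost);
  return (b->index > a->index) - (b->index < a->index);
}

static void
sort_pairs (struct pair *pairs, size_t n)
{
  assert (pairs != NULL || n == 0);
  qsort (pairs, n, sizeof pairs[0], compare_pairs);
}
```

Because every pair has a distinct index, no two entries ever compare equal. The ordering is therefore fully determined even though qsort itself is not stable, and renumbering first_element or second_element cannot change it.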

Re: [PATCH][wwwdocs] GCC 6 supports musl libc on Linux

2016-03-11 Thread Jeff Law


On 03/08/2016 11:27 AM, Szabolcs Nagy wrote:

> I'd like to mention musl libc support in the gcc 6 release notes.
>
> (added under a linux section since only linux targets are supported now.)
>
> Is it ok to commit?



Yes.

jeff

Re: LRA remat issue with hard regs (PR70123)

2016-03-11 Thread Jeff Law


On 03/10/2016 11:10 AM, Bernd Schmidt wrote:

> When I submitted my previous lra-remat patch, I mentioned I had some
> concerns about the way we dealt with register number comparisons, but I
> didn't want to change things blindly without a testcase. PR70123 has now
> provided such a testcase where we are trying to rematerialize a hard
> register (r6). While scanning we encounter an instruction of the form
>   (set (reg 285) (reg 272))
> i.e. involving only pseudos, but reg_renumber[285] is r6. Since we only
> compare register numbers, we do not notice that the hard reg is clobbered.
>
> The following patch modifies the function input_regno_present_p, and
> also renames it so that its purpose is more obvious to someone familiar
> with other parts of gcc. I've made it look at reg_renumber, and also try
> to deal with multi-word hard registers properly.
>
> I'm not entirely sure this is a fully safe approach however, since I
> can't yet answer the question of whether LRA could change another pseudo
> to reside in hard register 6, thereby making the rematerialization
> invalid after the fact. Therefore the patch also includes a change to
> just disable candidates if they involve hard registers. I haven't
> observed that making any difference in code generation (on x86_64),
> beyond fixing the testcase on s390.
>
> Bootstrapped and tested on x86_64-linux; Jakub verified that the
> testcase works afterwards. Ok for trunk and 5-branch, either for one or
> for both parts? I'm hoping the testcase in gcc.dg/torture will get
> exercised in the right way on s390, but I haven't run tests on that
> machine.
>
>
> Bernd
>
> remat-hardregs.diff
>
>
> PR target/70123
> * lra-remat.c (operand_to_remat): Disallow hard regs in the value to
> be rematerialized.
> (reg_overlap_for_remat_p): Renamed from input_regno_present_p.
> Arguments swapped.  All callers changed.  Take reg_renumber into
> account, and calculate and compare register ranges for hard regs.
>
> PR target/70123
> * gcc.dg/torture/pr70123.c: New test.

OK.  Like you I'm not sure if the operand_to_remat test is strictly 
necessary, but I can't see how it'll ever generate incorrect code.


The change to reg_overlap_for_remat_p looks good and should be strictly 
an improvement from a correctness standpoint.


I think both are OK for the trunk and the gcc-5 branch after a bit of 
soaking on the trunk.


jeff
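
The overlap check discussed in this thread can be sketched in isolation. This is an illustration only, not the lra-remat.c code: `FIRST_PSEUDO`, `resolve_regno` and `regs_overlap_p` are hypothetical names, and `renumber` plays the role of reg_renumber (pseudo number to assigned hard register, or -1 when unassigned). The idea is to map each pseudo to its hard register first and then compare register ranges, so that multi-word hard registers are covered.

```c
#include <assert.h>

/* Hypothetical boundary: register numbers below this are hard registers.  */
#define FIRST_PSEUDO 64

/* Return the hard register REGNO lives in, or REGNO itself when it is
   already a hard register or is a pseudo with no assignment.  */
static int
resolve_regno (int regno, const int *renumber)
{
  if (regno >= FIRST_PSEUDO && renumber[regno] >= 0)
    return renumber[regno];
  return regno;
}

/* Nonzero if the NREGS1 registers starting at REGNO1 overlap the NREGS2
   registers starting at REGNO2 once pseudos are mapped to hard
   registers.  An unassigned pseudo always counts as one register.  */
static int
regs_overlap_p (int regno1, int nregs1, int regno2, int nregs2,
		const int *renumber)
{
  assert (nregs1 >= 1 && nregs2 >= 1);

  int r1 = resolve_regno (regno1, renumber);
  int r2 = resolve_regno (regno2, renumber);
  int n1 = r1 < FIRST_PSEUDO ? nregs1 : 1;
  int n2 = r2 < FIRST_PSEUDO ? nregs2 : 1;

  /* Half-open ranges [r1, r1 + n1) and [r2, r2 + n2) intersect.  */
  return r1 < r2 + n2 && r2 < r1 + n1;
}
```

In the PR's situation, pseudo 285 is assigned to r6, so comparing hard register 6 against pseudo 285 reports an overlap. A plain comparison of the numbers 6 and 285 would miss it.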

Re: [PATCH] libcc1: rerun configure when gcc/BASE-VER changes

2016-03-11 Thread Jeff Law


On 03/08/2016 07:17 AM, Andreas Schwab wrote:

> This is needed to get gcc_version updated.
>
> Andreas.
>
> * configure.ac (CONFIG_STATUS_DEPENDENCIES): Substitute.
> * configure: Regenerate.
> * Makefile.in: Regenerate.

OK.

jeff

Re: Fix 69650, bogus line numbers from libcpp

2016-03-11 Thread David Malcolm

On Thu, 2016-03-10 at 09:40 +0100, Bernd Schmidt wrote:
> This is a case where bogus #line directives can confuse libcpp into
> producing nonsensical line numbers, even leading to a crash later on
> in LTO.
>
> The following patch moves the test earlier to a point where we can
> more easily recover from the error condition. I should note that I
> changed the raw fprintf (stderr) to a cpp_error call, which is a
> slight change in behaviour (we don't even get to LTO anymore due to
> erroring out earlier).
>
> Bootstrapped and tested on x86_64-linux (as always including Ada,
> which failed with an earlier version of the patch). Ok?

Thanks for looking at this.

> --- libcpp/directives.c   (revision 234025)
> +++ libcpp/directives.c   (working copy)
> @@ -1046,6 +1046,19 @@ do_linemarker (cpp_reader *pfile)
>  
>skip_rest_of_line (pfile);
>  
> +  if (reason == LC_LEAVE)
> +{
> +  const line_map_ordinary *from;  
> +  if (MAIN_FILE_P (map)
> +   || (new_file
> +   && (from = INCLUDED_FROM (pfile->line_table, map)) != NULL
> +   && filename_cmp (ORDINARY_MAP_FILE_NAME (from), new_file) != 0))
> + {
> +   cpp_error (pfile, CPP_DL_ERROR,
> +  "file \"%s\" left but not entered", new_file);
 
Although it looks like you're preserving the existing behavior from
when this was in linemap_add, shouldn't this be
  ORDINARY_MAP_FILE_NAME (from)
rather than new_file?  (i.e. shouldn't it report the name of the file
being *left*, rather than the one being entered?)

[...]

> Index: gcc/testsuite/gcc.dg/pr69650.c
> ===
> --- gcc/testsuite/gcc.dg/pr69650.c(revision 0)
> +++ gcc/testsuite/gcc.dg/pr69650.c(working copy)
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=gnu99" } */
> +
> +# 9 "" 2 /* { dg-error "left but not entered" } */
> +not_a_type a; /* { dg-error "unknown type" } */

Can we also have a testcase with a non-empty filename?  I'm interested
in seeing what the exact error messages look like.

Also, is it possible to construct a testcase that triggers
the logic in the non-MAIN_FILE_P clause? (perhaps with some
# directives for a variety of "files")?


Hope this is constructive
Dave

Re: [PATCH] PR69195, Reload confused by invalid reg equivs

2016-03-11 Thread Alan Modra

On Fri, Mar 11, 2016 at 09:39:58PM +0100, Andreas Schwab wrote:
> I'm getting this crash on ia64 for gcc.c-torture/compile/pr58164.c:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x412286e0 in indirect_jump_optimize () at ../../gcc/ira.c:3865
> 3865  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));

Breakpoint 1, indirect_jump_optimize () at /src/gcc.git/gcc/ira.c:3863
(gdb) s
(gdb) p regno
$1 = 328
(gdb) p df->def_regs[328]
$2 = (df_reg_info *) 0x19772d0
(gdb) p *df->def_regs[328]
$3 = {reg_chain = 0x1981e60, n_refs = 1}
(gdb) p *df->def_regs[328]->reg_chain
$4 = {base = {cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 
328, reg = 0x76c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, 
next_reg = 0x0, prev_reg = 0x0, id = -1, ref_order = 22}, regular_ref = {base = 
{cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 328, reg = 
0x76c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, next_reg = 0x0, 
prev_reg = 0x0, id = -1, ref_order = 22}, loc = 0x76cf8068}, artificial_ref 
= {base = {cl = DF_REF_ARTIFICIAL, type = DF_REF_REG_DEF, flags = 0, regno = 
328, reg = 0x76c271c8, next_loc = 0x0, chain = 0x0, insn_info = 0x0, 
next_reg = 0x0, prev_reg = 0x0, id = -1, ref_order = 22}, bb = 0x76cf8068}}

Bootstrapping the following (diff -w).

* ira.c (indirect_jump_optimize): Ignore artificial defs.

diff --git a/gcc/ira.c b/gcc/ira.c
index 5e7a2ed..c4d76fc 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3862,7 +3862,10 @@ indirect_jump_optimize (void)
   int regno = REGNO (SET_SRC (x));
   if (DF_REG_DEF_COUNT (regno) == 1)
{
- rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (regno));
+ df_ref def = DF_REG_DEF_CHAIN (regno);
+ if (!DF_REF_IS_ARTIFICIAL (def))
+   {
+ rtx_insn *def_insn = DF_REF_INSN (def);
  rtx note = find_reg_note (def_insn, REG_LABEL_OPERAND, NULL_RTX);
 
  if (note)
@@ -3873,6 +3876,7 @@ indirect_jump_optimize (void)
}
}
}
+}
 
   if (rebuild_p)
 {

-- 
Alan Modra
Australia Development Lab, IBM

New Swedish PO file for 'gcc' (version 6.1-b20160131)

2016-03-11 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-6.1-b20160131.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[PATCH][PR tree-optimization/70190] Handle computed goto with constant address in jump threader

2016-03-11 Thread Jeff Law



This is arguably invalid code, but we certainly shouldn't be faulting in 
these kinds of situations.


The FSM threader was presented with a computed goto.  It was able to 
trace values backwards through a PHI to a constant (as in a constant 
integer, not a label).


In that case find_taken_edge will return NULL and we need to gracefully 
handle it.


Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on 
the trunk momentarily.


Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f3a7351..1bc7ab5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2016-03-11  Jeff Law  
 
+   PR tree-optimization/70190
+   * tree-ssa-threadbackward.c (fsm_find_control_statement_thread_paths):
+   Handle cases where we can not extract the taken edge, even though we
+   found a constant value.
+
PR tree-optimization/64058
* tree-ssa-coalesce.c (struct coalesce_pair): Add new field INDEX.
(num_coalesce_pairs): Move up earlier in file.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f42e943..e48430c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-11  Jeff Law  
+
+   PR tree-optimization/70190
+   * gcc.c-torture/compile/pr70190.c: New test.
+
 2016-03-11  David Malcolm  
 
PR c/68187
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr70190.c b/gcc/testsuite/gcc.c-torture/compile/pr70190.c
new file mode 100644
index 000..d3d209a
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr70190.c
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target indirect_jumps } */
+/* { dg-require-effective-target label_values } */
+
+
+int
+fn1 ()
+{
+  static char a[] = "foo";
+  static void *b[] = { &&l1, &&l2 };
+  goto *(b[1]);
+ l1: goto *(a[0]);
+ l2: return 0;
+}
+
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 6f1b757..88f8d5e 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -385,6 +385,16 @@ fsm_find_control_statement_thread_paths (tree name,
 
 We have to know the outgoing edge to figure this out.  */
  edge taken_edge = find_taken_edge ((*path)[0], arg);
+
+ /* There are cases where we may not be able to extract the
+taken edge.  For example, a computed goto to an absolute
+address.  Handle those cases gracefully.  */
+ if (taken_edge == NULL)
+   {
+ path->pop ();
+ continue;
+   }
+
  bool creates_irreducible_loop = false;
  if (threaded_through_latch
  && loop == taken_edge->dest->loop_father

Re: LRA remat issue with hard regs (PR70123)

2016-03-11 Thread Jeff Law


On 03/10/2016 11:10 AM, Bernd Schmidt wrote:

When I submitted my previous lra-remat patch, I mentioned I had some
concerns about the way we dealt with register number comparisons, but I
didn't want to change things blindly without a testcase. PR70123 has now
provided such a testcase where we are trying to rematerialize a hard
register (r6). While scanning we encounter an instruction of the form
  (set (reg 285) (reg 272))
i.e. involving only pseudos, but reg_renumber[285] is r6. Since we only
compare register numbers, we do not notice that the hard reg is clobbered.

The following patch modifies the function input_regno_present_p, and
also renames it so that its purpose is more obvious to someone familiar
with other parts of gcc. I've made it look at reg_renumber, and also try
to deal with multi-word hard registers properly.
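
As a rough illustration of the overlap test described above (a sketch with
hypothetical names, not the code in lra-remat.c): each register is first
resolved through a renumber table, and if both end up in hard registers, the
ranges of hard registers they occupy are compared. The table `renumber`, the
constant `FIRST_PSEUDO` and both helper functions are assumptions made for
this example only.

```c
#include <assert.h>
#include <stdbool.h>

#define FIRST_PSEUDO 64   /* hypothetical: smaller numbers are hard regs */
#define MAX_REGS 512      /* hypothetical table size */

/* Hypothetical stand-in for reg_renumber: hard reg of a pseudo, or -1.  */
static int renumber[MAX_REGS];

/* Mark every pseudo as not yet assigned to a hard register.  */
static void init_renumber(void)
{
    for (int i = 0; i < MAX_REGS; i++)
        renumber[i] = -1;
}

/* Return the hard register REGNO lives in, or -1 if it stays a pseudo.  */
static int resolve_hard_regno(int regno)
{
    assert(regno >= 0 && regno < MAX_REGS);
    if (regno < FIRST_PSEUDO)
        return regno;
    return renumber[regno];
}

/* Return true if register A (occupying NREGS_A hard regs once resolved)
   overlaps register B (occupying NREGS_B hard regs).  */
static bool regs_overlap(int a, int nregs_a, int b, int nregs_b)
{
    assert(nregs_a > 0 && nregs_b > 0);
    int ha = resolve_hard_regno(a);
    int hb = resolve_hard_regno(b);
    if (ha < 0 || hb < 0)
        return a == b;    /* at least one is still a pseudo */
    /* Half-open ranges [ha, ha + nregs_a) and [hb, hb + nregs_b).  */
    return ha < hb + nregs_b && hb < ha + nregs_a;
}
```

With `renumber[285] = 6`, `regs_overlap (285, 1, 6, 1)` is true, which is the
clobber of r6 that a plain comparison of register numbers would miss.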

I'm not entirely sure this is a fully safe approach however, since I
can't yet answer the question of whether LRA could change another pseudo
to reside in hard register 6, thereby making the rematerialization
invalid after the fact. Therefore the patch also includes a change to
just disable candidates if they involve hard registers. I haven't
observed that making any difference in code generation (on x86_64),
beyond fixing the testcase on s390.

Bootstrapped and tested on x86_64-linux; Jakub verified that the
testcase works afterwards. Ok for trunk and 5-branch, either for one or
for both parts? I'm hoping the testcase in gcc.dg/torture will get
exercised in the right way on s390, but I haven't run tests on that
machine.


Bernd

remat-hardregs.diff


PR target/70123
* lra-remat.c (operand_to_remat): Disallow hard regs in the value to
be rematerialized.
(reg_overlap_for_remat_p): Renamed from input_regno_present_p.
Arguments swapped.  All callers changed.  Take reg_renumber into
account, and calculate and compare register ranges for hard regs.

PR target/70123
* gcc.dg/torture/pr70123.c: New test.

I went ahead and committed this to the trunk to give it soak time over
the weekend.


jeff

[PATCH], Fix PR 70131, disable (double)(int) optimization for power8

2016-03-11 Thread Michael Meissner

As I was auditing rs6000.md for power9 changes, I noticed that changes I had
made in 2010 for power7 weren't as effective with power8.

The FCTIWZ/FCTIWUZ instructions convert the scalar floating point value to a
32-bit signed/unsigned integer in bits 32-63 of the floating point or vector
register.  Unfortunately, the hardware does not guarantee that bits 0-31 are
copies of the sign, so that it can be used as a valid 64-bit integer.  There is
no conversion from 32-bit int to floating point.  This meant in the power7
days, if you wanted to round a floating point value to 32-bit integer, you
would need to do:

convert to 32-bit integer
store 32-bit value on the stack
load 32-bit value to a GPR
sign/zero extend it
store 32-bit value to the stack
load 32-bit value to a FPR/vector register.

The optimization does a store/load to sign/zero extend, rather than going
through the GPRs.

On power8, we have a direct move instruction that copies the value between the
register sets, and the compiler will generate this if the above optimization is
turned off (which is what this patch does).

There are other ways to sign/zero extend a value in the vector registers
without doing a move using multiple instructions, but in practice direct move
seems to be as fast as the other instructions.
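
For readers who want something concrete, here is a small sketch of the
source-level pattern involved and of why the upper half of the register
matters. It is only a model written for this note: `round_via_int` and
`sign_extend_low32` are hypothetical names, and the second function merely
imitates in portable C what the extension step has to achieve (in IBM bit
numbering, bits 32-63 are the low-order half of the 64-bit register).

```c
#include <stdint.h>

/* The (double)(int) pattern: truncate toward zero to a 32-bit int,
   then convert back to floating point.  */
static double round_via_int(double x)
{
    return (double)(int)x;
}

/* Model of the extension step: REG holds a valid 32-bit result in its
   low half and unspecified bits in its upper half.  Return the low half
   sign extended to 64 bits, ignoring the upper half entirely.  */
static int64_t sign_extend_low32(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    /* Flip the sign bit, widen, then subtract 2^31: a portable way to
       reinterpret LOW as a signed 32-bit value.  */
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000;
}
```

For example, a register image of 0xDEADBEEFFFFFFFFD must be read as -3 and
not as a large 64-bit number before it can be converted back to floating
point.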

I bootstrapped the compiler and there were no regressions with this patch.

I rebuilt the Spec 2006 benchmark suite, and there were 7 of the benchmarks that
used this sequence somewhere in the code.  I ran those benchmarks with this
patch, and compared them to the original benchmarks.  In 6 of the benchmarks,
the run time was almost precisely the same.  The 416.gamess benchmark was about
2% faster, and there were no regressions.

Is this patch ok to apply to the trunk?  I would like to apply it to the gcc 5
branch as well.  Is this ok also?

[gcc]
2016-03-11  Michael Meissner  

PR target/70131
* config/rs6000/rs6000.md (round322_fprs): Do not do the
optimization if we have direct move.
(roundu322_fprs): Likewise.

[gcc/testsuite]
2016-03-11  Michael Meissner  

PR target/70131
* gcc.target/powerpc/ppc-round2.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 234147)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -5387,10 +5387,12 @@ (define_insn "*friz"
xsrdpiz %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Since FCTIWZ doesn't sign extend the upper bits, we have to do a store and a
-;; load to properly sign extend the value, but at least doing a store, load
-;; into a GPR to sign extend, a store from the GPR and a load back into the FPR
-;; if we have 32-bit memory ops
+;; Optimize converting SF/DFmode to signed SImode and back to SF/DFmode.  This
+;; optimization prevents on ISA 2.06 systems and earlier having to store the
+;; value from the FPR/vector unit to the stack, load the value into a GPR, sign
+;; extend it, store it back on the stack from the GPR, load it back into the
+;; FP/vector unit to do the rounding. If we have direct move (ISA 2.07),
+;; disable using store and load to sign/zero extend the value.
 (define_insn_and_split "*round322_fprs"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d")
(float:SFDF
@@ -5399,7 +5401,7 @@ (define_insn_and_split "*round322_
(clobber (match_scratch:DI 3 "=d"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
&&  && TARGET_LFIWAX && TARGET_STFIWX && TARGET_FCFID
-   && can_create_pseudo_p ()"
+   && !TARGET_DIRECT_MOVE && can_create_pseudo_p ()"
   "#"
   ""
   [(pc)]
@@ -5431,7 +5433,7 @@ (define_insn_and_split "*roundu322
(clobber (match_scratch:DI 2 "=d"))
(clobber (match_scratch:DI 3 "=d"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && TARGET_LFIWZX && TARGET_STFIWX && TARGET_FCFIDU
+   && TARGET_LFIWZX && TARGET_STFIWX && TARGET_FCFIDU && !TARGET_DIRECT_MOVE
&& can_create_pseudo_p ()"
   "#"
   ""
Index: gcc/testsuite/gcc.target/powerpc/ppc-round2.c
===
--- gcc/testsuite/gcc.target/powerpc/ppc-round2.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-round2.c   (working copy)
@@ -0,0 +1,42 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-O2 -mcpu=power8" } */
+/* { dg-final { scan-assembler-times "fcfid "  2 } } */
+/* { dg-final { scan-assembler-times "fcfids " 2 } } */
+/* { dg-final { scan-assembler-times "fctiwuz " 2 } } */
+/* { dg

Re: [PATCH] PR69195, Reload confused by invalid reg equivs

2016-03-11 Thread Alan Modra

The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
an indirect jump to a random location instead of a jump to 0.

pr58164.c.035t.mergephi1
;; Function foo (foo, funcdef_no=0, decl_uid=1389, cgraph_uid=0, symbol_order=0)

foo ()
{
  int x;

  :
  x = 0;
  goto &x;

}

pr58164.c.036t.dse1
;; Function foo (foo, funcdef_no=0, decl_uid=1389, cgraph_uid=0, symbol_order=0)

  Deleted dead store 'x = 0;
'
foo ()
{
  int x;

  :
  goto &x;

}

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH], Fix PR 70131, disable (double)(int) optimization for power8

2016-03-11 Thread David Edelsohn

On Fri, Mar 11, 2016 at 5:41 PM, Michael Meissner
 wrote:
> As I was auditing rs6000.md for power9 changes, I noticed that changes I had
> made in 2010 for power7 weren't as effective with power8.
>
> The FCTIWZ/FCTIWUZ instructions convert the scalar floating point value to a
> 32-bit signed/unsigned integer in bits 32-63 of the floating point or vector
> register.  Unfortunately, the hardware does not guarantee that bits 0-31 are
> copies of the sign, so that it can be used as a valid 64-bit integer.  There is
> no conversion from 32-bit int to floating point.  This meant in the power7
> days, if you wanted to round a floating point value to 32-bit integer, you
> would need to do:
>
> convert to 32-bit integer
> store 32-bit value on the stack
> load 32-bit value to a GPR
> sign/zero extend it
> store 32-bit value to the stack
> load 32-bit value to a FPR/vector register.
>
> The optimization does a store/load to sign/zero extend, rather than going
> through the GPRs.
>
> On power8, we have a direct move instruction that copies the value between the
> register sets, and the compiler will generate this if the above optimization is
> turned off (which is what this patch does).
>
> There are other ways to sign/zero extend a value in the vector registers
> without doing a move using multiple instructions, but in practice direct move
> seems to be as fast as the other instructions.
>
> I bootstrapped the compiler and there were no regressions with this patch.
>
> I rebuilt the Spec 2006 benchmark suite, and there 7 of the benchmarks that
> used this sequence somewhere in the code.  I ran those benchmarks with this
> patch, and compared them to the original benchmarks.  In 6 of the benchmarks,
> the run time was almost precisely the same.  The 416.gamess benchmark was about
> 2% faster, and there were no regressions.
>
> Is this patch ok to apply to the trunk?  I would like to apply it to the gcc 5
> branch as well.  Is this ok also?
>
> [gcc]
> 2016-03-11  Michael Meissner  
>
> PR target/70131
> * config/rs6000/rs6000.md (round322_fprs): Do not do the
> optimization if we have direct move.
> (roundu322_fprs): Likewise.
>
> [gcc/testsuite]
> 2016-03-11  Michael Meissner  
>
> PR target/70131
> * gcc.target/powerpc/ppc-round2.c: New test.

Okay for trunk and GCC 5.

Thanks, David

Re: [PATCH, match] Fix pr68714

2016-03-11 Thread Richard Henderson


On 03/02/2016 01:31 AM, Richard Biener wrote:

As a general remark I think handling of this simplification is
better done in the reassoc pass (see Jakub's comment #4) given
|| and && associate.  So I'd rather go down that route if possible.


This seems to do the trick.


r~
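
To give a feel for the kind of folding involved, here is a minimal scalar
sketch of merging two comparisons of the same operands, for instance
`(x < y) | (x == y)` becoming `x <= y`. It does not reproduce
maybe_fold_and_comparisons or maybe_fold_or_comparisons; the `cmp_mask`
encoding (one bit per possible ordering of two integers) and all function
names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding: bit 0 = less, bit 1 = equal, bit 2 = greater.  */
enum cmp_mask {
    CMP_FALSE = 0, CMP_LT = 1, CMP_EQ = 2, CMP_LE = 3,
    CMP_GT = 4, CMP_NE = 5, CMP_GE = 6, CMP_TRUE = 7
};

/* (x A y) | (x B y) holds for exactly the orderings in either mask.  */
static enum cmp_mask fold_or(enum cmp_mask a, enum cmp_mask b)
{
    return (enum cmp_mask)(a | b);
}

/* (x A y) & (x B y) holds for exactly the orderings in both masks.  */
static enum cmp_mask fold_and(enum cmp_mask a, enum cmp_mask b)
{
    return (enum cmp_mask)(a & b);
}

/* Evaluate the comparison encoded by M on concrete integers.  */
static bool eval_cmp(enum cmp_mask m, int x, int y)
{
    assert(m >= CMP_FALSE && m <= CMP_TRUE);
    int rel = x < y ? CMP_LT : (x == y ? CMP_EQ : CMP_GT);
    return (m & rel) != 0;
}
```

Under this encoding `fold_or (CMP_LT, CMP_EQ)` yields `CMP_LE`, so two
comparisons collapse into one. This only works when both comparisons use the
same operand pair, and the integer-only encoding ignores unordered floating
point results.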

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68714.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68714.c
new file mode 100644
index 000..741d311
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68714.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef int vec __attribute__((vector_size(16)));
+vec f(vec x,vec y){
+  return x (SSA_NAME_DEF_STMT (var));
+  if (stmt == NULL)
+    return ERROR_MARK;
+
+  /* ??? If we start creating more COND_EXPR, we could perform
+     this same optimization with them.  For now, simplify.  */
+  if (gimple_assign_rhs_code (stmt) != VEC_COND_EXPR)
+    return ERROR_MARK;
+
+  tree cond = gimple_assign_rhs1 (stmt);
+  tree_code cmp = TREE_CODE (cond);
+  if (TREE_CODE_CLASS (cmp) != tcc_comparison)
+    return ERROR_MARK;
+
+  /* ??? For now, allow only canonical true and false result vectors.
+     We could expand this to other constants should the need arise,
+     but at the moment we don't create them.  */
+  tree t = gimple_assign_rhs2 (stmt);
+  tree f = gimple_assign_rhs3 (stmt);
+  bool inv;
+  if (integer_all_onesp (t))
+    inv = false;
+  else if (integer_all_onesp (f))
+    {
+      cmp = invert_tree_comparison (cmp, false);
+      inv = true;
+    }
+  else
+    return ERROR_MARK;
+  if (!integer_zerop (f))
+    return ERROR_MARK;
+
+  /* Success!  */
+  if (rets)
+    *rets = stmt;
+  if (reti)
+    *reti = inv;
+  return cmp;
+}
+
+/* Optimize the condition of VEC_COND_EXPRs which have been combined
+   with OPCODE (either BIT_AND_EXPR or BIT_IOR_EXPR).  */
+
+static bool
+optimize_vec_cond_expr (tree_code opcode, vec<operand_entry *> *ops)
+{
+  unsigned int length = ops->length (), i, j;
+  bool any_changes = false;
+
+  if (length == 1)
+    return false;
+
+  for (i = 0; i < length; ++i)
+    {
+      tree elt0 = (*ops)[i]->op;
+
+      gassign *stmt0;
+      bool invert;
+      tree_code cmp0 = ovce_extract_ops (elt0, &stmt0, &invert);
+      if (cmp0 == ERROR_MARK)
+        continue;
+
+      for (j = i + 1; j < length; ++j)
+        {
+          tree &elt1 = (*ops)[j]->op;
+
+          gassign *stmt1;
+          tree_code cmp1 = ovce_extract_ops (elt1, &stmt1, NULL);
+          if (cmp1 == ERROR_MARK)
+            continue;
+
+          tree cond0 = gimple_assign_rhs1 (stmt0);
+          tree x0 = TREE_OPERAND (cond0, 0);
+          tree y0 = TREE_OPERAND (cond0, 1);
+
+          tree cond1 = gimple_assign_rhs1 (stmt1);
+          tree x1 = TREE_OPERAND (cond1, 0);
+          tree y1 = TREE_OPERAND (cond1, 1);
+
+          tree comb;
+          if (opcode == BIT_AND_EXPR)
+            comb = maybe_fold_and_comparisons (cmp0, x0, y0, cmp1, x1, y1);
+          else if (opcode == BIT_IOR_EXPR)
+            comb = maybe_fold_or_comparisons (cmp0, x0, y0, cmp1, x1, y1);
+          else
+            gcc_unreachable ();
+          if (comb == NULL)
+            continue;
+
+          /* Success! */
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            {
+              fprintf (dump_file, "Transforming ");
+              print_generic_expr (dump_file, cond0, 0);
+              fprintf (dump_file, " %c ", opcode == BIT_AND_EXPR ? '&' : '|');
+              print_generic_expr (dump_file, cond1, 0);
+              fprintf (dump_file, " into ");
+              print_generic_expr (dump_file, comb, 0);
+              fputc ('\n', dump_file);
+            }
+
+          gimple_assign_set_rhs1 (stmt0, comb);
+          if (invert)
+            std::swap (*gimple_assign_rhs2_ptr (stmt0),
+                       *gimple_assign_rhs3_ptr (stmt0));
+          update_stmt (stmt0);
+
+          elt1 = error_mark_node;
+          any_changes = true;
+        }
+    }
+
+  if (any_changes)
+    {
+      operand_entry *oe;
+      j = 0;
+      FOR_EACH_VEC_ELT (*ops, i, oe)
+        {
+          if (oe->op == error_mark_node)
+            continue;
+          else if (i != j)
+            (*ops)[j] = oe;
+          j++;
+        }
+      ops->truncate (j);
+    }
+
+  return any_changes;
+}
+
 /* Return true if STMT is a cast like:
:
...
@@ -4326,7 +4466,7 @@ static bool
 can_reassociate_p (tree op)
 {
   tree type = TREE_TYPE (op);
-  if ((INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type))
+  if ((ANY_INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type))
   || NON_SAT_FIXED_POINT_TYPE_P (type)
   || (flag_associative_math && FLOAT_TYPE_P (type)))
 return true;
@@ -4952,6 +5092,7 @@ reassociate_bb (basic_block bb)
{
  auto_vec ops;
  tree powi_result = NULL_TREE;
+ bool is_vector = VECTOR_TYPE_P (TREE_TYPE (lhs));
 
  /* There may be no immediate uses left by the time we
 get here because we may have elimi

Re: [PATCH] PR69195, Reload confused by invalid reg equivs

2016-03-11 Thread Jakub Jelinek

On Sat, Mar 12, 2016 at 09:43:50AM +1030, Alan Modra wrote:
> The underlying problem happens somewhere in tree-ssa-dse.c.  So we get
> an indirect jump to a random location instead of a jump to 0.

Well, the testcase is there just to make sure we don't ICE on it.
And, changing just DSE can't be a complete solution, because one can use
uninitialized var from the beginning:
int
foo (void)
{
  int x;
  goto *&x;
}

Jakub
