Re: [PATCH][2/2] Make SCCVN use conditional equivalences

2015-08-12 Thread Richard Biener
On Wed, 12 Aug 2015, Richard Biener wrote:

 
 This brings FRE/PRE up to the same level as DOM in being able to
 remove redundant conditionals.  It does so by inserting temporary
 conditional expressions proved to be true on single predecessor
 edges.
 
 I've had to do a lot of testcase adjustments, thus the patch is
 now re-bootstrapping / testing on x86_64-unknown-linux-gnu.

I've applied it with a slight change, trimming down the number of
equivalences recorded (basically only recording equivalences from
conditions that were not already optimized to go either way).
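As a minimal sketch of the kind of redundancy this lets FRE/PRE remove
(an illustration, not one of the new testcases): on the single
predecessor edge where the outer condition is known true, the inner,
equivalent condition now folds away.

/* Hypothetical example.  */
int bar (void);

int
foo (int a)
{
  if (a > 10)
    {
      if (a > 10)    /* proved true from the controlling condition */
        return bar ();
    }
  return 0;
}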

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-08-12  Richard Biener  rguent...@suse.de

* tree-ssa-sccvn.c (vn_nary_op_compute_hash): Also canonicalize
comparison operand order and commutative ternary op operand order.
(sccvn_dom_walker::cond_stack): New state to track temporary
expressions.
(sccvn_dom_walker::after_dom_children): Remove temporary expressions
that are no longer valid.
(sccvn_dom_walker::record_cond): Add a single temporary conditional
expression.
(sccvn_dom_walker::record_conds): Add a temporary conditional
expression and all related expressions also known to be true/false.
(sccvn_dom_walker::before_dom_children): Record temporary
expressions based on the controlling condition of a single
predecessor.  When trying to simplify a conditional statement
lookup expressions we might have inserted earlier.

* testsuite/gcc.dg/tree-ssa/ssa-fre-47.c: New testcase.
* testsuite/gcc.dg/tree-ssa/ssa-fre-48.c: Likewise.
* testsuite/gcc.dg/tree-ssa/ssa-fre-49.c: Likewise.
* testsuite/g++.dg/tree-ssa/pr61034.C: Adjust.
* testsuite/gcc.dg/fold-compare-2.c: Likewise.
* testsuite/gcc.dg/pr50763.c: Likewise.
* testsuite/gcc.dg/predict-3.c: Likewise.
* testsuite/gcc.dg/tree-ssa/20030709-2.c: Likewise.
* testsuite/gcc.dg/tree-ssa/pr19831-3.c: Likewise.
* testsuite/gcc.dg/tree-ssa/pr20657.c: Likewise.
* testsuite/gcc.dg/tree-ssa/pr21001.c: Likewise.
* testsuite/gcc.dg/tree-ssa/pr37508.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp04.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp07.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp09.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp16.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp20.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp25.c: Likewise.
* testsuite/gcc.dg/tree-ssa/vrp87.c: Likewise.

Index: gcc/testsuite/g++.dg/tree-ssa/pr61034.C
===
--- gcc/testsuite/g++.dg/tree-ssa/pr61034.C (revision 226802)
+++ gcc/testsuite/g++.dg/tree-ssa/pr61034.C (working copy)
@@ -43,5 +43,5 @@ bool f(I a, I b, I c, I d) {
 // This works only if everything is inlined into 'f'.
 
 // { dg-final { scan-tree-dump-times ";; Function" 1 "fre2" } }
-// { dg-final { scan-tree-dump-times "free" 18 "fre2" } }
+// { dg-final { scan-tree-dump-times "free" 10 "fre2" } }
 // { dg-final { scan-tree-dump-times "unreachable" 11 "fre2" } }
Index: gcc/testsuite/gcc.dg/fold-compare-2.c
===
--- gcc/testsuite/gcc.dg/fold-compare-2.c   (revision 226802)
+++ gcc/testsuite/gcc.dg/fold-compare-2.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-tail-merge -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
 
 extern void abort (void);
 
@@ -15,5 +15,5 @@ main(void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "Removing basic block" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Removing basic block" 2 "fre1" } } */
 
Index: gcc/testsuite/gcc.dg/pr50763.c
===
--- gcc/testsuite/gcc.dg/pr50763.c  (revision 226802)
+++ gcc/testsuite/gcc.dg/pr50763.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-tail-merge -fno-tree-dominator-opts -fdump-tree-pre" } */
+/* { dg-options "-O2 -ftree-tail-merge -fno-tree-dominator-opts" } */
 
 int bar (int i);
 
@@ -11,5 +11,3 @@ foo (int c, int d)
   d = 33;
   while (c == d);
 }
-
-/* { dg-final { scan-tree-dump-times "== 33" 2 "pre" } } */
Index: gcc/testsuite/gcc.dg/predict-3.c
===
--- gcc/testsuite/gcc.dg/predict-3.c(revision 226802)
+++ gcc/testsuite/gcc.dg/predict-3.c(working copy)
@@ -12,6 +12,10 @@ void foo (int bound)
 {
   if (i < bound - 2)
 global += bar (i);
+  /* The following test is redundant with the loop bound check in the
+     for stmt and thus eliminated by FRE which makes the controlled
+     stmt always executed and thus equivalent to 100%.  Thus the
+     heuristic only applies three times.  */
   if (i <= bound)
 global += bar (i);
   if (i + 1 < bound)

Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Jeff Law

On 08/11/2015 08:47 PM, Tom Tromey wrote:

Jeff> It's probably time for the occasional discussion WRT dropping
Jeff> gcj/libjava from the default languages and replace them with either
Jeff> Ada or Go.

It's long past time to remove it.  It's only had minimal maintenance for
years now.  No one is writing new features for it or fixing bugs.  There
aren't any significant users.

I've always felt I should be the one to pull the trigger.  If this is
acceptable I can take a stab at preparing a patch.

I thought maybe this would also enable deleting boehm-gc, zlib, or
libffi; but I see now they all have other users in the tree now.
Just to be clear, I'm not talking about total removal, just removal from 
the default languages.


In the past this has stalled on issues like how asynch-exceptions
would be tested, and the like.


My inclination is to replace GCJ with Go, but Ian wasn't comfortable 
with that when I suggested it a couple years ago.  The other obvious 
alternative is Ada -- my testing showed that using Ada would further 
slow down bootstrap/regression test time which is undesirable.


I could live with either, but I'd lean towards Go.

jeff


[committed, PATCH] Add Knights Landing support to __builtin_cpu_is

2015-08-12 Thread H.J. Lu
This patch adds Knights Landing support to __builtin_cpu_is.
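As a usage reminder (not part of the patch itself), __builtin_cpu_is
takes the CPU name as a string literal, so after this change callers can
test for Knights Landing like any other model:

#include <stdio.h>

int
main (void)
{
  __builtin_cpu_init ();
  if (__builtin_cpu_is ("knl"))     /* name recognized after this patch */
    printf ("Running on Knights Landing\n");
  return 0;
}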

gcc/testsuite/

* gcc.target/i386/builtin_target.c (check_intel_cpu_model):
Check Knights Landing support.

libgcc/

* config/i386/cpuinfo.c (processor_types): Add INTEL_KNL.
(get_intel_cpu): Add Knights Landing support.
---
 gcc/testsuite/gcc.target/i386/builtin_target.c | 4 ++++
 libgcc/config/i386/cpuinfo.c                   | 5 +++++
 2 files changed, 9 insertions(+)

diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c 
b/gcc/testsuite/gcc.target/i386/builtin_target.c
index 4adea27..068db23 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -38,6 +38,10 @@ check_intel_cpu_model (unsigned int family, unsigned int model,
  /* Silvermont.  */
  assert (__builtin_cpu_is ("silvermont"));
  break;
+   case 0x57:
+ /* Knights Landing.  */
+ assert (__builtin_cpu_is ("knl"));
+ break;
case 0x1a:
case 0x1e:
case 0x1f:
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 25d85e4..b7b11de 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -56,6 +56,7 @@ enum processor_types
   AMDFAM10H,
   AMDFAM15H,
   INTEL_SILVERMONT,
+  INTEL_KNL,
   AMD_BTVER1,
   AMD_BTVER2,  
   CPU_TYPE_MAX
@@ -197,6 +198,10 @@ get_intel_cpu (unsigned int family, unsigned int model, 
unsigned int brand_id)
  /* Silvermont.  */
  __cpu_model.__cpu_type = INTEL_SILVERMONT;
  break;
+   case 0x57:
+ /* Knights Landing.  */
+ __cpu_model.__cpu_type = INTEL_KNL;
+ break;
case 0x1a:
case 0x1e:
case 0x1f:
-- 
2.4.3



Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Andrew Haley
On 12/08/15 15:44, Jeff Law wrote:
 My inclination is to replace GCJ with Go, but Ian wasn't comfortable 
 with that when I suggested it a couple years ago.

Because Go wasn't ready for prime time?

Andrew.



Re: [PATCH][1/2] Make SCCVN use conditional equivalences

2015-08-12 Thread Markus Trippelsdorf
On 2015.08.12 at 09:32 +0200, Richard Biener wrote:
 
 This is the first patch in the series to make SCCVN able to remove
 redundant comparisons by recording conditionals being true on
 visited edges.  This part of the series fully transitions the
 toplevel walk gathering entries to SCCs we value-number to the
 DOM walk which at the moment only visits last stmts of BBs.
 After this series we can safely insert expressions into the
 SCCVN tables temporarily for uses dominated by the stmt generating
 them.

This patch caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67191

-- 
Markus


[PATCH][AArch64][committed] Do not ICE after apologising for -mcmodel=large -fPIC

2015-08-12 Thread Kyrill Tkachov

Hi all,

If we try to compile a file with -mcmodel=large -fPIC we will emit a sorry
in initialize_aarch64_code_model because that combination is not implemented.
However, there is a missing break in that case statement and we end up
falling through to the gcc_unreachable and ICE'ing.  The right thing here is
to break.  The sorry () already suppresses generation of any result, so
there's no risk of proceeding with wrong codegen.

Bootstrapped and tested on aarch64.

Applied as obvious with r226815.

Thanks,
Kyrill

2015-08-12  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (initialize_aarch64_code_model): Break
after -mcmodel=large -fPIC sorry.
commit 227760c90ec4316193f5a4b8a11378faedb3e039
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Mon Aug 10 16:09:14 2015 +0100

[AArch64] Do not ICE after apologising for -mcmodel=large -fPIC

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e00a069..87ed777 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7950,6 +7950,7 @@ initialize_aarch64_code_model (struct gcc_options *opts)
 	 case AARCH64_CMODEL_LARGE:
 	   sorry ("code model %qs with -f%s", "large",
 		  opts->x_flag_pic > 1 ? "PIC" : "pic");
+	   break;
 	 default:
 	   gcc_unreachable ();
 	 }


Re: [PATCH, PR67092, PR67098 ] Remove --with-host-libstdcxx

2015-08-12 Thread Richard Biener
On Wed, 12 Aug 2015, Tom de Vries wrote:

 Hi,
 
 this patch removes configure option --with-host-libstdcxx.
 
 [ As suggested here ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67092#c13
 ):
 ...
 I think we can no longer reliably support host libstdc++ as includes are not
 picked up from its location and GCC is C++ now.
 
 I suggest to remove that entirely.
 ...
 ]
 
 The option was originally introduced to support linking with a static version
 of ppl, but we no longer support using ppl.
 
 The behaviour of --with-host-libstdcxx is implemented in terms of other
 configure options, so we're not losing any functionality.
 
 
 Furthermore, the patch adds the missing documentation of the default behaviour
 of --with-stage1-ldflags.
 
 
 Bootstrapped and reg-tested on x86_64.
 
 OK for trunk?

Ok.

Thanks,
Richard.


Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-08-12 Thread Kyrill Tkachov


On 11/08/15 18:09, Kyrill Tkachov wrote:

On 11/08/15 18:05, Jeff Law wrote:

On 08/09/2015 03:20 PM, Steven Bosscher wrote:

On Fri, Jul 31, 2015 at 7:26 PM, Jeff Law l...@redhat.com wrote:

So there's a tight relationship between the implementation of
bbs_ok_for_cmove_arith and insn_valid_noce_process_p.  If there wasn't, then
we'd probably be looking to use note_stores and note_uses.

Perhaps I'm missing something, but what is wrong with using DF here
instead of note_stores/note_uses? All the info on refs/mods of
registers is available in the DF caches.

Nothing inherently wrong with using DF here.

I have reworked the patch to use FOR_EACH_INSN_DEF and FOR_EACH_INSN_USE
in bbs_ok_for_cmove_arith to extract the refs/mods, and it seems to work.
Is that what you mean by DF?
I'm doing some more testing and hope to post the updated version soon.


Here it is, I've used the FOR_EACH* macros from dataflow
to gather the uses and sets.
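For reference, the df-based iteration idiom being discussed looks roughly
like this (a sketch of the macro usage, not the actual body of
bbs_ok_for_cmove_arith):

  df_ref ref;

  /* Registers set by INSN, taken from the DF caches.  */
  FOR_EACH_INSN_DEF (ref, insn)
    {
      rtx dest = DF_REF_REG (ref);
      /* ... record DEST as modified by INSN ... */
    }

  /* Registers used by INSN.  */
  FOR_EACH_INSN_USE (ref, insn)
    {
      rtx src = DF_REF_REG (ref);
      /* ... record SRC as read by INSN ... */
    }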

Bootstrapped and tested on x86_64 and aarch64.
How does this look?

Thanks,
Kyrill

2015-08-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
then_cost, else_cost fields.  Change branch_cost field to unsigned int.
(end_ifcvt_sequence): Call set_used_flags on each insn in the
sequence.
Include rtl-iter.h.
(noce_simple_bbs): New function.
(noce_try_move): Bail if basic blocks are not simple.
(noce_try_store_flag): Likewise.
(noce_try_store_flag_constants): Likewise.
(noce_try_addcc): Likewise.
(noce_try_store_flag_mask): Likewise.
(noce_try_cmove): Likewise.
(noce_try_minmax): Likewise.
(noce_try_abs): Likewise.
(noce_try_sign_mask): Likewise.
(noce_try_bitop): Likewise.
(bbs_ok_for_cmove_arith): New function.
(noce_emit_all_but_last): Likewise.
(noce_emit_insn): Likewise.
(noce_emit_bb): Likewise.
(noce_try_cmove_arith): Handle non-simple basic blocks.
(insn_valid_noce_process_p): New function.
(contains_mem_rtx_p): Likewise.
(bb_valid_for_noce_process_p): Likewise.
(noce_process_if_block): Allow non-simple basic blocks
where appropriate.

2015-08-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.dg/ifcvt-1.c: New test.
* gcc.dg/ifcvt-2.c: Likewise.
* gcc.dg/ifcvt-3.c: Likewise.


Thanks,
Kyrill


jeff



commit 76813a58e40d067bb3e64151617a975581bc81cb
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Jul 8 15:45:04 2015 +0100

[PATCH][ifcvt] Make non-conditional execution if-conversion more aggressive

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 1f29646..c33fe24 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -53,6 +53,7 @@
 #include tree-pass.h
 #include dbgcnt.h
 #include shrink-wrap.h
+#include rtl-iter.h
 #include ifcvt.h
 
 #ifndef HAVE_incscc
@@ -816,8 +817,17 @@ struct noce_if_info
  form as well.  */
   bool then_else_reversed;
 
+  /* True if the contents of then_bb and else_bb are a
+ simple single set instruction.  */
+  bool then_simple;
+  bool else_simple;
+
+  /* The total rtx cost of the instructions in then_bb and else_bb.  */
+  unsigned int then_cost;
+  unsigned int else_cost;
+
   /* Estimated cost of the particular branch instruction.  */
-  int branch_cost;
+  unsigned int branch_cost;
 };
 
 static rtx noce_emit_store_flag (struct noce_if_info *, rtx, int, int);
@@ -1037,6 +1047,10 @@ end_ifcvt_sequence (struct noce_if_info *if_info)
   set_used_flags (if_info->cond);
   set_used_flags (if_info->a);
   set_used_flags (if_info->b);
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+set_used_flags (insn);
+
   unshare_all_rtl_in_chain (seq);
   end_sequence ();
 
@@ -1054,6 +1068,21 @@ end_ifcvt_sequence (struct noce_if_info *if_info)
   return seq;
 }
 
+/* Return true iff the then and else basic block (if it exists)
+   consist of a single simple set instruction.  */
+
+static bool
+noce_simple_bbs (struct noce_if_info *if_info)
+{
+  if (!if_info->then_simple)
+    return false;
+
+  if (if_info->else_bb)
+    return if_info->else_simple;
+
+  return true;
+}
+
 /* Convert if (a != b) x = a; else x = b into x = a and
if (a == b) x = a; else x = b into x = b.  */
 
@@ -1068,6 +1097,9 @@ noce_try_move (struct noce_if_info *if_info)
   if (code != NE && code != EQ)
 return FALSE;
 
+  if (!noce_simple_bbs (if_info))
+return FALSE;
+
   /* This optimization isn't valid if either A or B could be a NaN
  or a signed zero.  */
   if (HONOR_NANS (if_info->x)
@@ -1116,6 +1148,9 @@ noce_try_store_flag (struct noce_if_info *if_info)
   rtx target;
   rtx_insn *seq;
 
+  if (!noce_simple_bbs (if_info))
+return FALSE;
+
   if (CONST_INT_P (if_info->b)
       && INTVAL (if_info->b) == STORE_FLAG_VALUE
       && if_info->a == const0_rtx)
@@ -1165,6 +1200,9 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   bool can_reverse;
   machine_mode mode;
 
+  if (!noce_simple_bbs (if_info))
+return FALSE;
+
   if (CONST_INT_P (if_info->a)

Re: [PATCH, PR 66521,part 2] Fix warnings on darwin when bootstrapping with vtable verification enabled

2015-08-12 Thread Jeff Law

On 08/11/2015 01:44 PM, Caroline Tice wrote:

I forgot the ChangeLog entry; here it is:

libstdc++-v3/ChangeLog:

2015-08-11  Caroline Tice cmt...@google.com

 PR 66521, Contributed by Eric Gallager
 * acinclude.m4 (VTV_CXXLINKFLAGS): Make this variable OS-specific, and
 fix the rpath flag to work properly for darwin.
 * configure: Regenerated.

OK.
jeff



Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Henderson
On 08/12/2015 01:31 AM, Segher Boessenkool wrote:
 Is there something that makes the cache not get too big?  Do we
 care, anyway?

No, nothing ever reduces the size of the cache.  I doubt we care, but I haven't
instrumented to see how big it grows.

My intuition says the most important thing about managing this cache is not to
put the most common trivial constants in, and I already do that.
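To make the discussion concrete, a cache along these lines can be as
simple as a hash map keyed by the constant value, consulted only for
non-trivial constants (an illustrative sketch, not the data structure
the patches actually use):

#include <unordered_map>
#include <vector>
#include <cstdint>

struct recipe { std::vector<int> steps; };   // opaque "how to build it" data

class genimm_cache
{
  std::unordered_map<uint64_t, recipe> table_;

public:
  // Constants a single instruction can materialize are never inserted,
  // so the table only grows with "hard" values.
  static bool trivial_p (uint64_t c) { return c + 0x8000 < 0x10000; }

  const recipe *lookup (uint64_t c) const
  {
    auto it = table_.find (c);
    return it == table_.end () ? nullptr : &it->second;
  }

  void record (uint64_t c, const recipe &r)
  {
    if (!trivial_p (c))
      table_.emplace (c, r);
  }
};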

 I did attempt to fix it, and got nothing for my troubles except
 poorer code generation for AND/IOR/XOR with non-trivial constants.
 
 Could you give an example of code that isn't split early enough?

I believe the examples I was seeing was in the libdecnumber code.  I'd have to
go back and reproduce it now...

 Perhaps there's some other predication or costing error that's
 getting in the way, and it simply wasn't obvious to me.   In any
 case, nothing in this patch set addresses this at all.
 
 The instruction (set (reg) (const_int 0x12345678)) is costed as 4
 (i.e. one insn).  That cannot be good.  This is alternative #5 in
 *movsi_internal1_single (there are many more variants of that
 pattern).

Yes, when I tried to fix it, I did adjust that costing, but still...

   * Constants are split *really* late.  In particular, after reload.
 
 Yeah that is bad.  But I'm still not seeing it.  Hrm, maybe only
 DImode ones?

Dunno.  I found it after I did add a recipe to use ADDI, which then triggered
an ICE due to the r0 base register.

 Have you looked at generated code quality?

I've looked at diffs of all of the object files in the target directory.  There
were definite spots of improvement.  I wasn't able to spot any regressions.


r~


[C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Marek Polacek
This patch fixes a defect in -Wshift-overflow.  We should only warn
about left-shifting 1 into the sign bit when -Wshift-overflow=2.  But
this doesn't apply only for 1 << 31, but also for 2 << 30, etc.
In C++14, never warn about this.
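Put concretely (an illustration, not one of the new testcases below),
with 32-bit int:

  int a = 1 << 31;   /* shifts only into the sign bit */
  int b = 2 << 30;   /* reaches the same bit, gets the same treatment */

Neither warns at -Wshift-overflow=1; both warn ("requires 33 bits to
represent") at -Wshift-overflow=2; and from C++14 onwards neither warns.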

Neither existing tests nor documentation require updating, I think.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-08-12  Marek Polacek  pola...@redhat.com

PR c++/55095
* c-common.c (maybe_warn_shift_overflow): Properly handle
left-shifting 1 into the sign bit.

* c-c++-common/Wshift-overflow-6.c: New test.
* c-c++-common/Wshift-overflow-7.c: New test.
* g++.dg/cpp1y/left-shift-2.C: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index f6c5ddd..13175d8 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -12442,9 +12442,10 @@ maybe_warn_shift_overflow (location_t loc, tree op0, 
tree op1)
   if (TYPE_UNSIGNED (type0))
 return false;
 
+  unsigned int min_prec = (wi::min_precision (op0, SIGNED)
+  + TREE_INT_CST_LOW (op1));
   /* Handle the left-shifting 1 into the sign bit case.  */
-  if (integer_onep (op0)
-      && compare_tree_int (op1, prec0 - 1) == 0)
+  if (min_prec == prec0 + 1)
 {
   /* Never warn for C++14 onwards.  */
   if (cxx_dialect >= cxx14)
@@ -12456,8 +12457,6 @@ maybe_warn_shift_overflow (location_t loc, tree op0, 
tree op1)
return true;
 }
 
-  unsigned int min_prec = (wi::min_precision (op0, SIGNED)
-  + TREE_INT_CST_LOW (op1));
   bool overflowed = min_prec > prec0;
   if (overflowed  c_inhibit_evaluation_warnings == 0)
 warning_at (loc, OPT_Wshift_overflow_,
diff --git gcc/testsuite/c-c++-common/Wshift-overflow-6.c 
gcc/testsuite/c-c++-common/Wshift-overflow-6.c
index e69de29..fed79f8 100644
--- gcc/testsuite/c-c++-common/Wshift-overflow-6.c
+++ gcc/testsuite/c-c++-common/Wshift-overflow-6.c
@@ -0,0 +1,36 @@
+/* PR c++/55095 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-Wshift-overflow=1" } */
+/* { dg-additional-options "-std=c++11" { target c++ } } */
+
+int i00 = 0b1 << 31;
+int i01 = 0b10 << 30;
+int i02 = 0b100 << 29;
+int i03 = 0b1000 << 28;
+int i04 = 0b10000 << 27;
+int i05 = 0b100000 << 26;
+int i06 = 0b1000000 << 25;
+int i07 = 0b10000000 << 24;
+int i08 = 0b100000000 << 23;
+int i09 = 0b1000000000 << 22;
+int i10 = 0b10000000000 << 21;
+int i11 = 0b100000000000 << 20;
+int i12 = 0b1000000000000 << 19;
+int i13 = 0b10000000000000 << 18;
+int i14 = 0b100000000000000 << 17;
+int i15 = 0b1000000000000000 << 16;
+int i16 = 0b10000000000000000 << 15;
+int i17 = 0b100000000000000000 << 14;
+int i18 = 0b1000000000000000000 << 13;
+int i19 = 0b10000000000000000000 << 12;
+int i20 = 0b100000000000000000000 << 11;
+int i21 = 0b1000000000000000000000 << 10;
+int i22 = 0b10000000000000000000000 << 9;
+int i23 = 0b100000000000000000000000 << 8;
+int i24 = 0b1000000000000000000000000 << 7;
+int i25 = 0b10000000000000000000000000 << 6;
+int i26 = 0b100000000000000000000000000 << 5;
+int i27 = 0b1000000000000000000000000000 << 4;
+int i28 = 0b10000000000000000000000000000 << 3;
+int i29 = 0b100000000000000000000000000000 << 2;
+int i30 = 0b1000000000000000000000000000000 << 1;
diff --git gcc/testsuite/c-c++-common/Wshift-overflow-7.c 
gcc/testsuite/c-c++-common/Wshift-overflow-7.c
index e69de29..0eb1fef 100644
--- gcc/testsuite/c-c++-common/Wshift-overflow-7.c
+++ gcc/testsuite/c-c++-common/Wshift-overflow-7.c
@@ -0,0 +1,36 @@
+/* PR c++/55095 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-Wshift-overflow=2" } */
+/* { dg-additional-options "-std=c++11" { target c++ } } */
+
+int i00 = 0b1 << 31; /* { dg-warning "requires 33 bits to represent" } */
+int i01 = 0b10 << 30; /* { dg-warning "requires 33 bits to represent" } */
+int i02 = 0b100 << 29; /* { dg-warning "requires 33 bits to represent" } */
+int i03 = 0b1000 << 28; /* { dg-warning "requires 33 bits to represent" } */
+int i04 = 0b10000 << 27; /* { dg-warning "requires 33 bits to represent" } */
+int i05 = 0b100000 << 26; /* { dg-warning "requires 33 bits to represent" } */
+int i06 = 0b1000000 << 25; /* { dg-warning "requires 33 bits to represent" } */
+int i07 = 0b10000000 << 24; /* { dg-warning "requires 33 bits to represent" } */
+int i08 = 0b100000000 << 23; /* { dg-warning "requires 33 bits to represent" } */
+int i09 = 0b1000000000 << 22; /* { dg-warning "requires 33 bits to represent" } */
+int i10 = 0b10000000000 << 21; /* { dg-warning "requires 33 bits to represent" } */
+int i11 = 0b100000000000 << 20; /* { dg-warning "requires 33 bits to represent" } */
+int i12 = 0b1000000000000 << 19; /* { dg-warning "requires 33 bits to represent" } */
+int i13 = 0b10000000000000 << 18; /* { dg-warning "requires 33 bits to represent" } */
+int i14 = 0b100000000000000 << 17; /* { dg-warning "requires 33 bits to represent" } */
+int i15 = 0b1000000000000000 << 16; /* { dg-warning "requires 33 bits to represent" } */
+int i16 = 0b10000000000000000 << 15; /* { dg-warning "requires 33 bits to represent" } */
+int i17 = 

Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Henderson
On 08/12/2015 01:32 AM, Richard Earnshaw wrote:
 How do we clear the cache, and when?  For example, on ARM, switching
 between ARM and Thumb state means we need to generate potentially
 radically different sequences?  We can do such splitting at function
 boundaries now.

At present I never clear the cache.  Maybe we'll find that's a mistake.

For arm vs thumb I would start with just using two different caches.  The way
the code is structured currently, that would mean two different classes.  Which
could just be trivial wrappers around a common base class containing the
generator code.

 Can we generate different sequences for hot/cold code within a single
 function?

Not without using different caches.

 Can we cache sequences with the context (eg use with AND, OR, ADD, etc)?

No.  At least not without...


r~


Re: [gomp4] dimension API

2015-08-12 Thread Nathan Sidwell

On 08/12/15 09:30, Nathan Sidwell wrote:

 I've committed this patch to gomp4.  It reworks the oacc function attribute
 dimension handling.  Rather than pass the TREE_LIST to the various hooks, I
 convert it to a regular C array of ints.  That makes life simpler for the
 consumers.  They return a 'changed' flag to indicate whether the attribute
 should be rewritten.  That rewriting is done in a way that doesn't presume
 the attribute is unshared (Cesar, your workaround should no longer be necessary).


I've discovered I'd committed a slightly earlier version of what I'd tested. 
Consequently there is some breakage.  I'll be fixing it later today, after a retest.


nathan



[PATCH] [graphite] Constrain only on INTEGER_TYPE

2015-08-12 Thread Aditya Kumar
Passes bootstrap, no regressions.

With this patch gcc bootstraps with graphite.
make BOOT_CFLAGS="-g -O2 -fgraphite-identity -floop-interchange -floop-block"

gcc/ChangeLog:

2015-08-12  Aditya Kumar  hiradi...@msn.com

* graphite-scop-detection.c (stmt_simple_for_scop_p):
Constrain only on INTEGER_TYPE

---
 gcc/graphite-scop-detection.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index fb7247e..05cb5b9 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -410,7 +410,7 @@ stmt_simple_for_scop_p (basic_block scop_entry, loop_p 
outermost_loop,
tree op = gimple_op (stmt, i);
if (!graphite_can_represent_expr (scop_entry, loop, op)
/* We can not handle REAL_TYPE. Failed for pr39260.  */
-   || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
+   || (TREE_CODE (TREE_TYPE (op)) != INTEGER_TYPE))
  {
   if (dump_file && (dump_flags & TDF_DETAILS))
  {
-- 
2.1.0.243.g30d45f7



Re: [Aarch64][target/PR 67143] Use correct constraints on operands for atomic operations.

2015-08-12 Thread Joseph Myers
On Tue, 11 Aug 2015, Matthew Wahab wrote:

   PR target/67143
   * gcc.target/aarch64/pr67143.c: New

What's architecture-specific about this test?  That is, why doesn't it 
just go in gcc.c-torture/compile (no dg- directives needed, automatically 
looped over optimization options)?

Architecture-specific test directories should only be for tests that are 
actually architecture-specific (e.g. using architecture-specific options, 
or asm, or built-in functions, or scan-assembler for particular output), 
not for tests that can be built generically for all architectures and 
still test for the original bug.
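For illustration, a torture-style test is just plain C with no directives
at all (the function below is made up, not the actual pr67143.c):

/* gcc.c-torture/compile/whatever.c -- the harness compiles this at every
   optimization level automatically; it only has to build, not run.  */
int
foo (int x)
{
  return x ? __builtin_popcount (x) : 0;
}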

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread Trevor Saunders
On Wed, Aug 12, 2015 at 01:03:36PM -0600, Jeff Law wrote:
 On 08/12/2015 12:57 PM, Richard Sandiford wrote:
 Jeff Law l...@redhat.com writes:
 On 08/10/2015 06:05 AM, tbsaunde+...@tbsaunde.org wrote:
 From: Trevor Saunders tbsaunde+...@tbsaunde.org
 
 Hi,
 
 In many places gcc puts classes in the anon namespace so the compiler can 
 tell
 they do not get inherited from to enable better devirtualization.  However
 debugging code in the anon namespace can be a pain, and the same thing can 
 be
 accomplished more directly by marking the classes as final.  When
 bootstrapping
 stage 3 should always be built in C++14 mode now, and of course will 
 always be
 newer than gcc 4.7, so these classes will always be marked as final there.
 AIUI cross compilers are supposed to be built with recent gcc, which I 
 would
 tend to think implies newer than 4.7, so they should also be built with 
 these
 classes marked as final.  I believe that means in all important cases
 this works just as well as the anon namespace.
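 (A rough sketch of the two idioms being compared; GCC_FINAL is the macro
 named in the patch title, the base class and everything else here is
 illustrative.)

 struct pass_base { virtual ~pass_base () {} virtual void execute () = 0; };

 // Idiom 1: anonymous namespace.  No other translation unit can derive
 // from pass_foo, which enables devirtualization, but the type is awkward
 // to name in a debugger.
 namespace {
 struct pass_foo : pass_base { void execute () override; };
 }

 // Idiom 2: mark the class final (GCC_FINAL presumably expanding to
 // "final" on new-enough host compilers and to nothing otherwise).
 // Same devirtualization guarantee, ordinary debugger-visible name.
 struct pass_bar GCC_FINAL : pass_base { void execute () override; };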
 
 bootstrapped + regtested on x86_64-linux-gnu, ok?
 
 Trev
 
 
 gcc/ChangeLog:
 
 2015-08-10  Trevor Saunders  tbsau...@tbsaunde.org
 
* compare-elim.c, dce.c, dse.c, gimple-ssa-isolate-paths.c,
gimple-ssa-strength-reduction.c, graphite.c, init-regs.c,
ipa-pure-const.c, ipa-visibility.c, ipa.c, mode-switching.c,
omp-low.c, reorg.c, sanopt.c, trans-mem.c, tree-eh.c,
tree-if-conv.c, tree-ssa-copyrename.c, tree-ssa-dce.c,
tree-ssa-dom.c, tree-ssa-dse.c, tree-ssa-forwprop.c,
tree-ssa-sink.c, tree-ssanames.c, tree-stdarg.c, tree-tailcall.c,
tree-vect-generic.c, tree.c, ubsan.c, var-tracking.c,
vtable-verify.c, web.c: Use GCC_FINAL instead of the anonymous
 namespace.
 OK.
 
 I was hoping someone else was going to speak up since I seem
 to have been posting a few negative messages recently, but I think
 this is really a step in the wrong direction.  I think the code
 was using anonymous namespaces in exactly the way they were

are they actually all that common?  I think gcc is the only C++ codebase
I'm familiar with that uses them much.

 intended to be used.
 No need to worry about seeming to be negative.
 
 
 The problem is you can't get to stuff in the anonymous namespace easily in
 the debugger.  There was talk of fixing that, but I don't think it ever went
 forward on the gdb side.
 
 If gdb were to get fixed so that debugging this stuff was easier, then I'd
 fully support putting things back into the anonymous namespace.

I have vague memories of other reasons to dislike the anon namespace
being brought up when it was discussed at $day job, but debuggers were
the big reason (and it's not just gdb that's bad).

Trev

 
 jeff
 


Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread Markus Trippelsdorf
On 2015.08.12 at 13:03 -0600, Jeff Law wrote:
 On 08/12/2015 12:57 PM, Richard Sandiford wrote:
  Jeff Law l...@redhat.com writes:
  On 08/10/2015 06:05 AM, tbsaunde+...@tbsaunde.org wrote:
  From: Trevor Saunders tbsaunde+...@tbsaunde.org
 
  Hi,
 
  In many places gcc puts classes in the anon namespace so the compiler can 
  tell
  they do not get inherited from to enable better devirtualization.  
  However
  debugging code in the anon namespace can be a pain, and the same thing 
  can be
  accomplished more directly by marking the classes as final.  When
  bootstrapping
  stage 3 should always be built in C++14 mode now, and of course will 
  always be
  newer than gcc 4.7, so these classes will always be marked as final there.
  AIUI cross compilers are supposed to be built with recent gcc, which I 
  would
  tend to think implies newer than 4.7, so they should also be built with 
  these
  classes marked as final.  I believe that means in all important cases
  this works just as well as the anon namespace.
 
  bootstrapped + regtested on x86_64-linux-gnu, ok?
 
  Trev
 
 
  gcc/ChangeLog:
 
  2015-08-10  Trevor Saunders  tbsau...@tbsaunde.org
 
* compare-elim.c, dce.c, dse.c, gimple-ssa-isolate-paths.c,
gimple-ssa-strength-reduction.c, graphite.c, init-regs.c,
ipa-pure-const.c, ipa-visibility.c, ipa.c, mode-switching.c,
omp-low.c, reorg.c, sanopt.c, trans-mem.c, tree-eh.c,
tree-if-conv.c, tree-ssa-copyrename.c, tree-ssa-dce.c,
tree-ssa-dom.c, tree-ssa-dse.c, tree-ssa-forwprop.c,
tree-ssa-sink.c, tree-ssanames.c, tree-stdarg.c, tree-tailcall.c,
tree-vect-generic.c, tree.c, ubsan.c, var-tracking.c,
vtable-verify.c, web.c: Use GCC_FINAL instead of the anonymous
  namespace.
  OK.
 
  I was hoping someone else was going to speak up since I seem
  to have been posting a few negative messages recently, but I think
  this is really a step in the wrong direction.  I think the code
  was using anonymous namespaces in exactly the way they were
  intended to be used.
 No need to worry about seeming to be negative.
 
 
 The problem is you can't get to stuff in the anonymous namespace easily 
 in the debugger.  There was talk of fixing that, but I don't think it 
 ever went forward on the gdb side.
 
 If gdb were to get fixed so that debugging this stuff was easier, then 
 I'd fully support putting things back into the anonymous namespace.

For the record, here is the gdb bug in question:
https://sourceware.org/bugzilla/show_bug.cgi?id=16874
It even has a patch attached, that improves the issue.

-- 
Markus


Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread Trevor Saunders
On Wed, Aug 12, 2015 at 03:31:35PM -0400, David Edelsohn wrote:
 This patch has caused a bootstrap failure on AIX.

reverted, though my bet is this is some sort of generic devirt bug on
aix.

Trev

 
 - David


RE: C++ Concepts available in trunk?

2015-08-12 Thread Dijk, J. van
 From: Jonathan Wakely [jwakely@gmail.com]
Sent: 12 August 2015 17:41
On 12 August 2015 at 16:20, Dijk, J. van wrote:
 Dear all,

 Looking into gcc/cp/Changelog, it appears that on 2015-08-06 support
 for the C++ Concepts TS was added to trunk.

 Have Concepts in gcc matured to the point that end-users can
 start to explore and test them? If so, the page 
 https://gcc.gnu.org/projects/cxx1y.html
 needs an update, since it mentions the c++1z Concepts
 Lite *branch* at the end of the document.

Yes, the branch is not in use any longer, all support is on the trunk.

 There is also no explanation how to enable experimental
 c++1z features that I can see on that page (like there is for
 c++14 at the top of the page). I suppose a new section on
 'C++1z language features' is due?

 Well it should be a separate page really, since that page is about
 C++14 not C++17.

 To enable the features just use the appropriate -std option, as
 documented in the manual:
 https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html

Dear Jonathan,

Thanks for the clarification. I hope the attached patch against current
wwwdocs will be of use (only the name of the feature test macro is missing
in the new file cxx1z.html; I did not manage to find that).

Regards, Jan.
diff -urN --exclude 'Entries*' wwwdocs-orig/htdocs/gcc-6/changes.html wwwdocs/htdocs/gcc-6/changes.html
--- wwwdocs-orig/htdocs/gcc-6/changes.html	2015-08-13 00:25:44.009826451 +0200
+++ wwwdocs/htdocs/gcc-6/changes.html	2015-08-13 00:15:36.507550280 +0200
@@ -68,6 +68,8 @@
 <h3 id="cxx">C++</h3>
   <ul>
 <li>The default mode has been changed to <code>-std=gnu++14</code>.</li>
+<li> C++ concepts <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4377.pdf">
+	(N4377)</a> are now supported when c++1z code generation is requested.</li>
   </ul>
 
 <h3 id="fortran">Fortran</h3>
diff -urN --exclude 'Entries*' wwwdocs-orig/htdocs/projects/cxx1y.html wwwdocs/htdocs/projects/cxx1y.html
--- wwwdocs-orig/htdocs/projects/cxx1y.html	2015-08-13 00:25:44.010826466 +0200
+++ wwwdocs/htdocs/projects/cxx1y.html	2015-08-13 00:07:54.726496707 +0200
@@ -174,6 +174,13 @@
 
  <a name="concepts"></a><h3>C++1z Concepts Branch</h3>
 
+<p><strong>Update</strong>: Support for the
+<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4377.pdf">
+C++ Concepts Technical Specification</a>
+was merged into trunk at 6. August 2015 (revision r226713) and will
+be available in gcc-6. Users of the concepts branch are urged to switch
+to trunk instead.</p>
+
 <p>Concepts was a major feature planned for the C++11 standard, but it was
 eventually dropped due to concerns about both the description and
 implementability.  Since the publication of C++11, people have been working
@@ -183,6 +190,7 @@
 Specification in 2015.  The initial implementation is available from the
 link above, and it is in the process of being cleaned up and moved into
 the <code>c++-concepts</code> branch.</p>
+
   
 </body>
 </html>
diff -urN --exclude 'Entries*' wwwdocs-orig/htdocs/projects/cxx1z.html wwwdocs/htdocs/projects/cxx1z.html
--- wwwdocs-orig/htdocs/projects/cxx1z.html	1970-01-01 01:00:00.0 +0100
+++ wwwdocs/htdocs/projects/cxx1z.html	2015-08-13 00:09:21.561823268 +0200
@@ -0,0 +1,66 @@
+html
+head
+  titleC++1z Support in GCC/title
+style type=text/css
+  /* ![CDATA[*/
+tr.separator { background: #cc}
+.supported { color: green }
+.unsupported { color: red }
+  /* ]] */
+/style
+/head
+
+body
+  h1C++1z Support in GCC/h1
+
+  pGCC has experimental support for the upcoming revision of the C++
+  standard, which is expected to be published in 2017. Until we are
+  sure of that, it will be referred to as C++1z./p
+
+  pC++1z features are available as part of the mainline GCC
+compiler in the trunk of
+a href=../svn.htmlGCC's Subversion
+  repository/a. To enable C++1z support, add the command-line
+  parameter code-std=c++1z/code
+  to your codeg++/code command line. Or, to enable GNU
+  extensions in addition to C++1z extensions,
+  add code-std=gnu++1z/code to your codeg++/code command
+  line./p
+
+  pstrongImportant/strong: Because the final ISO C++1z standard is
+  still under development, GCC's support is strongexperimental/strong
+  and features may disappear or change in incompatible ways as the standard
+  matures./p
+
+h2C++1z Language Features/h2
+
+  pThe following table lists new language features that are part of
+  the C++1z standard. The Proposal column
+  provides a link to the ISO C++ committee proposal that describes the
+  feature, while the Available in GCC? column indicates the first
+  version of GCC that contains an implementation of this feature (if
+  it has been implemented)./p
+
+  table border=1
+tr class=separator
+  thLanguage Feature/th
+  thProposal/th
+  thAvailable in GCC?/th
+  thSD-6 Feature Test/th
+/tr
+tr
+  tdC++ concepts/td
+  tda 

[PATCH] Forward to correct function in std::shared_mutex::unlock().

2015-08-12 Thread Eelis

Looks like a typo. :)
Index: libstdc++-v3/include/std/shared_mutex
===
--- libstdc++-v3/include/std/shared_mutex	(revision 226840)
+++ libstdc++-v3/include/std/shared_mutex	(working copy)
@@ -331,7 +331,7 @@
 
 void lock() { _M_impl.lock(); }
 bool try_lock() { return _M_impl.try_lock(); }
-void unlock() { _M_impl.try_lock(); }
+void unlock() { _M_impl.unlock(); }
 
 // Shared ownership
 


Re: [PATCH] [graphite] Constrain only on INTEGER_TYPE

2015-08-12 Thread Tobias Grosser

On 08/12/2015 10:33 PM, Aditya Kumar wrote:

Passes bootstrap, no regressions.

With this patch gcc bootstraps with graphite.
make BOOT_CFLAGS=-g -O2 -fgraphite-identity -floop-interchange -floop-block


LGTM, but please use a longer sentence to explain what you do.

Tobias


Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Mike Stump
On Aug 12, 2015, at 1:07 PM, Richard Sandiford rdsandif...@googlemail.com 
wrote:
 I don't think the right shifts are an issue at all.
 
 Well, they're implementation-defined, at least in C.

  The C11 wording for E1 >> E2 is "If E1 has a signed type and a negative
  value, the resulting value is implementation-defined".

 Is C++ different?

No, it is the same:

3 The value of E1 >> E2 is E1 right-shifted E2 bit positions.  If E1 has
  an  unsigned  type or if E1 has a signed type and a nonnegative value,
  the value of the result is the integral part of  the  quotient  of  E1
  divided  by the quantity 2 raised to the power E2.  If E1 has a signed
  type and a negative value,  the  resulting  value  is  implementation-
  defined.

 (I don't have the standard handy.)

Google is your friend.
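For context, the usual way to avoid the undefined/implementation-defined
shifts here is to do the shifting on unsigned values and fold the sign bit
back in afterwards; a sketch (not necessarily what the committed hwint.h
version does):

#include <assert.h>

/* Sign-extend the low PREC bits of SRC.  All shifts are on unsigned
   values, and the final subtraction happens in the signed domain on
   values that are always in range.  */
static long long
sext_sketch (long long src, unsigned int prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return src;
  unsigned long long sign = 1ULL << (prec - 1);
  unsigned long long low = (unsigned long long) src & ((sign << 1) - 1);
  return (long long) (low ^ sign) - (long long) sign;
}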

Re: [PATCH 01/15] rs6000: Split out rs6000_is_valid_and_mask_wide

2015-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2015 at 08:50:48AM -0700, Richard Henderson wrote:
 On 08/12/2015 06:23 AM, Segher Boessenkool wrote:
  On Tue, Aug 11, 2015 at 06:11:30PM -0700, Richard Henderson wrote:
  This allows testing for a mask without having to call GEN_INT.
 
  Cc: David Edelsohn dje@gmail.com
  ---
 * config/rs6000/rs6000.c (rs6000_is_valid_mask_wide): Split out from...
 (rs6000_is_valid_mask): ... here.
 (rs6000_is_valid_and_mask_wide): Split out from...
 (rs6000_is_valid_and_mask): ... here.
  
  I don't like these _wide names much.
 
 It follows the existing practice within the backend.

One existing function name, yes.  And you are replacing that function :-)

   You could overload the shorter
  name, if you really think creating some garbage const_int's is too much
  overhead (it might well be if you use it a lot more in later patches).
 
 At one stage in the development (before I became much leaner with the search
 for rotate), it really really mattered.

For the AND patterns I considered such a search too; I didn't do it
because as you say it would have to consider a *lot* of possibilities,
most useless.  AND sequences of more than two insns often prevented
other optimisation too, so I settled on two insns maximum, and then you
can generate it directly no problem.

So yes if you call it way too often it also creates too much garbage.

  -bool
  -rs6000_is_valid_mask (rtx mask, int *b, int *e, machine_mode mode)
  +static bool
  +rs6000_is_valid_mask_wide (unsigned HOST_WIDE_INT val, int *b, int *e, 
  int n)
  
  But why change the mode parameter?  The code was clearer before.
 
 So that we don't have to look up GET_MODE_BITSIZE (mode).

Getting rid of a single array lookup matters more than interface clarity?
You must have been calling it *very* often!  Thankfully you don't anymore.

  +static bool
  +rs6000_is_valid_and_mask_wide (unsigned HOST_WIDE_INT val, machine_mode 
  mode)
   {
 int nb, ne;
   
  -  if (!rs6000_is_valid_mask (mask, nb, ne, mode))
  -return false;
  +  switch (mode)
  +{
  +case DImode:
  +  if (!rs6000_is_valid_mask_wide (val, nb, ne, 64))
  +  return false;
  +  /* For DImode, we need a rldicl, rldicr, or a rlwinm with
  +   mask that does not wrap.  */
   +  return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
   
  -  /* For DImode, we need a rldicl, rldicr, or a rlwinm with mask that
  - does not wrap.  */
  -  if (mode == DImode)
   -return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
  +case SImode:
  +  if (!rs6000_is_valid_mask_wide (val, nb, ne, 32))
  +  return false;
  +  /* For SImode, rlwinm can do everything.  */
   +  return (nb < 32 && ne < 32);
   
  -  /* For SImode, rlwinm can do everything.  */
  -  if (mode == SImode)
   -return (nb < 32 && ne < 32);
  +default:
  +  return false;
  +}
  +}
   
  -  return false;
  
  You don't need any of these changes then, either.
 
 True, not *needed* per-se, but if you look closer I'm combining conditionals.
 I think the replacement here is clearer.

You're combining a conditional that you add (for mode -> 32,64), and
the code doesn't become any clearer at all IMHO.

  -  unsigned HOST_WIDE_INT val = INTVAL (c);
  +  unsigned HOST_WIDE_INT val = UINTVAL (c);
  
  Does it matter?
 
 No.

Ah okay, you were getting me worried!  :-)


Segher


Re: [PATCH 06/15] rs6000: Use rldiwi in constant construction

2015-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2015 at 08:55:51AM -0700, Richard Henderson wrote:
 On 08/12/2015 07:02 AM, Segher Boessenkool wrote:
  On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
  @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int 
  budget)
 if (exam_mask (-1, c, sub_budget))
   return true;
   
  +  /* If the two halves are equal, use an insert.  */
   +  if (c >> 32 == test && exam_sub (test, sub_budget))
  +{
   +  opN (VEC_DUPLICATE, 0xffffffffu); /* RLDIMI */
  +  return true;
  +}
  
  Does this work for c with the high bit set?  I think you need
  to cast it to unsigned HOST_WIDE_INT first?
 
 Indeed, a sign-extension works better.  It means the base constant will use
 LIS+ORIS without trying to create an unsigned version.

Patch 8/15 changes this so that test is assigned the sign-extended low
32 bits right before this code; that should work just fine.

 If you're talking about ubsan sort of restrictions on shifting signed
 constants...  I choose to totally ignore that.

Good plan.  We rely on arithmetic shifts rounding towards negative
infinity, and so does the rest of the world.

 Certainly no where else in gcc
 has been audited for that, beginning with hwint.h itself.

Yes.  And there are much worse problems, like many things not working
right if your HOST_WIDE_INT would happen to be more than 64 bits; we
cannot really shake those out because there is no actual system to
test that on -- but it also doesn't actually matter, because there is
no system to run it on :-)


Segher


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2015 at 03:31:48AM -0500, Segher Boessenkool wrote:
* This is the only platform for which I bothered collecting any sort
  of performance data:
  
  As best I can tell, there is a 9% improvement in bootstrap speed
  for ppc64.  That is, 10 minutes off the original 109 minute build.
 
 That is, wow.  Wow :-)

Bootstrap + 4 regtests of a virgin trunk du jour took me 127m;
with your patches, 130m.  So I'm not seeing that improvement (but
no regression either).  This is gcc110 fwiw.


Segher


Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread David Malcolm
On Thu, 2015-08-13 at 01:58 +0200, Markus Trippelsdorf wrote:
 On 2015.08.12 at 13:03 -0600, Jeff Law wrote:
  On 08/12/2015 12:57 PM, Richard Sandiford wrote:
   Jeff Law l...@redhat.com writes:
   On 08/10/2015 06:05 AM, tbsaunde+...@tbsaunde.org wrote:
   From: Trevor Saunders tbsaunde+...@tbsaunde.org
  
   Hi,
  
   In many places gcc puts classes in the anon namespace so the compiler 
   can tell
   they do not get inheritted from to enable better devirtualization.  
   However
   debugging code in the anon namespace can be a pain, and the same thing 
   can be
   accomplished more directly by marking the classes as final.  When
   bootstrapping
   stage 3 should always be built in C++14 mode now, and of course will 
   always be
   newer than gcc 4.7, so these classes will always be marked as final 
   there.
   AIUI cross compilers are supposed to be built with recent gcc, which I 
   would
   tend to think implies newer than 4.7, so they should also be built with 
   these
   classes marked as final.  I believe that means in all important cases
   this works just as well as the anon namespace.
  
   bootstrapped + regtested on x86_64-linux-gnu, ok?
  
   Trev
  
  
   gcc/ChangeLog:
  
   2015-08-10  Trevor Saunders  tbsau...@tbsaunde.org
  
   * compare-elim.c, dce.c, dse.c, gimple-ssa-isolate-paths.c,
   gimple-ssa-strength-reduction.c, graphite.c, init-regs.c,
   ipa-pure-const.c, ipa-visibility.c, ipa.c, mode-switching.c,
   omp-low.c, reorg.c, sanopt.c, trans-mem.c, tree-eh.c,
   tree-if-conv.c, tree-ssa-copyrename.c, tree-ssa-dce.c,
   tree-ssa-dom.c, tree-ssa-dse.c, tree-ssa-forwprop.c,
   tree-ssa-sink.c, tree-ssanames.c, tree-stdarg.c, 
   tree-tailcall.c,
   tree-vect-generic.c, tree.c, ubsan.c, var-tracking.c,
   vtable-verify.c, web.c: Use GCC_FINAL instead of the anonymous
   namespace.
   OK.
  
   I was hoping someone else was going to speak up since I seem
   to have been posting a few negative messages recently, but I think
   this is really a step in the wrong direction.  I think the code
   was using anonymous namespaces in exactly the way they were
   intended to be used.
  No need to worry about seeming to be negative.
  
  
  The problem is you can't get to stuff in the anonymous namespace easily 
  in the debugger.  There was talk of fixing that, but I don't think it 
  ever went forward on the gdb side.
  
  If gdb were to get fixed so that debugging this stuff was easier, then 
  I'd fully support putting things back into the anonymous namespace.
 
 For the record here the gdb bug in question:
 https://sourceware.org/bugzilla/show_bug.cgi?id=16874
 It even has a patch attached, that improves the issue.

Sadly that seems to have stalled.

FWIW, our gcc/gdbhooks.py adds in a break-on-pass gdb command to work
around this:
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02011.html

  (gdb) break-on-pass pass_final
  Breakpoint 6 at 0x8396ba: file ../../src/gcc/final.c, line 4526.
  (gdb) cont
  Continuing.
  Breakpoint 6, (anonymous namespace)::pass_final::execute (this=0x17fb990) at 
../../src/gcc/final.c:4526
  4526virtual unsigned int execute (function *) { return 
rest_of_handle_final (); }
  (gdb)

Though clearly hacking around it this way is suboptimal.

Dave



Go patch committed: Ignore erroneous trees in flatten

2015-08-12 Thread Ian Lance Taylor
This patch by Chris Manghane ignores erroneous trees in the flatten
pass.  This avoids various compiler crashes on invalid input, and
fixes https://golang.org/issue/11536 , https://golang.org/issue/11558
, and https://golang.org/issue/11559 .  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 226825)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5fc38e74d132cd6f4e7b56e6bcf9fe57031ab203
+fc9da313b4f5c13b4ac3b98e699fd1c89613
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 226596)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -3101,6 +3101,12 @@ Expression*
 Type_conversion_expression::do_flatten(Gogo*, Named_object*,
Statement_inserter* inserter)
 {
+  if (this->type()->is_error_type() || this->expr_->is_error_expression())
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(this->location());
+    }
+
   if (((this->type()->is_string_type()
         && this->expr_->type()->is_slice_type())
        || this->expr_->type()->interface_type() != NULL)
@@ -3585,8 +3591,13 @@ Expression*
 Unary_expression::do_flatten(Gogo* gogo, Named_object*,
  Statement_inserter* inserter)
 {
-  if (this->is_error_expression() || this->expr_->is_error_expression())
-    return Expression::make_error(this->location());
+  if (this->is_error_expression()
+      || this->expr_->is_error_expression()
+      || this->expr_->type()->is_error_type())
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(this->location());
+    }
 
   Location location = this->location();
   if (this->op_ == OPERATOR_MULT
@@ -5062,10 +5073,16 @@ Expression*
 Binary_expression::do_flatten(Gogo* gogo, Named_object*,
   Statement_inserter* inserter)
 {
-  if (this->classification() == EXPRESSION_ERROR)
-    return this;
-
   Location loc = this->location();
+  if (this->left_->type()->is_error_type()
+      || this->right_->type()->is_error_type()
+      || this->left_->is_error_expression()
+      || this->right_->is_error_expression())
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(loc);
+    }
+
   Temporary_statement* temp;
   if (this->left_->type()->is_string_type()
       && this->op_ == OPERATOR_PLUS)
@@ -6806,6 +6823,11 @@ Builtin_call_expression::do_flatten(Gogo
 Statement_inserter* inserter)
 {
   Location loc = this->location();
+  if (this->is_erroneous_call())
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(loc);
+    }
 
   switch (this->code_)
 {
@@ -8733,8 +8755,11 @@ Expression*
 Call_expression::do_flatten(Gogo* gogo, Named_object*,
Statement_inserter* inserter)
 {
-  if (this->classification() == EXPRESSION_ERROR)
-    return this;
+  if (this->is_erroneous_call())
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(this->location());
+    }
 
   if (this->is_flattened_)
     return this;
@@ -8902,6 +8927,27 @@ Call_expression::issue_error()
 }
 }
 
+// Whether or not this call contains errors, either in the call or the
+// arguments to the call.
+
+bool
+Call_expression::is_erroneous_call()
+{
+  if (this->is_error_expression() || this->fn()->is_error_expression())
+    return true;
+
+  if (this->args() == NULL)
+    return false;
+  for (Expression_list::iterator pa = this->args()->begin();
+       pa != this->args()->end();
+       ++pa)
+    {
+      if ((*pa)->type()->is_error_type() || (*pa)->is_error_expression())
+        return true;
+    }
+  return false;
+}
+
 // Get the type.
 
 Type*
@@ -9848,30 +9894,47 @@ Array_index_expression::do_flatten(Gogo*
Statement_inserter* inserter)
 {
   Location loc = this->location();
+  Expression* array = this->array_;
+  Expression* start = this->start_;
+  Expression* end = this->end_;
+  Expression* cap = this->cap_;
+  if (array->is_error_expression()
+      || array->type()->is_error_type()
+      || start->is_error_expression()
+      || start->type()->is_error_type()
+      || (end != NULL
+          && (end->is_error_expression() || end->type()->is_error_type()))
+      || (cap != NULL
+          && (cap->is_error_expression() || cap->type()->is_error_type())))
+    {
+      go_assert(saw_errors());
+      return Expression::make_error(loc);
+    }
+
   Temporary_statement* temp;
-  if (this->array_->type()->is_slice_type() && !this->array_->is_variable())
     {
-      temp = Statement::make_temporary(NULL, this->array_, loc);
-  temp = Statement::make_temporary(NULL, this-array_, loc);
+  temp = 

Re: [RFC] COMDAT Safe Module Level Multi versioning

2015-08-12 Thread Sriraman Tallam
On Tue, Aug 4, 2015 at 11:43 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Jun 16, 2015 at 4:22 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, May 19, 2015 at 9:11 AM, Xinliang David Li davi...@google.com 
 wrote:

 Hm.  But which options are unsafe?  Also wouldn't it be better to simply
 _not_ have unsafe options produce comdats but always make local clones
 for them (thus emit the comdat with unsafe flags dropped)?

 Always localizing comdat functions may lead to text size increase.  It
 does not work if the comdat function is a virtual function, for
 instance.

 Based on Richard's suggestion, I have a patch to localize comdat
 functions which seems like a very effective solution to this problem.
 The text size increase is limited to the extra comdat copies generated
 for the specialized modules (modules with unsafe options) which is
 usually only a few.   Since -fweak does something similar for
 functions,  I have called the new option -fweak-comdat-functions.
 This does not apply to virtual comdat functions as their addresses can
 always be leaked via the vtable. Using this flag with virtual
 functions generates a warning.

 To summarize, this is the intended usage of this option. Modules which
 use unsafe code options, like -misa for multiversioning, to generate
 code that is meant to run only on a subset of CPUs can generate
 comdats with specialized instructions which when picked by the linker
 can get run unconditionally causing SIGILL on unsupported platforms.
 This flag hides these comdats to be local to these modules and not
 make them available publicly,  with the caveat that it does not apply
 to virtual comdats.
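
 A hypothetical illustration of the hazard (the file names and the choice
 of -mavx2 are made up, not from the patch):

 // common.h -- an inline (COMDAT) function; every TU that uses it emits
 // a copy and the linker keeps exactly one.
 inline int
 scale (int n)
 {
   int r = 0;
   for (int i = 0; i < n; ++i)
     r += i * 3;                 // loop the compiler may vectorize
   return r;
 }

 // a.cc is compiled with plain -O2, b.cc with -O2 -mavx2.  If the linker
 // keeps b.cc's AVX2-using copy of scale(), a call from a.cc executes AVX2
 // instructions unconditionally and can SIGILL on a CPU without AVX2.
 // With -fno-weak-comdat-functions each module keeps its own local copy,
 // so the specialized copy cannot leak into the generic module.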

 Could you please review?

 Ping.  This patch uses Richard's suggestion to localize comdat
 functions with option -fno-weak-comdat-functions.  Comments?


Ping.

* c-family/c.opt (fweak-comdat-functions): New option.
* cp/decl2.c (comdat_linkage): Implement new option.  Warn when
virtual comdat functions are seen.
* doc/invoke.texi: Document new option.
* testsuite/g++.dg/no-weak-comdat-functions-1.C: New test.


 * c-family/c.opt (fweak-comdat-functions): New option.
 * cp/decl2.c (comdat_linkage): Implement new option.  Warn when
 virtual comdat functions are seen.
 * doc/invoke.texi: Document new option.
 * testsuite/g++.dg/no-weak-comdat-functions-1.C: New test.



 * c-family/c.opt (fweak-comdat-functions): New option.
 * cp/decl2.c (comdat_linkage): Implement new option.  Warn when
 virtual comdat functions are seen.
 * doc/invoke.texi: Document new option.
 * testsuite/g++.dg/no-weak-comdat-functions-1.C: New test.


 Thanks
 Sri



 David



 Richard.


 Thanks
 Sri
* c-family/c.opt (fweak-comdat-functions): New option.
* cp/decl2.c (comdat_linkage): Implement new option.  Warn when
virtual comdat functions are seen.
* doc/invoke.texi: Document new option.
* testsuite/g++.dg/no-weak-comdat-functions-1.C: New test.

Index: c-family/c.opt
===
--- c-family/c.opt  (revision 224486)
+++ c-family/c.opt  (working copy)
@@ -1236,6 +1236,14 @@ fweak
 C++ ObjC++ Var(flag_weak) Init(1)
 Emit common-like symbols as weak symbols
 
+fweak-comdat-functions
+C++ Var(flag_weak_comdat_functions) Init(1)
+Specific to comdat functions(-fno-weak-comdat-functions : Localize Comdat 
Functions).
+With -fno-weak-comdat-functions, virtual comdat functions are still linked as
+weak functions.  With -fno-weak-comdat-functions, the address of the comdat
+functions that are localized will be unique and this can cause unintended
+behavior when addresses of comdat functions are used.
+
 fwide-exec-charset=
 C ObjC C++ ObjC++ Joined RejectNegative
 -fwide-exec-charset=cset Convert all wide strings and character 
constants to character set cset
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 224486)
+++ cp/decl2.c  (working copy)
@@ -1702,8 +1702,19 @@ mark_vtable_entries (tree decl)
 void
 comdat_linkage (tree decl)
 {
-  if (flag_weak)
-make_decl_one_only (decl, cxx_comdat_group (decl));
+  if (flag_weak
+      && (flag_weak_comdat_functions
+          || TREE_CODE (decl) != FUNCTION_DECL
+          || DECL_VIRTUAL_P (decl)))
+    {
+      make_decl_one_only (decl, cxx_comdat_group (decl));
+      if (TREE_CODE (decl) == FUNCTION_DECL
+          && DECL_VIRTUAL_P (decl)
+          && !flag_weak_comdat_functions)
+   warning_at (DECL_SOURCE_LOCATION (decl), 0,
+               "fno-weak-comdat-functions: Comdat linkage of virtual "
+               "function %q#D preserved.");
+}
   else if (TREE_CODE (decl) == FUNCTION_DECL
    || (VAR_P (decl) && DECL_ARTIFICIAL (decl)))
 /* We can just emit function and compiler-generated variables
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 224486)
+++ doc/invoke.texi (working copy)

Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2015 at 08:32:46AM -0700, Richard Henderson wrote:
 On 08/12/2015 01:31 AM, Segher Boessenkool wrote:
  Is there something that makes the cache not get too big?  Do we
  care, anyway?
 
 No, nothing ever reduces the size of the cache.  I doubt we care, but I 
 haven't
 instrumented to see how big it grows.
 
 My intuition says the most important thing about managing this cache is not to
 put the most common trivial constants in, and I already do that.

Right.  And it seems to cache negative results too (the five-insn
sequence).

  I did attempt to fix it, and got nothing for my troubles except
  poorer code generation for AND/IOR/XOR with non-trivial constants.
  
  Could you give an example of code that isn't split early enough?
 
 I believe the examples I was seeing was in the libdecnumber code.  I'd have to
 go back and reproduce it now...

If you could, please do.

  Perhaps there's some other predication or costing error that's
  getting in the way, and it simply wasn't obvious to me.   In any
  case, nothing in this patch set addresses this at all.
  
  The instruction (set (reg) (const_int 0x12345678)) is costed as 4
  (i.e. one insn).  That cannot be good.  This is alternative #5 in
  *movsi_internal1_single (there are many more variants of that
  pattern).
 
 Yes, when I tried to fix it, I did adjust that costing, but still...

I misread it (it's alt #6, with cost 8).  Maybe Alan's patches would
fix this one?

* Constants are split *really* late.  In particular, after reload.
  
  Yeah that is bad.  But I'm still not seeing it.  Hrm, maybe only
  DImode ones?
 
 Dunno.  I found it after I did add a recipe to use ADDI, which then triggered
 an ICE due to the r0 base register.

Ah!  Reload generates *new* addition insns out of thin air, and that
is exactly the case where ADDI won't work.  LRA works a bit better
there it seems (but it is still not the default for rs6000); if the
constraint (b in this case) would not match, it tries other things.

  Have you looked at generated code quality?
 
 I've looked at diffs of all of the object files in the target directory.  
 There
 were definite spots of improvement.  I wasn't able to spot any regressions.

[ I've looked now. ]

Some of the new combos help a bit, yes.  Total code size increased
a tiny bit; it looks to be all because of unfortunate scheduling and
less tail merging.  The usual.

32-bit code is identical, as it should be ;-)


Segher


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2015 at 04:59:27PM +0100, Wilco Dijkstra wrote:
 However it also creates new dependencies that may not be desirable
 (such as hash table size, algorithm used etc).
 
  Some notes about ppc64 in particular:
  
* Constants aren't split until quite late, preventing all hope of
  CSE'ing portions of the generated code.  My gut feeling is that
  this is in general a mistake, but...
 
 Late split is best in general as you want to CSE the original constants,
 not parts of the expansion (which would be very rarely possible).

CSE handles the REG_EQ* notes just fine?  For rs6000 many constants
are split at expansion already, and I never saw problems from that?
Maybe I haven't looked at the right (wrong :-) ) testcases though.

  Comments?  Especially on the shared header?
 
 I'm not convinced the amount of code that could be shared is enough to be
 worthwhile.

I agree.

 Also the way it is written makes the immediate generation more 
 complex and likely consuming a lot of memory (each cached immediate requires
 at least 64 bytes).

It would take a bit less memory if it used pointers (to other cache elts)
for sub-expressions, but that makes cache invalidation more interesting,
and shared (within one constant) subexpressions non-trivial.

 It is not obvious to me why it is a good idea to hide the
 simple/fast cases behind the hashing scheme - it seems better that the backend
 explicitly decides which cases should be cached.

Without going through the abstraction you mean?  I'd rather there was
no such abstraction at all :-)

 I looked at the statistics of AArch64 immediate generation a while ago. 
 The interesting thing is ~95% of calls are queries, and the same query is on 
 average repeated 10 times in a row.

Huh.

 So (a) it is not important to cache the expansions,

It also stores the expansion from analysis until emission time.  This
simplifies the code a lot.

Maybe that can be done without caching everything (and still not be
too expensive), dunno.

 and (b) the high repetition rate means a single-entry cache
 has a 90% hitrate.

And a 10% miss rate...

 We already have a patch for this and could collect stats
 comparing the approaches. If a single-entry cache can provide a similar 
 benefit as caching all immediates then my preference would be to keep things
 simple and just cache the last query.
 
 Note the many repeated queries indicate a performance issue at a much higher 
 level (repeated cost queries on the same unchanged RTL), and solving that 
 problem will likely improve buildtime for all targets.

You cannot see if RTL has changed from the pointer to it, so we'd have
to store that info (the unchanged info, or the cost info) somewhere.
Maybe it would be useful to store it in the RTL itself?
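
For illustration, a minimal last-query cache of the kind discussed above
could look like this (a sketch with invented names, not code from the
series):

/* Sketch only: single-entry cache for immediate-cost queries;
   compute_immediate_cost stands in for the target's real search.  */
static HOST_WIDE_INT cached_val;
static int cached_cost = -1;    /* -1 means nothing cached yet.  */

static int
immediate_cost (HOST_WIDE_INT val)
{
  if (cached_cost >= 0 && val == cached_val)
    return cached_cost;         /* the repeated-query case */
  cached_val = val;
  cached_cost = compute_immediate_cost (val);
  return cached_cost;
}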


Segher


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Earnshaw
On 12/08/15 09:32, Richard Earnshaw wrote:
 On 12/08/15 02:11, Richard Henderson wrote:
 Something last week had me looking at ppc64 code generation,
 and some of what I saw was fairly bad.  Fixing it wasn't going
 to be easy, due to the fact that the logic for generating
 constants wasn't contained within a single function.

 Better is the way that aarch64 and alpha have done it in the
 past, sharing a single function with all of the logic that
 can be used for both cost calculation and the actual emission
 of the constants.

 However, the way that aarch64 and alpha have done it hasn't
 been ideal, in that there's a fairly costly search that must
 be done every time.  I've thought before about changing this
 so that we would be able to cache results, akin to how we do
 it in expmed.c for multiplication.

 I've implemented such a caching scheme for three targets, as
 a test of how much code could be shared.  The answer appears
 to be about 100 lines of boiler-plate.  Minimal, true, but it
 may still be worth it as a way of encouraging backends to do
 similar things in a similar way.

 
 I've got a short week this week, so won't have time to look at this in
 detail for a while.  So a bunch of questions... but not necessarily
 objections :-)
 
 How do we clear the cache, and when?  For example, on ARM, switching
 between ARM and Thumb state means we need to generate potentially
 radically different sequences?  We can do such splitting at function
 boundaries now.
 
 Can we generate different sequences for hot/cold code within a single
 function?
 
 Can we cache sequences with the context (eg use with AND, OR, ADD, etc)?
 
 
 Some notes about ppc64 in particular:

   * Constants aren't split until quite late, preventing all hope of
 CSE'ing portions of the generated code.  My gut feeling is that
 this is in general a mistake, but...

 I did attempt to fix it, and got nothing for my troubles except
 poorer code generation for AND/IOR/XOR with non-trivial constants.

 On AArch64 in particular, building complex constants is generally
 destructive on the source register (if you want to preserve intermediate
 values you have to make intermediate copies); that's clearly never going
 to be a win if you don't need at least 3 instructions to form the
 constant.
 
 There might be some cases where you could form a second constant as a
 difference from an earlier one, but that then creates data-flow
 dependencies and in OoO machines that might not be worth-while.  Even
 for in-order machines it can restrict scheduling and result in worse code.
 
 
 I'm somewhat surprised that the operands to the logicals aren't
 visible at rtl generation time, given all the work done in gimple.
 And failing that, combine has enough REG_EQUAL notes that it ought
 to be able to put things back together and see the simpler pattern.

 
 We've tried it in the past.  Exposing the individual steps prevents the
 higher-level rtl-based optimizations since they can no-longer deal with
 the complete sub-expression.

Eg. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724

R.



[RFC] Add check_effective_target_vect_min_max

2015-08-12 Thread Tom de Vries

[ was: Re: [committed, testsuite] Fix vect/trapv-vect-reduc-4.c for sparc  ]

On 12/08/15 07:57, Tom de Vries wrote:

Hi,

For sparc, the scan for 'vectorized 2 loops' in
vect/trapv-vect-reduc-4.c fail because the vector instructions min and
max are not supported.

Tested on x86_64. Tested on sparc by Rainer.



This follow-up patch introduces a new effective target vect_min_max, 
similar to how effective target vect_bswap is implemented.
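
For context, the construct the new keyword gates is a min/max reduction
loop like the one below (an illustrative sketch, not the testcase itself):

/* Illustrative only: a MIN reduction the vectorizer can handle only if
   the target has vector min/max instructions.  */
int
vmin (const int *a, int n)
{
  int m = a[0];
  for (int i = 1; i < n; i++)
    m = a[i] < m ? a[i] : m;
  return m;
}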


Any comments?

Thanks,
- Tom

Add check_effective_target_vect_min_max

2015-08-12  Tom de Vries  t...@codesourcery.com

	* lib/target-supports.exp (check_effective_target_vect_min_max): New
	proc.
	* gcc.dg/vect/trapv-vect-reduc-4.c: Use vect_min_max effective target.
---
 gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c |  2 +-
 gcc/testsuite/lib/target-supports.exp  | 18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
index 8fd353c..1624696 100644
--- a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
@@ -47,4 +47,4 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times Detected reduction\\. 2 vect  } } */
-/* { dg-final { scan-tree-dump-times vectorized 2 loops 1 vect { target { ! { sparc*-*-* } } } } } */
+/* { dg-final { scan-tree-dump-times vectorized 2 loops 1 vect { target { vect_min_max } } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 1988301..c585e5f 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3568,6 +3568,24 @@ proc check_effective_target_vect_bswap { } {
 return $et_vect_bswap_saved
 }
 
+# Return 1 if the target supports vector min/max operations.
+
+proc check_effective_target_vect_min_max { } {
+global et_vect_min_max_saved
+
+if [info exists et_vect_min_max_saved] {
+	verbose check_effective_target_vect_min_max: using cached result 2
+} else {
+	set et_vect_min_max_saved 1
+	if { [istarget sparc*-*-*] } {
+	   set et_vect_min_max_saved 0
+	}
+}
+
+verbose check_effective_target_vect_min_max: returning $et_vect_min_max_saved 2
+return $et_vect_min_max_saved
+}
+
 # Return 1 if the target supports hardware vector shift operation for char.
 
 proc check_effective_target_vect_shift_char { } {
-- 
1.9.1



Re: [RFC] Add check_effective_target_vect_min_max

2015-08-12 Thread Rainer Orth
Tom de Vries tom_devr...@mentor.com writes:

 This follow-up patch introduces a new effective target vect_min_max,
 similar to how effective target vect_bswap is implemented.

 Any comments?

 Thanks,
 - Tom

 Add check_effective_target_vect_min_max

 2015-08-12  Tom de Vries  t...@codesourcery.com

   * lib/target-supports.exp (check_effective_target_vect_min_max): New
   proc.
   * gcc.dg/vect/trapv-vect-reduc-4.c: Use vect_min_max effective target.

Looks good to me, but the new effective-target keyword needs documenting
in sourcebuild.texi.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [optimize3/3] VRP min/max exprs

2015-08-12 Thread Richard Biener
On Tue, 11 Aug 2015, Nathan Sidwell wrote:

 On 08/11/15 07:41, Richard Biener wrote:
 
  The patch looks good.  Note that with SSA name operands it can be
  still profitable to do compare_range_with_value, esp. if the other
  operand has a symbolical range.  See
  vrp_evaluate_conditional_warnv_with_ops_using_ranges which actually
  implements a nice helper for evaluating comparisons for code-gen
  transforms.
 
 Ok, like this?  Thanks for your help!
 
 tested on x86_64-linux.

Ok.

Thanks,
Richard.


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Earnshaw
On 12/08/15 02:11, Richard Henderson wrote:
 Something last week had me looking at ppc64 code generation,
 and some of what I saw was fairly bad.  Fixing it wasn't going
 to be easy, due to the fact that the logic for generating
 constants wasn't contained within a single function.
 
 Better is the way that aarch64 and alpha have done it in the
 past, sharing a single function with all of the logic that
 can be used for both cost calculation and the actual emission
 of the constants.
 
 However, the way that aarch64 and alpha have done it hasn't
 been ideal, in that there's a fairly costly search that must
 be done every time.  I've thought before about changing this
 so that we would be able to cache results, akin to how we do
 it in expmed.c for multiplication.
 
 I've implemented such a caching scheme for three targets, as
 a test of how much code could be shared.  The answer appears
 to be about 100 lines of boiler-plate.  Minimal, true, but it
 may still be worth it as a way of encouraging backends to do
 similar things in a similar way.
 

I've got a short week this week, so won't have time to look at this in
detail for a while.  So a bunch of questions... but not necessarily
objections :-)

How do we clear the cache, and when?  For example, on ARM, switching
between ARM and Thumb state means we need to generate potentially
radically different sequences?  We can do such splitting at function
boundaries now.

Can we generate different sequences for hot/cold code within a single
function?

Can we cache sequences with the context (eg use with AND, OR, ADD, etc)?


 Some notes about ppc64 in particular:
 
   * Constants aren't split until quite late, preventing all hope of
 CSE'ing portions of the generated code.  My gut feeling is that
 this is in general a mistake, but...
 
 I did attempt to fix it, and got nothing for my troubles except
 poorer code generation for AND/IOR/XOR with non-trivial constants.
 
On AArch64 in particular, building complex constants is generally
destructive on the source register (if you want to preserve intermediate
values you have to make intermediate copies); that's clearly never going
to be a win if you don't need at least 3 instructions to form the
constant.

There might be some cases where you could form a second constant as a
difference from an earlier one, but that then creates data-flow
dependencies and in OoO machines that might not be worth-while.  Even
for in-order machines it can restrict scheduling and result in worse code.


 I'm somewhat surprised that the operands to the logicals aren't
 visible at rtl generation time, given all the work done in gimple.
 And failing that, combine has enough REG_EQUAL notes that it ought
 to be able to put things back together and see the simpler pattern.
 

We've tried it in the past.  Exposing the individual steps prevents the
higher-level rtl-based optimizations since they can no-longer deal with
the complete sub-expression.

 Perhaps there's some other predication or costing error that's
 getting in the way, and it simply wasn't obvious to me.   In any
 case, nothing in this patch set addresses this at all.
 
   * I go on to add 4 new methods of generating a constant, each of
 which typically saves 2 insns over the current algorithm.  There
 are a couple more that might be useful but...
 
   * Constants are split *really* late.  In particular, after reload.
 It would be awesome if we could at least have them all split before
 register allocation so that we arrange to use ADDI and ADDIS when
 that could save a few instructions.  But that does of course mean
 avoiding r0 for the input.  Again, nothing here attempts to change
 when constants are split.
 

certainly in the ARM port we try to split immediately before register
allocation, that way we can be sure that we have scratch registers
available if that helps with generating more efficient sequences.

R.

   * This is the only platform for which I bothered collecting any sort
 of performance data:
 
 As best I can tell, there is a 9% improvement in bootstrap speed
 for ppc64.  That is, 10 minutes off the original 109 minute build.
 
 For aarch64 and alpha, I simply assumed there would be no loss,
 since the basic search algorithm is unchanged for each.
 
 Comments?  Especially on the shared header?
 
 
 r~
 
 Cc: David Edelsohn dje@gmail.com
 Cc: Marcus Shawcroft marcus.shawcr...@arm.com
 Cc: Richard Earnshaw richard.earns...@arm.com
 
 Richard Henderson (15):
   rs6000: Split out rs6000_is_valid_and_mask_wide
   rs6000: Make num_insns_constant_wide static
   rs6000: Tidy num_insns_constant vs CONST_DOUBLE
   rs6000: Implement set_const_data infrastructure
   rs6000: Move constant via mask into build_set_const_data
   rs6000: Use rldiwi in constant construction
   rs6000: Generalize left shift in constant generation
   rs6000: Generalize masking in constant generation
   

Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Segher Boessenkool
Hi!

This looks really nice.  I'll try it out soon :-)

Some comments now...


On Tue, Aug 11, 2015 at 06:11:29PM -0700, Richard Henderson wrote:
 However, the way that aarch64 and alpha have done it hasn't
 been ideal, in that there's a fairly costly search that must
 be done every time.  I've thought before about changing this
 so that we would be able to cache results, akin to how we do
 it in expmed.c for multiplication.

Is there something that makes the cache not get too big?  Do we
care, anyway?

 Some notes about ppc64 in particular:
 
   * Constants aren't split until quite late, preventing all hope of
 CSE'ing portions of the generated code.  My gut feeling is that
 this is in general a mistake, but...

Constant arguments to IOR/XOR/AND that can be done with two machine
insns are split at expand.  Then combine comes along and just loves
to recombine them, but then they are split again at split1 (before
RA).

For AND this was optimal in my experiments; for IOR/XOR it has been
this way since the dawn of time.

Simple SETs aren't split at expand, maybe they should be.  But they
are split at split1.

 I did attempt to fix it, and got nothing for my troubles except
 poorer code generation for AND/IOR/XOR with non-trivial constants.

Could you give an example of code that isn't split early enough?

 I'm somewhat surprised that the operands to the logicals aren't
 visible at rtl generation time, given all the work done in gimple.

So am I, because that is not what I'm seeing?  E.g.

int f(int x) { return x | 0x12345678; }

is expanded as two IORs already.  There must be something in your
testcases that prevents this?
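
Roughly, the two-IOR expansion referred to here looks like this (a sketch;
the exact immediates depend on the target's ranges):

/* Sketch of the early split: the single IOR with 0x12345678 becomes
   two IORs whose immediates fit an oris/ori-style pair.  */
int
f_split (int x)
{
  int t = x | 0x12340000;   /* high halfword */
  return t | 0x00005678;    /* low halfword  */
}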

 And failing that, combine has enough REG_EQUAL notes that it ought
 to be able to put things back together and see the simpler pattern.
 
 Perhaps there's some other predication or costing error that's
 getting in the way, and it simply wasn't obvious to me.   In any
 case, nothing in this patch set addresses this at all.

The instruction (set (reg) (const_int 0x12345678)) is costed as 4
(i.e. one insn).  That cannot be good.  This is alternative #5 in
*movsi_internal1_single (there are many more variants of that
pattern).

   * I go on to add 4 new methods of generating a constant, each of
 which typically saves 2 insns over the current algorithm.  There
 are a couple more that might be useful but...

New methods look to be really simple to add with your framework,
very nice :-)

   * Constants are split *really* late.  In particular, after reload.

Yeah that is bad.  But I'm still not seeing it.  Hrm, maybe only
DImode ones?

 It would be awesome if we could at least have them all split before
 register allocation

And before sched1, yeah.

 so that we arrange to use ADDI and ADDIS when
 that could save a few instructions.  But that does of course mean
 avoiding r0 for the input.

That is no problem at all before RA.

 Again, nothing here attempts to change
 when constants are split.
 
   * This is the only platform for which I bothered collecting any sort
 of performance data:
 
 As best I can tell, there is a 9% improvement in bootstrap speed
 for ppc64.  That is, 10 minutes off the original 109 minute build.

That is, wow.  Wow :-)

Have you looked at generated code quality?


Segher


[PATCH][1/2] Make SCCVN use conditional equivalences

2015-08-12 Thread Richard Biener

This is the first patch in the series to make SCCVN able to remove
redundant comparisons by recording conditionals being true on
visited edges.  This part of the series fully transitions the
toplevel walk gathering entries to SCCs we value-number to the
DOM walk which at the moment only visits last stmts of BBs.
After this series we can safely insert expressions into the
SCCVN tables temporarily for uses dominated by the stmt generating
them.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-08-12  Richard Biener  rguent...@suse.de

* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
Eliminate edges marked as not executable by SCCVN.
* tree-ssa-sccvn.c: Include gimple-iterator.h.
(cond_dom_walker): Rename to sccvn_dom_walker.
(sccvn_dom_walker::before_dom_children): Value-number defs
of all stmts.
(run_scc_vn): Remove loop value-numbering all SSA names.
Drop not visited SSA names to varying.

* gcc.dg/tree-ssa/ssa-fre-43.c: Adjust.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 226777)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4291,7 +4291,31 @@ eliminate_dom_walker::before_dom_childre
   el_to_remove.safe_push (stmt);
  continue;
 }
-}
+   }
+
+      /* If this is a control statement value numbering left edges
+         unexecuted on force the condition in a way consistent with
+         that.  */
+      if (gcond *cond = dyn_cast <gcond *> (stmt))
+        {
+          if ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE)
+              ^ (EDGE_SUCC (b, 1)->flags & EDGE_EXECUTABLE))
+            {
+              if (dump_file && (dump_flags & TDF_DETAILS))
+                {
+                  fprintf (dump_file, "Removing unexecutable edge from ");
+                  print_gimple_stmt (dump_file, stmt, 0, 0);
+                }
+              if (((EDGE_SUCC (b, 0)->flags & EDGE_TRUE_VALUE) != 0)
+                  == ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE) != 0))
+                gimple_cond_make_true (cond);
+              else
+                gimple_cond_make_false (cond);
+              update_stmt (cond);
+              el_todo |= TODO_cleanup_cfg;
+              continue;
+            }
+        }
 
   bool can_make_abnormal_goto = stmt_can_make_abnormal_goto (stmt);
   bool was_noreturn = (is_gimple_call (stmt)
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 226777)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.
 #include tree-cfg.h
 #include domwalk.h
 #include cgraph.h
+#include gimple-iterator.h
 
 /* This algorithm is based on the SCC algorithm presented by Keith
Cooper and L. Taylor Simpson in SCC-Based Value numbering
@@ -4277,18 +4278,20 @@ set_hashtable_value_ids (void)
 set_value_id_for_result (vr-result, vr-value_id);
 }
 
-class cond_dom_walker : public dom_walker
+class sccvn_dom_walker : public dom_walker
 {
 public:
-  cond_dom_walker () : dom_walker (CDI_DOMINATORS), fail (false) {}
+  sccvn_dom_walker () : dom_walker (CDI_DOMINATORS), fail (false) {}
 
   virtual void before_dom_children (basic_block);
 
   bool fail;
 };
 
+/* Value number all statements in BB.  */
+
 void
-cond_dom_walker::before_dom_children (basic_block bb)
+sccvn_dom_walker::before_dom_children (basic_block bb)
 {
   edge e;
   edge_iterator ei;
@@ -4317,6 +4320,34 @@ cond_dom_walker::before_dom_children (ba
   return;
 }
 
+  /* Value-number all defs in the basic-block.  */
+  for (gphi_iterator gsi = gsi_start_phis (bb);
+   !gsi_end_p (gsi); gsi_next (gsi))
+{
+  gphi *phi = gsi.phi ();
+  tree res = PHI_RESULT (phi);
+      if (!VN_INFO (res)->visited
+          && !DFS (res))
+   {
+ fail = true;
+ return;
+   }
+}
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+   !gsi_end_p (gsi); gsi_next (gsi))
+{
+  ssa_op_iter i;
+  tree op;
+  FOR_EACH_SSA_TREE_OPERAND (op, gsi_stmt (gsi), i, SSA_OP_ALL_DEFS)
+        if (!VN_INFO (op)->visited
+            && !DFS (op))
+ {
+   fail = true;
+   return;
+ }
+}
+
+  /* Finally look at the last stmt.  */
   gimple stmt = last_stmt (bb);
   if (!stmt)
 return;
@@ -4329,8 +4360,7 @@ cond_dom_walker::before_dom_children (ba
 
   if (dump_file  (dump_flags  TDF_DETAILS))
 {
-      fprintf (dump_file, "Value-numbering operands of stmt ending BB %d: ",
-               bb->index);
+      fprintf (dump_file, "Visiting stmt ending BB %d: ", bb->index);
   print_gimple_stmt (dump_file, stmt, 0, 0);
 }
 
@@ -4338,12 +4368,8 @@ cond_dom_walker::before_dom_children (ba
   ssa_op_iter i;
   tree op;
   FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_USE)
-if (VN_INFO (op)-visited == false
-  

Re: [PATCH][RFC] Re-work GIMPLE checking to be gimple class friendly (even better)

2015-08-12 Thread Richard Biener
On Mon, 3 Aug 2015, Richard Biener wrote:

 On Mon, 27 Jul 2015, Richard Biener wrote:
 
  On Mon, 27 Jul 2015, Richard Biener wrote:
  
   On Mon, 27 Jul 2015, Richard Biener wrote:
   

I noticed that the code we generate for a simple gimple_assign_rhs1 
(stmt)
is quite bad as we have two checking pieces.  The implementation is now

 static inline tree
 gimple_assign_rhs1 (const_gimple gs)
 {
   GIMPLE_CHECK (gs, GIMPLE_ASSIGN);
   return gimple_op (gs, 1);
 }

and the hidden checking is due to gimple_op being

static inline tree *
gimple_ops (gimple gs)
{
  size_t off;

  /* All the tuples have their operand vector at the very bottom
 of the structure.  Note that those structures that do not
 have an operand vector have a zero offset.  */
  off = gimple_ops_offset_[gimple_statement_structure (gs)];
  gcc_gimple_checking_assert (off != 0);

  return (tree *) ((char *) gs + off);

(ugly in itself considering that we just verified we have a gassign
which has an operand vector as member...).  gimple_op incurs two
additional memory reference to compute gimple_statement_structure
and gimple_ops_offset_ plus the checking bits.

The simplest thing would be to use

  GIMPLE_CHECK (gs, GIMPLE_ASSIGN);
  return as_a <const gassign *> (gs)->op[1];

but that's not optimal either because we check we have a gassign
again (not optimized in stage1).  So I decided to invent a fancy
as_a which reports errors similar to GIMPLE_CHECK and makes the
code look like

  const gassign *ass = GIMPLE_CHECK2 <const gassign *> (gs);
  return ass->op[1];

for some reason I needed to overload is_a_helper for const_gimple
(I thought the is_a machinery would transparently support qualified
types?).

I'm missing a fallback for !ENABLE_GIMPLE_CHECKING as well
as an overload for 'gimple' I guess.  I just changed
gimple_assign_rhs1 for now.

So this is a RFC.

I understand in the end the whole GCC may use gassign everywhere
we access gimple_assign_rhs1 but I don't see this happening soon
and this simple proof-of-concept patch already reduces unoptimized
text size of gimple-match.c from 836080 to 836359 (too bad) and
optimized from 585731 to 193190 bytes (yay!) on x86_64 using trunk.
Some inlining must go very differently - we just have 544 calls to
gimple_assign_rhs1 in gimple-match.c.  .optimizes shows all calls
remaining as

  bb 89:
  _ZL18gimple_assign_rhs1PK21gimple_statement_base.part.32 
(def_stmt_47);

which is

tree_node* gimple_assign_rhs1(const_gimple) (const struct 
gimple_statement_base * gs)
{
  bb 2:
  gimple_check_failed (gs_1(D), 
/space/rguenther/tramp3d/trunk/gcc/gimple.h[0], 2381, 
gimple_assign_rhs1[0], 6, 0);

}

so eventually we just do a better job with profile estimation.
   
   Ok, numbers are off because of a typo I introduced late.
   
+  if (ret)
+gimple_check_failed (gs, file, line, fun, T()-code_, ERROR_MARK);
   
   should be if (!ret) of course ...
   
   Optimized code size after the patch is 588878 (we still avoid
   a lot of code and the checking fails are still split out).  Not sure
   where the code size increase is from.  Unoptimized code size
   after the patch is 836233.
  
  And here is a more complete variant (covering all op accesses of
  gimple assigns).  stage2 cc1plus size drops from
  
 textdata bss dec hex filename
  24970991  73776 1392952 264377191936857 
  /abuild/rguenther/obj2/prev-gcc/cc1plus
  
  to
  
textdata bss dec hex filename
  24809991  69608 1388536 26268135190d1e7 
  ../obj/prev-gcc/cc1plus
 
 So we tried to come up with the reason that there are no gassign *
 overloads already in place.  And remembered sth about C++ and not
 working overload resolution and such.  But experiments show that
 all cases we tried are handled well (including const gassign * - gimple
 conversions).  So here is a variant of the patch adding those
 overloads and making the gimple variants forwarders with just the
 (new and fancy) GIMPLE checking.
 
 I'm leaving this for discussion again but plan to commit it this time
 if there is no further response (after Cauldron, that is).
 
 Btw, I wonder what happened to the gimple -> gimple * conversion.
 
 Bootstrap  regtest running on x86_64-unknown-linux-gnu.

Applied to trunk as r226802.

Richard.

 Richard.
 
 2015-07-27  Richard Biener  rguent...@suse.de
 
   * gimple.h (remove_pointer): New trait.
   (GIMPLE_CHECK2): New inline template function.
   (gassign::code_): New constant static member.
   (is_a_helper <const gassign *>): Add.
   (gimple_assign_lhs): Use GIMPLE_CHECK2 in the gimple overload
   and forward to a new gassign 

[gomp4] Merge trunk r226769 (2015-08-10) into gomp-4_0-branch

2015-08-12 Thread Thomas Schwinge
Hi!

Committed to gomp-4_0-branch in r226803:

commit e2f0fc69bc92b2232573cfafe5f4975551eb05e0
Merge: 66f94e8 b5c93b0
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Aug 12 09:34:27 2015 +

svn merge -r 225562:226769 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@226803 
138bc75d-0d04-0410-961f-82ee72b054a4


Grüße,
 Thomas




Re: [RFC] Add check_effective_target_vect_min_max

2015-08-12 Thread Richard Biener
On Wed, 12 Aug 2015, Rainer Orth wrote:

 Tom de Vries tom_devr...@mentor.com writes:
 
  On 12/08/15 10:51, Rainer Orth wrote:
  Tom de Vries tom_devr...@mentor.com writes:
 
  This follow-up patch introduces a new effective target vect_min_max,
  similar to how effective target vect_bswap is implemented.
 
  Any comments?
 
  Thanks,
  - Tom
 
  Add check_effective_target_vect_min_max
 
  2015-08-12  Tom de Vries  t...@codesourcery.com
 
* lib/target-supports.exp (check_effective_target_vect_min_max): New
proc.
* gcc.dg/vect/trapv-vect-reduc-4.c: Use vect_min_max effective target.
 
  Looks good to me, but the new effective-target keyword needs documenting
  in sourcebuild.texi.
 
 
  Hmm, in sourcebuild.texi I found:
  ...
  @item vect_no_int_max
  Target does not support a vector max instruction on @code{int}.
  ...
 
  That looks related. [ I also found a patch introducing vect_no_uint_max
  here: https://gcc.gnu.org/ml/gcc-patches/2010-01/msg00152.html. ]
 
  I'm not sure where to take it from here. Should I introduce
  vect_no_int_min, and use that in combination with vect_no_int_max?
 
 I'd say this is something for the vectorizer maintainers to decide.  Richi?

I expect the above is already effectively
vect_no_int_min as well (which target would support min but not max...?).

So after double-checking that you could rename it to vect_no_int_min_max.

Richard.


[Aarch64][1/3] Use atomic compare-and-swap instructions when available.

2015-08-12 Thread Matthew Wahab

ARMv8.1 adds instructions for atomic compare-and-swap with optional
memory ordering specifiers. This patch series adds the instructions to
GCC, making them available with -march=armv8.1-a or with
-march=armv8-a+lse, and using them to implement the __sync and __atomic
builtins.

This patch adds an internal TARGET_LSE macro, to check target support
for the atomic instructions.

Subsequent patches
- add and use atomic compare-and-swap instructions;
- add tests for the compare-and-swap;

Tested the series for aarch64-none-linux-gnu with native bootstrap and make
check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
ARMv8.1.

Ok for trunk?
Matthew

2015-08-12  Matthew Wahab  matthew.wa...@arm.com

* config/aarch64/aarch64.h (AARCH64_ISA_LSE): New.
(TARGET_LSE): New.
From 50a71e7572de7eb47c7913ae3d0984cd61f8d425 Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Mon, 20 Jul 2015 12:42:52 +0100
Subject: [PATCH 1/3] Add LSE flag testing macro.

Change-Id: I4bfeb5ec617f13026fb7dcd28a9b5c7efa4c1719
---
 gcc/config/aarch64/aarch64.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 721927f..d6c7a74 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -156,6 +156,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO)
 #define AARCH64_ISA_FP     (aarch64_isa_flags & AARCH64_FL_FP)
 #define AARCH64_ISA_SIMD   (aarch64_isa_flags & AARCH64_FL_SIMD)
+#define AARCH64_ISA_LSE    (aarch64_isa_flags & AARCH64_FL_LSE)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -163,6 +164,9 @@ extern unsigned aarch64_architecture_version;
 /* CRC instructions that can be enabled through +crc arch extension.  */
 #define TARGET_CRC32 (AARCH64_ISA_CRC)
 
+/* Atomic instructions that can be enabled through the +lse extension.  */
+#define TARGET_LSE (AARCH64_ISA_LSE)
+
 /* Make sure this is always defined so we don't have to check for ifdefs
but rather use normal ifs.  */
 #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
-- 
1.9.1



Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Richard Biener
On Wed, Aug 12, 2015 at 1:07 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
 On 2015.08.12 at 13:01 +0200, Richard Biener wrote:
 On Tue, Aug 11, 2015 at 9:49 PM, Jeff Law l...@redhat.com wrote:
  On 08/06/2015 04:25 AM, Mikael Morin wrote:
 
  Hello,
 
  this avoids an error found with bootstrap-ubsan.
  Regression tested on x86_64-unknown-linux-gnu.  OK for trunk?
 
  Mikael
 
 
  noub_sext.CL
 
 
  2015-08-05  Mikael Morinmik...@gcc.gnu.org
 
  * hwint.h (sext_hwi): Rewrite without undefined behaviour on
  negative SRC.
 
  OK.  Hopefully most of the time the precision is known at compile-time 
  which
  would allow for optimization of the resulting code back to the
  pair-of-shifts form by combine.

 I think it is not.  The code also lacks a comment on why we do this kind
 of obfuscation.

 What kind of error does ubsan run into?  That is, for which 'prec'?

 See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67042

Ugh.  Stupid C.

My fear is that with so many stmts sext_hwi is no longer considered for
inlining, even if prec is constant.  What's the generated code for its
out-of-line copy with/without the patch?
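
For reference, one common UB-free formulation is the mask-and-xor idiom
below (a sketch of the general technique, not necessarily what the
committed patch does):

/* Sketch: sign-extend the low PREC bits of SRC using only unsigned
   arithmetic, so no negative value is ever left-shifted.  The final
   conversion back to the signed type relies on the modulo behaviour
   GCC guarantees for its host types.  */
static inline HOST_WIDE_INT
sext_hwi_sketch (HOST_WIDE_INT src, unsigned int prec)
{
  if (prec == HOST_BITS_PER_WIDE_INT)
    return src;
  unsigned HOST_WIDE_INT sign = (unsigned HOST_WIDE_INT) 1 << (prec - 1);
  unsigned HOST_WIDE_INT mask = (sign << 1) - 1;
  return ((src & mask) ^ sign) - sign;
}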

Richard.

 --
 Markus


[PATCH][2/2] Make SCCVN use conditional equivalences

2015-08-12 Thread Richard Biener

This brings FRE/PRE up to the same level as DOM in being able to
remove redundant conditionals.  It does so by inserting temporary
conditional expressions proved to be true on single predecessor
edges.
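
The kind of redundancy this catches looks like the following (an
illustrative sketch, not one of the new testcases verbatim):

/* Illustration only.  On the edge where a > b is true, related
   conditions such as a >= b and a != b are recorded as true as well,
   so the inner test can be folded away.  */
int
f (int a, int b)
{
  if (a > b)
    {
      if (a >= b)
        return 1;
    }
  return 0;
}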

I've had to do a lot of testcase adjustments, thus the patch is
now re-bootstrapping / testing on x86_64-unknown-linux-gnu.

Richard.

2015-08-12  Richard Biener  rguent...@suse.de

* tree-ssa-sccvn.c (vn_nary_op_compute_hash): Also canonicalize
comparison operand order and commutative ternary op operand order.
(sccvn_dom_walker::cond_stack): New state to track temporary
expressions.
(sccvn_dom_walker::after_dom_children): Remove tempoary expressions
no longer valid.
(sccvn_dom_walker::record_cond): Add a single temporary conditional
expression.
(sccvn_dom_walker::record_conds): Add a temporary conditional
expressions and all related expressions also true/false.
(sccvn_dom_walker::before_dom_children): Record temporary
expressions based on the controlling condition of a single
predecessor.  When trying to simplify a conditional statement
lookup expressions we might have inserted earlier.

* testsuite/g++.dg/tree-ssa/pr61034.C
* testsuite/gcc.dg/fold-compare-2.c
* testsuite/gcc.dg/pr50763.c
* testsuite/gcc.dg/predict-3.c
* testsuite/gcc.dg/tree-ssa/20030709-2.c
* testsuite/gcc.dg/tree-ssa/pr19831-3.c
* testsuite/gcc.dg/tree-ssa/pr20657.c
* testsuite/gcc.dg/tree-ssa/pr21001.c
* testsuite/gcc.dg/tree-ssa/pr37508.c
* testsuite/gcc.dg/tree-ssa/ssa-fre-47.c
* testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
* testsuite/gcc.dg/tree-ssa/ssa-fre-49.c
* testsuite/gcc.dg/tree-ssa/vrp04.c
* testsuite/gcc.dg/tree-ssa/vrp07.c
* testsuite/gcc.dg/tree-ssa/vrp09.c
* testsuite/gcc.dg/tree-ssa/vrp16.c
* testsuite/gcc.dg/tree-ssa/vrp20.c
* testsuite/gcc.dg/tree-ssa/vrp25.c
* testsuite/gcc.dg/tree-ssa/vrp87.c

Index: gcc/testsuite/g++.dg/tree-ssa/pr61034.C
===
--- gcc/testsuite/g++.dg/tree-ssa/pr61034.C (revision 226802)
+++ gcc/testsuite/g++.dg/tree-ssa/pr61034.C (working copy)
@@ -43,5 +43,5 @@ bool f(I a, I b, I c, I d) {
 // This works only if everything is inlined into 'f'.
 
 // { dg-final { scan-tree-dump-times ;; Function 1 fre2 } }
-// { dg-final { scan-tree-dump-times free 18 fre2 } }
+// { dg-final { scan-tree-dump-times free 10 fre2 } }
 // { dg-final { scan-tree-dump-times unreachable 11 fre2 } }
Index: gcc/testsuite/gcc.dg/fold-compare-2.c
===
--- gcc/testsuite/gcc.dg/fold-compare-2.c   (revision 226802)
+++ gcc/testsuite/gcc.dg/fold-compare-2.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fno-tree-tail-merge -fdump-tree-vrp1 } */
+/* { dg-options -O2 -fdump-tree-fre1 } */
 
 extern void abort (void);
 
@@ -15,5 +15,5 @@ main(void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times Removing basic block 2 vrp1 } } */
+/* { dg-final { scan-tree-dump-times Removing basic block 2 fre1 } } */
 
Index: gcc/testsuite/gcc.dg/pr50763.c
===
--- gcc/testsuite/gcc.dg/pr50763.c  (revision 226802)
+++ gcc/testsuite/gcc.dg/pr50763.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -ftree-tail-merge -fno-tree-dominator-opts 
-fdump-tree-pre } */
+/* { dg-options -O2 -ftree-tail-merge -fno-tree-dominator-opts } */
 
 int bar (int i);
 
@@ -11,5 +11,3 @@ foo (int c, int d)
   d = 33;
   while (c == d);
 }
-
-/* { dg-final { scan-tree-dump-times == 33 2 pre} } */
Index: gcc/testsuite/gcc.dg/predict-3.c
===
--- gcc/testsuite/gcc.dg/predict-3.c(revision 226802)
+++ gcc/testsuite/gcc.dg/predict-3.c(working copy)
@@ -12,6 +12,10 @@ void foo (int bound)
 {
   if (i  bound - 2)
global += bar (i);
+  /* The following test is redundant with the loop bound check in the
+     for stmt and thus eliminated by FRE which makes the controlled
+     stmt always executed and thus equivalent to 100%.  Thus the
+     heuristic only applies three times.  */
   if (i = bound)
global += bar (i);
   if (i + 1  bound)
@@ -21,4 +25,4 @@ void foo (int bound)
 }
 }
 
-/* { dg-final { scan-tree-dump-times loop iv compare heuristics: 100.0% 4 
profile_estimate} } */
+/* { dg-final { scan-tree-dump-times loop iv compare heuristics: 100.0% 3 
profile_estimate} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c  (revision 226802)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c  (working copy)
@@ 

Re: [RFC] Add check_effective_target_vect_min_max

2015-08-12 Thread Tom de Vries

On 12/08/15 10:51, Rainer Orth wrote:

Tom de Vries tom_devr...@mentor.com writes:


This follow-up patch introduces a new effective target vect_min_max,
similar to how effective target vect_bswap is implemented.

Any comments?

Thanks,
- Tom

Add check_effective_target_vect_min_max

2015-08-12  Tom de Vries  t...@codesourcery.com

* lib/target-supports.exp (check_effective_target_vect_min_max): New
proc.
* gcc.dg/vect/trapv-vect-reduc-4.c: Use vect_min_max effective target.


Looks good to me, but the new effective-target keyword needs documenting
in sourcebuild.texi.



Hmm, in sourcebuild.texi I found:
...
@item vect_no_int_max
Target does not support a vector max instruction on @code{int}.
...

That looks related. [ I also found a patch introducing vect_no_uint_max 
here: https://gcc.gnu.org/ml/gcc-patches/2010-01/msg00152.html. ]


I'm not sure where to take it from here. Should I introduce 
vect_no_int_min, and use that in combination with vect_no_int_max?


Thanks,
- Tom


[gomp4] Work around expand_GOACC_DIM_SIZE/expand_GOACC_DIM_POS ICEs (was: signed nums are better for dimensions)

2015-08-12 Thread Thomas Schwinge
Hi!

On Tue, 11 Aug 2015 13:38:34 -0400, Nathan Sidwell nat...@acm.org wrote:
 2) We really should not be getting to the expanders if there's nothing to 
 expand 
 to.  That's simply covering up lack of earlier handling.  That earlier 
 removal 
 gives optimizers more leeway.

 --- internal-fn.c (revision 226770)
 +++ internal-fn.c (working copy)

  static void
 -expand_GOACC_FORK (gcall *stmt ATTRIBUTE_UNUSED)
 +expand_GOACC_FORK (gcall *ARG_UNUSED (stmt))
  {
  #ifdef HAVE_oacc_fork
 [...]
 +#else
 +  gcc_unreachable ();
  #endif
  }
  
  static void
 -expand_GOACC_JOIN (gcall *stmt ATTRIBUTE_UNUSED)
 +expand_GOACC_JOIN (gcall *ARG_UNUSED (stmt))
  {
  #ifdef HAVE_oacc_join
 [...]
 +#else
 +  gcc_unreachable ();
  #endif
  }

  static void
 -expand_GOACC_DIM_SIZE (gcall *stmt)
 +expand_GOACC_DIM_SIZE (gcall *ARG_UNUSED (stmt))
  {
 +#ifdef HAVE_oacc_dim_size
 [...]
  #else
 -  emit_move_insn (target, const1_rtx);
 +  gcc_unreachable ();
  #endif
  }
  
  static void
 -expand_GOACC_DIM_POS (gcall *stmt)
 +expand_GOACC_DIM_POS (gcall *ARG_UNUSED (stmt))
  {
 +#ifdef HAVE_oacc_dim_pos
 [...]
  #else
 -  emit_move_insn (target, const0_rtx);
 +  gcc_unreachable ();
  #endif
  }

It's not an issue for expand_GOACC_FORK and expand_GOACC_JOIN, but
expand_GOACC_DIM_SIZE and expand_GOACC_DIM_POS then blow up when the
intelmic offloading compiler works through the OpenACC offloading code,
for example:


[...]/source-gcc/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/firstprivate-1.c:
 In function 'main._omp_fn.0':

[...]/source-gcc/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/firstprivate-1.c:18:11:
 internal compiler error: in expand_GOACC_DIM_POS, at internal-fn.c:2023
   #pragma acc parallel num_gangs (n) firstprivate (a)
   ^
0x90bce7 expand_GOACC_DIM_POS
[...]/source-gcc/gcc/internal-fn.c:2023
0x62945a expand_call_stmt
[...]/source-gcc/gcc/cfgexpand.c:2279
0x62945a expand_gimple_stmt_1
[...]/source-gcc/gcc/cfgexpand.c:3238
0x62945a expand_gimple_stmt
[...]/source-gcc/gcc/cfgexpand.c:3399
0x62a82d expand_gimple_basic_block
[...]/source-gcc/gcc/cfgexpand.c:5411
0x62fc86 execute
[...]/source-gcc/gcc/cfgexpand.c:6023
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
mkoffload-intelmic: fatal error: 
[...]/install/offload-x86_64-intelmicemul-linux-gnu/bin//x86_64-unknown-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc
 returned 1 exit status

Admittedly, compiling OpenACC offloading code for Intel MIC doesn't make
a lot of sense (currently), but it is what's being done, and has caused a
lot of regressions in my testing, so I committed the following workaround
to gomp-4_0-branch in r226804.  A different approach would have been to
invent some machinery to not compile OpenACC offloading code on the Intel
MIC offloading path, but as the latter eventually is to support OpenACC
offloading, too (see
http://news.gmane.org/find-root.php?message_id=%3C20141112111207.GW5026%40tucnak.redhat.com%3E,
for example), this seemed like a step into the wrong direction.
Eventually, this will need to be fixed differently/properly -- have to
implement oacc_dim_size and oacc_dim_pos for Intel MIC offloading, or can
we have a generic fallback solution not specific to the offloading
target?

commit 8a3187f17e13bd45e630ff8a587c1fc7086abece
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Aug 12 09:56:59 2015 +

Work around expand_GOACC_DIM_SIZE/expand_GOACC_DIM_POS ICEs

Followup to r226783.

gcc/
* internal-fn.c: Include builtins.h.
(expand_GOACC_DIM_SIZE, expand_GOACC_DIM_POS): Instead of
gcc_unreachable, expand_builtin_trap.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@226804 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |6 ++
 gcc/internal-fn.c  |5 +++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index ea254c5..044fdf7 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-08-12  Thomas Schwinge  tho...@codesourcery.com
+
+   * internal-fn.c: Include builtins.h.
+   (expand_GOACC_DIM_SIZE, expand_GOACC_DIM_POS): Instead of
+   gcc_unreachable, expand_builtin_trap.
+
 2015-08-10  Nathan Sidwell  nat...@acm.org
 
* tree-ssa-phiopt.c (minmax_replacement): Create new ssa name if
diff --git gcc/internal-fn.c gcc/internal-fn.c
index 8b8c6e1..70bffd4 100644
--- gcc/internal-fn.c
+++ gcc/internal-fn.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include stringpool.h
 #include tree-ssanames.h
 #include diagnostic-core.h
+#include builtins.h
 
 /* The names of each internal 

Re: [Aarch64][1/3] Use atomic compare-and-swap instructions when available.

2015-08-12 Thread James Greenhalgh
On Wed, Aug 12, 2015 at 11:12:29AM +0100, Matthew Wahab wrote:
 ARMv8.1 adds instructions for atomic compare-and-swap with optional
 memory ordering specifiers. This patch series adds the instructions to
 GCC, making them available with -march=armv8.1-a or with
 -march=armv8-a+lse, and using them to implement the __sync and __atomic
 builtins.
 
 This patch adds an internal TARGET_LSE macro, to check target support
 for the atomic instructions.
 
 Subsequent patches
 - add and use atomic compare-and-swap instructions;
 - add tests for the compare-and-swap;
 
 Tested the series for aarch64-none-linux-gnu with native bootstrap and make
 check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
 aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
 ARMv8.1.
 
 Ok for trunk?
 Matthew

OK.

Thanks,
James

 
 2015-08-12  Matthew Wahab  matthew.wa...@arm.com
 
   * config/aarch64/aarch64.h (AARCH64_ISA_LSE): New.
   (TARGET_LSE): New.



Re: [Aarch64][3/3] Add tests for atomic compare-and-swap instructions.

2015-08-12 Thread Andrew Pinski
On Wed, Aug 12, 2015 at 6:16 PM, Matthew Wahab
matthew.wa...@foss.arm.com wrote:
 ARMv8.1 adds instructions for atomic compare-and-swap with optional
 memory ordering specifiers. This patch adds tests for the
 compare-and-swap instructions as two files. The first is support code to
 run the test with a range of types and memory models, the second is the
 test for the CAS instruction.

 Tested the series for aarch64-none-linux-gnu with native bootstrap and make
 check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
 aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
 ARMv8.1.

 Ok for trunk?


I noticed you did not change the other cas testcases so they don't set
lse.  This can cause those testcases to fail if a cpu defaults to
having LSE.
I am going to be modifying a cpu to default to LSE enabled soonish.

Thanks,
Andrew

 Matthew

 gcc/testsuite
 2015-08-12  Matthew Wahab  matthew.wa...@arm.com

 * gcc.target/aarch64/atomic-inst-cas.c: New.
 * gcc.target/aarch64/atomic-inst-ops.inc: New.



Re: [PATCH] Don't create superfluous parm in expand_omp_taskreg

2015-08-12 Thread Richard Biener
On Tue, 11 Aug 2015, Tom de Vries wrote:

 [ was: Re: [committed, gomp4] Fix release_dangling_ssa_names ]
 
 On 05/08/15 13:13, Richard Biener wrote:
  On Wed, 5 Aug 2015, Tom de Vries wrote:
  
   On 05/08/15 11:30, Richard Biener wrote:
On Wed, 5 Aug 2015, Tom de Vries wrote:

 On 05/08/15 09:29, Richard Biener wrote:
   This patch fixes that by making sure we reset the def stmt to
   NULL.
   This
   means
we can simplify release_dangling_ssa_names to just test for NULL
def
   stmts.
  Not sure if I understand the problem correctly but why are you not
  simply
  releasing the SSA name when you remove its definition?
 
 In move_sese_region_to_fn we move a region of blocks from one function
 to
 another, bit by bit.
 
 When we encounter an ssa_name as def or use in the region, we:
 - generate a new ssa_name,
 - set the def stmt of the old name as def stmt of the new name, and
 - add a mapping from the old to the new name.
 The next time we encounter the same ssa_name in another statement, we
 find
 it
 in the map.
 
 If we release the old ssa name, we effectively create statements with
 operands
 in the free-list. The first point where that cause breakage, is in
 walk_gimple_op, which expects the TREE_TYPE of the lhs of an assign to
 be
 defined, which is not the case if it's in the free-list:
 ...
 case GIMPLE_ASSIGN:
      /* Walk the RHS operands.  If the LHS is of a non-renamable type or
         is a register variable, we may use a COMPONENT_REF on the RHS.  */
      if (wi)
        {
          tree lhs = gimple_assign_lhs (stmt);
          wi->val_only
            = (is_gimple_reg_type (TREE_TYPE (lhs)) && !is_gimple_reg (lhs))
              || gimple_assign_rhs_class (stmt) != GIMPLE_SINGLE_RHS;
        }
 ...

Hmm, ok, probably because the stmt moving doesn't happen in DOM
order (move defs before uses).  But

   
   There seems to be similar code for the rhs, so I don't think changing the
   order would fix anything.
   
+
+  if (!SSA_NAME_IS_DEFAULT_DEF (name))
+    /* The statement has been moved to the child function.  It no longer
+       defines name in the original function.  Mark the def stmt NULL, and
+       let release_dangling_ssa_names deal with it.  */
+    SSA_NAME_DEF_STMT (name) = NULL;

applies also to uses - I don't see why it couldn't happen that you
move a use but not its def (the def would be a parameter to the
split-out function).  You'd wreck the IL of the source function this
way.

   
   If you first move a use, you create a mapping. When you encounter the def,
   you
   use the mapping. Indeed, if the def is a default def, we don't encounter
   the
   def. Which is why we create a nop as defining def for those cases. The
   default
   def in the source function still has a defining nop, and has no uses
   anymore.
   I don't understand what is broken here.
  
  If you never encounter the DEF then it's broken.  Say, if for
  
  foo(int a)
  {
 int b = a;
 if (b)
   {
  code using b 
   }
  }
  
  you move  code using b  to a function.  Then the def is still in
  foo but you create a mapping for its use(s).  Clearly the outlining
  process in this case has to pass b as parameter to the outlined
  function, something that may not happen currently.
  
 
 Ah, I see. Indeed, this is a situation that is assumed not to happen.
 
  It would probably be cleaner to separate the def and use remapping
  to separate functions and record on whether we saw a def or not.
  
 
 Right, or some other means to detect this situation, say when copying the def
 stmt in replace_ssa_name, check whether it's part of the sese region.
 
I think that the whole dance of actually moving things instead of
just copying it isn't worth the extra maintenance (well, if we already
have a machinery duplicating a SESE region to another function - I
suppose gimple_duplicate_sese_region could be trivially changed to
support that).

   
   I'll mention that as todo. For now, I think the fastest way to get a
   working
   version is to fix move_sese_region_to_fn.
  
  Sure.
  
Trunk doesn't have release_dangling_ssa_names it seems
   
   Yep, I only ran into this trouble for the kernels region handling. But I
   don't
   exclude the possibility it could happen for trunk as well.
   
but I think
it belongs to move_sese_region_to_fn and not to omp-low.c
   
   Makes sense indeed.
   
and it
could also just walk the d-vars_map replace_ssa_name fills to
iterate over the removal candidates
   
   Agreed, I suppose in general that's a win over iterating over all the ssa
   names.
   
(and if the situation of
moving uses but not defs cannot happen you don't need any

[committed] Better location for atomic bit-fields

2015-08-12 Thread Marek Polacek
A trivial patch to use LOC for atomic bit-fields error messages.
I haven't bothered with a testcase this time.

Bootstrapped/regtested on x86_64-linux, applying to trunk.

2015-08-12  Marek Polacek  pola...@redhat.com

* c-decl.c (grokdeclarator): Call error_at instead of error and pass
LOC to it.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 0d7aa3f..b83c584 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -6233,9 +6233,9 @@ grokdeclarator (const struct c_declarator *declarator,
   if (type_quals  TYPE_QUAL_ATOMIC)
{
  if (name)
-   error (bit-field %qE has atomic type, name);
+   error_at (loc, bit-field %qE has atomic type, name);
  else
-   error (bit-field has atomic type);
+   error_at (loc, bit-field has atomic type);
  type_quals = ~TYPE_QUAL_ATOMIC;
}
 }

Marek


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Earnshaw
On 12/08/15 09:43, Richard Earnshaw wrote:
 On 12/08/15 09:32, Richard Earnshaw wrote:
 On 12/08/15 02:11, Richard Henderson wrote:
 I'm somewhat surprised that the operands to the logicals aren't
 visible at rtl generation time, given all the work done in gimple.
 And failing that, combine has enough REG_EQUAL notes that it ought
 to be able to put things back together and see the simpler pattern.


 We've tried it in the past.  Exposing the individual steps prevents the
 higher-level rtl-based optimizations since they can no-longer deal with
 the complete sub-expression.
 
 Eg. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724
 
 R.
 
And https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65768

R.


Re: [RFC] Add check_effective_target_vect_min_max

2015-08-12 Thread Rainer Orth
Tom de Vries tom_devr...@mentor.com writes:

 On 12/08/15 10:51, Rainer Orth wrote:
 Tom de Vries tom_devr...@mentor.com writes:

 This follow-up patch introduces a new effective target vect_min_max,
 similar to how effective target vect_bswap is implemented.

 Any comments?

 Thanks,
 - Tom

 Add check_effective_target_vect_min_max

 2015-08-12  Tom de Vries  t...@codesourcery.com

 * lib/target-supports.exp (check_effective_target_vect_min_max): New
 proc.
 * gcc.dg/vect/trapv-vect-reduc-4.c: Use vect_min_max effective target.

 Looks good to me, but the new effective-target keyword needs documenting
 in sourcebuild.texi.


 Hmm, in sourcebuild.texi I found:
 ...
 @item vect_no_int_max
 Target does not support a vector max instruction on @code{int}.
 ...

 That looks related. [ I also found a patch introducing vect_no_uint_max
 here: https://gcc.gnu.org/ml/gcc-patches/2010-01/msg00152.html. ]

 I'm not sure where to take it from here. Should I introduce
 vect_no_int_min, and use that in combination with vect_no_int_max?

I'd say this is something for the vectorizer maintainers to decide.  Richi?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[Aarch64][2/3] Use the atomic compare-and-swap instructions when available.

2015-08-12 Thread Matthew Wahab

ARMv8.1 adds instructions for atomic compare-and-swap with optional memory
ordering specifiers. This patch adds the compare-and-swap instructions and
changes the atomic_compare_and_swap patterns to use them

The changes to the atomic_compare_and_swap pattern makes the new
instructions available as an alternative for the existing expander, to
be used when the target supports them. The existing expander is reused
so that it can generate code needed to meet the requirements of the
atomic_compare_and_swap name.

Using the atomic CAS instructions, the code generated for a call to
__atomic_compare_exchange (ptr, expected, desired, weak, smodel, fmodel)
becomes:

  mov r, r1
  cassmosz r, r2, [r0]
  cmp r, r1
  cset r0, eq
  cbnz r0, L
  strb r, [r1]
L:
  ret

where
  r0 = ptr, r1 = *expected, r2 = *desired,
  r is some temporary.
  mo is one of {'', 'a', 'l', 'al'}, depending on smodel
  sz is one of {'', 'b', 'h'} depending on the data size.
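
For reference, a generic builtin call of the shape the sequence above
implements (a usage sketch, not taken from the patch or its tests):

/* Usage sketch: word-sized strong compare-and-swap via the GCC builtin;
   with -march=armv8-a+lse this can now expand to a single CAS.  */
#include <stdbool.h>
#include <stdint.h>

bool
try_update (uint32_t *ptr, uint32_t *expected, uint32_t desired)
{
  return __atomic_compare_exchange_n (ptr, expected, desired,
                                      false /*weak*/,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}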

Tested the series for aarch64-none-linux-gnu with native bootstrap and make
check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
ARMv8.1.

Ok for trunk?
Matthew

2015-08-12  Matthew Wahab  matthew.wa...@arm.com

* config/aarch64/aarch64-protos.h
(aarch64_gen_atomic_cas): Declare.
* config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
Choose appropriate instruction pattern for the target.
(aarch64_gen_atomic_cas): New.
* config/aarch64/atomics.md (UNSPECV_ATOMIC_CAS): New.
(atomic_compare_and_swap<mode>_1): Rename to
aarch64_compare_and_swap<mode>.  Fix some indentation.
(aarch64_compare_and_swap<mode>_lse): New.
(aarch64_atomic_cas<mode>): New.

From a84d8f8202a30fca18ed79e617d5ba3422eb021f Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Mon, 10 Aug 2015 16:59:18 +0100
Subject: [PATCH 2/3] Add and use atomic CAS instruction.

Change-Id: Ie40f345d414fc9dc6c8cbac0eb9457547aa9ec2d
---
 gcc/config/aarch64/aarch64-protos.h |   1 +
 gcc/config/aarch64/aarch64.c|  66 ++--
 gcc/config/aarch64/atomics.md   | 117 +---
 3 files changed, 172 insertions(+), 12 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 32b5d09..0b09d49 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -362,6 +362,7 @@ rtx aarch64_load_tp (rtx);
 
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
+void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6da7245..259e049 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10749,7 +10749,23 @@ aarch64_expand_compare_and_swap (rtx operands[])
 {
   rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
   machine_mode mode, cmp_mode;
-  rtx (*gen) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*gen_cas_fn) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  int idx;
+  gen_cas_fn gen;
+  const gen_cas_fn split_cas[] =
+  {
+gen_aarch64_compare_and_swapqi,
+gen_aarch64_compare_and_swaphi,
+gen_aarch64_compare_and_swapsi,
+gen_aarch64_compare_and_swapdi
+  };
+  const gen_cas_fn atomic_cas[] =
+  {
+gen_aarch64_compare_and_swapqi_lse,
+gen_aarch64_compare_and_swaphi_lse,
+gen_aarch64_compare_and_swapsi_lse,
+gen_aarch64_compare_and_swapdi_lse
+  };
 
   bval = operands[0];
   rval = operands[1];
@@ -10794,13 +10810,17 @@ aarch64_expand_compare_and_swap (rtx operands[])
 
   switch (mode)
 {
-case QImode: gen = gen_atomic_compare_and_swapqi_1; break;
-case HImode: gen = gen_atomic_compare_and_swaphi_1; break;
-case SImode: gen = gen_atomic_compare_and_swapsi_1; break;
-case DImode: gen = gen_atomic_compare_and_swapdi_1; break;
+case QImode: idx = 0; break;
+case HImode: idx = 1; break;
+case SImode: idx = 2; break;
+case DImode: idx = 3; break;
 default:
   gcc_unreachable ();
 }
+  if (TARGET_LSE)
+gen = atomic_cas[idx];
+  else
+gen = split_cas[idx];
 
   emit_insn (gen (rval, mem, oldval, newval, is_weak, mod_s, mod_f));
 
@@ -10829,6 +10849,42 @@ aarch64_emit_post_barrier (enum memmodel model)
 }
 }
 
+/* Emit an atomic compare-and-swap operation.  RVAL is the destination register
+   for the data in memory.  EXPECTED is the value expected to be in memory.
+   DESIRED is the value to store to memory.  MEM is the memory location.  MODEL
+   is the memory ordering to use.  */
+
+void
+aarch64_gen_atomic_cas (rtx rval, rtx mem,
+			rtx expected, rtx desired,
+			rtx model)
+{
+  rtx (*gen) (rtx, rtx, rtx, rtx);
+ 

[patch] Update libstdc++ status docs and contribution policies.

2015-08-12 Thread Jonathan Wakely

Committed to trunk.
commit 099d855a8ad7b655662e8f9c36cfb998541f501d
Author: Jonathan Wakely jwak...@redhat.com
Date:   Tue Jul 21 13:26:07 2015 +0100

	* doc/xml/manual/appendix_contributing.xml: Update patch email policy.
	* doc/xml/manual/status_cxx2017.xml: Update status table.
	* doc/html/manual/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index 2c3b9fb..d7df13c 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -184,7 +184,8 @@
 	  When you have all these pieces, bundle them up in a
 	  mail message and send it to libstd...@gcc.gnu.org. All
 	  patches and related discussion should be sent to the
-	  libstdc++ mailing list.
+	  libstdc++ mailing list. In common with the rest of GCC,
+	  patches should also be sent to the gcc-patches mailing list.
 	</para>
   </listitem>
 </itemizedlist>
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index c0bca84..b625364 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -81,14 +81,13 @@ not in any particular release.
 </row>
 
 <row>
-  <?dbhtml bgcolor="#C8B0B0" ?>
   <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4089.pdf">
 	  N4089
 	</link>
   </entry>
   <entry>Safe conversions in <code>unique_ptr&lt;T[]&gt;</code> </entry>
-  <entry>N</entry>
+  <entry>Y</entry>
   <entry/>
 </row>
 
@@ -116,14 +115,13 @@ not in any particular release.
 </row>
 
 <row>
-  <?dbhtml bgcolor="#C8B0B0" ?>
   <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4279.html">
 	  N4279
 	</link>
   </entry>
   <entry>Improved insertion interface for unique-key maps</entry>
-  <entry>N</entry>
+  <entry>Y</entry>
   <entry/>
 </row>
 
@@ -147,32 +145,30 @@ not in any particular release.
   </entry>
   <entry>A proposal to add invoke function template</entry>
   <entry>N</entry>
-  <entry/>
+  <entry>In progress</entry>
 </row>
 
 <row>
-  <?dbhtml bgcolor="#C8B0B0" ?>
   <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4280.pdf">
 	  N4280
 	</link>
   </entry>
   <entry>Non-member size() and more</entry>
-  <entry>N</entry>
+  <entry>Y</entry>
   <entry/>
 </row>
 
 <row>
-  <?dbhtml bgcolor="#C8B0B0" ?>
   <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4366.html">
-	  N4387
+	  N4366
 	</link>
   </entry>
   <entry>
 LWG 2228: Missing SFINAE rule in unique_ptr templated assignment
   </entry>
-  <entry>N</entry>
+  <entry>Y</entry>
   <entry/>
 </row>
 
@@ -314,14 +310,13 @@ not in any particular release.
 </row>
 
 <row>
-  <?dbhtml bgcolor="#C8B0B0" ?>
   <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4391.html">
 	  N4391
 	</link>
   </entry>
   <entry> <code>make_array</code>, revision 4 </entry>
-  <entry>N</entry>
+  <entry>Y (not committed yet)</entry>
   <entry>Library Fundamentals 2 TS</entry>
 </row>
 


RE: [PATCH][MIPS] Scheduler fix for the 74k 24k.

2015-08-12 Thread Robert Suchanek
Hi,

  Simon
 
  gcc/
  * config/mips/mips.c (mips_store_data_bypass_p): Bring code into
  line with comments.
  * config/mips/sb1.md: Update usage of mips_store_data_bypass_p.
 
 
 This patch is OK.

Committed on Simon's behalf as r226805.

Regards,
Robert


Re: [Aarch64][3/3] Add tests for atomic compare-and-swap instructions.

2015-08-12 Thread Matthew Wahab

On 12/08/15 11:51, Andrew Pinski wrote:

On Wed, Aug 12, 2015 at 6:16 PM, Matthew Wahab
matthew.wa...@foss.arm.com wrote:

ARMv8.1 adds instructions for atomic compare-and-swap with optional
memory ordering specifiers. This patch adds tests for the
compare-and-swap instructions as two files. The first is support code to
run the test with a range of types and memory models, the second is the
test for the CAS instruction.


I noticed you did not change the other cas testcases so they don't set
lse.  This can cause those testcases to fail if a cpu defaults to
having LSE.
I am going to be modifying a cpu to default to LSE enabled soonish.



I'll see what I can do.
Matthew


[Aarch64][3/3] Add tests for atomic compare-and-swap instructions.

2015-08-12 Thread Matthew Wahab

ARMv8.1 adds instructions for atomic compare-and-swap with optional
memory ordering specifiers. This patch adds tests for the
compare-and-swap instructions as two files. The first is support code to
run the test with a range of types and memory models, the second is the
test for the CAS instruction.

Tested the series for aarch64-none-linux-gnu with native bootstrap and make
check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
ARMv8.1.

Ok for trunk?
Matthew

gcc/testsuite
2015-08-12  Matthew Wahab  matthew.wa...@arm.com

* gcc.target/aarch64/atomic-inst-cas.c: New.
* gcc.target/aarch64/atomic-inst-ops.inc: New.

From c72302f2a0bc4d95a0b933e54332d551295040bf Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Mon, 3 Aug 2015 18:10:37 +0100
Subject: [PATCH 3/3] Add tests for CAS instruction.

Change-Id: I42a46c2f81f1200a893620ba96323ce785873e8d
---
 gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c | 61 ++
 .../gcc.target/aarch64/atomic-inst-ops.inc | 53 +++
 2 files changed, 114 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc

diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
new file mode 100644
index 000..f6f2892
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8-a+lse" } */
+
+/* Test ARMv8.1-A CAS instruction.  */
+
+#include "atomic-inst-ops.inc"
+
+#define TEST TEST_TWO
+
+#define CAS_ATOMIC(FN, TY, MODEL1, MODEL2)\
+  int FNNAME (FN, TY) (TY* val, TY* foo, TY* bar)			\
+  {	\
+int model_s = MODEL1;		\
+int model_f = MODEL2;		\
+/* The success memory ordering must be at least as strong as	\
+   the failure memory ordering.  */	\
+if (model_s < model_f)		\
+  return 0;\
+/* Ignore invalid memory orderings.  */\
+if (model_f == __ATOMIC_RELEASE || model_f == __ATOMIC_ACQ_REL)	\
+  return 0;\
+return __atomic_compare_exchange_n (val, foo, bar, 0, model_s, model_f); \
+  }
+
+#define CAS_ATOMIC_NORETURN(FN, TY, MODEL1, MODEL2)			\
+  void FNNAME (FN, TY) (TY* val, TY* foo, TY* bar)			\
+  {	\
+int model_s = MODEL1;		\
+int model_f = MODEL2;		\
+/* The success memory ordering must be at least as strong as	\
+   the failure memory ordering.  */	\
+if (model_s < model_f)		\
+  return;\
+/* Ignore invalid memory orderings.  */\
+if (model_f == __ATOMIC_RELEASE || model_f == __ATOMIC_ACQ_REL)	\
+  return;\
+__atomic_compare_exchange_n (val, foo, bar, 0, model_s, model_f);	\
+  }
+
+TEST (cas_atomic, CAS_ATOMIC)
+TEST (cas_atomic_noreturn, CAS_ATOMIC_NORETURN)
+
+
+/* { dg-final { scan-assembler-times "casb\t" 4} } */
+/* { dg-final { scan-assembler-times "casab\t" 20} } */
+/* { dg-final { scan-assembler-times "caslb\t" 4} } */
+/* { dg-final { scan-assembler-times "casalb\t" 36} } */
+
+/* { dg-final { scan-assembler-times "cash\t" 4} } */
+/* { dg-final { scan-assembler-times "casah\t" 20} } */
+/* { dg-final { scan-assembler-times "caslh\t" 4} } */
+/* { dg-final { scan-assembler-times "casalh\t" 36} } */
+
+/* { dg-final { scan-assembler-times "cas\t" 8} } */
+/* { dg-final { scan-assembler-times "casa\t" 40} } */
+/* { dg-final { scan-assembler-times "casl\t" 8} } */
+/* { dg-final { scan-assembler-times "casal\t" 72} } */
+
+/* { dg-final { scan-assembler-not "ldaxr\t" } } */
+/* { dg-final { scan-assembler-not "stlxr\t" } } */
+/* { dg-final { scan-assembler-not "dmb" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc
new file mode 100644
index 000..72c7e5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc
@@ -0,0 +1,53 @@
+/* Support code for atomic instruction tests.  */
+
+/* Define type names without spaces.  */
+typedef unsigned char uchar;
+typedef unsigned short ushort;
+typedef unsigned int uint;
+typedef long long longlong;
+typedef unsigned long long ulonglong;
+typedef __int128_t int128;
+typedef __uint128_t uint128;
+
+#define FNNAME(NAME,TY) NAME
+
+/* Expand one-model functions.  */
+#define TEST_M1(NAME, FN, TY, MODEL, DUMMY)	\
+  FN (test_##NAME##_##TY, TY, MODEL)
+
+/* Expand two-model functions.  */
+#define TEST_M2(NAME, FN, TY, MODEL1, MODEL2)	\
+  FN (test_##NAME##_##TY, TY, MODEL1, MODEL2)
+
+/* Types to test.  */
+#define TEST_TY(NAME, FN, N, MODEL1, MODEL2)		\
+  TEST_M##N (NAME, FN, char, MODEL1, MODEL2)		\
+  TEST_M##N (NAME, FN, uchar, MODEL1, MODEL2)		\
+  TEST_M##N (NAME, FN, short, MODEL1, MODEL2)		\
+  TEST_M##N (NAME, FN, ushort, MODEL1, MODEL2)		\
+  TEST_M##N (NAME, FN, int, 

Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Richard Biener
On Tue, Aug 11, 2015 at 9:49 PM, Jeff Law l...@redhat.com wrote:
 On 08/06/2015 04:25 AM, Mikael Morin wrote:

 Hello,

 this avoids an error found with bootstrap-ubsan.
 Regression tested on x86_64-unknown-linux-gnu.  OK for trunk?

 Mikael


 noub_sext.CL


 2015-08-05  Mikael Morinmik...@gcc.gnu.org

 * hwint.h (sext_hwi): Rewrite without undefined behaviour on
 negative SRC.

 OK.  Hopefully most of the time the precision is known at compile-time which
 would allow for optimization of the resulting code back to the
 pair-of-shifts form by combine.

I think it is not.  The code also lacks a comment on why we do this kind
of obfuscation.

What kind of error does ubsan run into?  That is, for which 'prec'?

Richard.




 jeff



Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Markus Trippelsdorf
On 2015.08.12 at 13:01 +0200, Richard Biener wrote:
 On Tue, Aug 11, 2015 at 9:49 PM, Jeff Law l...@redhat.com wrote:
  On 08/06/2015 04:25 AM, Mikael Morin wrote:
 
  Hello,
 
  this avoids an error found with bootstrap-ubsan.
  Regression tested on x86_64-unknown-linux-gnu.  OK for trunk?
 
  Mikael
 
 
  noub_sext.CL
 
 
  2015-08-05  Mikael Morinmik...@gcc.gnu.org
 
  * hwint.h (sext_hwi): Rewrite without undefined behaviour on
  negative SRC.
 
  OK.  Hopefully most of the time the precision is known at compile-time which
  would allow for optimization of the resulting code back to the
  pair-of-shifts form by combine.
 
 I think it is not.  The code also lacks a comment on why we do this kind
 of obfuscation.
 
 What kind of error does ubsan run into?  That is, for which 'prec'?

See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67042

-- 
Markus


Re: [Aarch64][2/3] Use the atomic compare-and-swap instructions when available.

2015-08-12 Thread James Greenhalgh
On Wed, Aug 12, 2015 at 11:15:25AM +0100, Matthew Wahab wrote:
 ARMv8.1 adds instructions for atomic compare-and-swap with optional memory
 ordering specifiers. This patch adds the compare-and-swap instructions and
 changes the atomic_compare_and_swap patterns to use them
 
 The changes to the atomic_compare_and_swap pattern makes the new
 instructions available as an alternative for the existing expander, to
 be used when the target supports them. The existing expander is reused
 so that it can generate code needed to meet the requirements of the
 atomic_compare_and_swap name.
 
 Using the atomic CAS instructions, the code generated for a call to
 __atomic_compare_exchange (ptr, expected, desired, weak, smodel, fmodel)
 becomes:
 
mov r, r1
 cas<mo><sz> r, r2, [r0]
cmp r, r1
cset r0, eq
cbnz r0, L
strb r, [r1]
 L:
ret
 
 where
r0 = ptr, r1 = *expected, r2 = *desired,
r is some temporary.
mo is one of {'', 'a', 'l', 'al'}, depending on smodel
sz is one of {'', 'b', 'h'} depending on the data size.
 
 Tested the series for aarch64-none-linux-gnu with native bootstrap and make
 check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
 aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
 ARMv8.1.
 
 Ok for trunk?
 Matthew

OK, but please fix up the testcases Andrew mentioned for those configuring
their toolchains/test runs with a default -mcpu or -march which targets
these instructions. I would just hardwire them
(with dg-additional-options) to -march=armv8-a.

Thanks,
James

 2015-08-12  Matthew Wahab  matthew.wa...@arm.com
 
   * config/aarch64/aarch64-protos.h
   (aarch64_gen_atomic_cas): Declare.
   * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
   Choose appropriate instruction pattern for the target.
   (aarch64_gen_atomic_cas): New.
   * config/aarch64/atomics.md (UNSPECV_ATOMIC_CAS): New.
   (atomic_compare_and_swap<mode>_1): Rename to
   aarch64_compare_and_swap<mode>.  Fix some indentation.
   (aarch64_compare_and_swap<mode>_lse): New.
   (aarch64_atomic_cas<mode>): New.
 



Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Wilco Dijkstra
Richard Henderson wrote:
 However, the way that aarch64 and alpha have done it hasn't
 been ideal, in that there's a fairly costly search that must
 be done every time.  I've thought before about changing this
 so that we would be able to cache results, akin to how we do
 it in expmed.c for multiplication.
 
 I've implemented such a caching scheme for three targets, as
 a test of how much code could be shared.  The answer appears
 to be about 100 lines of boiler-plate.  Minimal, true, but it
 may still be worth it as a way of encouraging backends to do
 similar things in a similar way.

However it also creates new dependencies that may not be desirable
(such as hash table size, algorithm used etc).

 Some notes about ppc64 in particular:
 
   * Constants aren't split until quite late, preventing all hope of
 CSE'ing portions of the generated code.  My gut feeling is that
 this is in general a mistake, but...

Late split is best in general as you want to CSE the original constants,
not parts of the expansion (which would be very rarely possible).

   * This is the only platform for which I bothered collecting any sort
 of performance data:
 
 As best I can tell, there is a 9% improvement in bootstrap speed
 for ppc64.  That is, 10 minutes off the original 109 minute build.
 
 For aarch64 and alpha, I simply assumed there would be no loss,
 since the basic search algorithm is unchanged for each.
 
 Comments?  Especially on the shared header?

I'm not convinced the amount of code that could be shared is enough to be
worthwhile. Also the way it is written makes the immediate generation more 
complex and likely consuming a lot of memory (each cached immediate requires
at least 64 bytes). It is not obvious to me why it is a good idea to hide the
simple/fast cases behind the hashing scheme - it seems better that the backend
explicitly decides which cases should be cached.

I looked at the statistics of AArch64 immediate generation a while ago. 
The interesting thing is ~95% of calls are queries, and the same query is on 
average repeated 10 times in a row. So (a) it is not important to cache the 
expansions, and (b) the high repetition rate means a single-entry cache
has a 90% hitrate. We already have a patch for this and could collect stats
comparing the approaches. If a single-entry cache can provide a similar 
benefit as caching all immediates then my preference would be to keep things
simple and just cache the last query.
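
(A rough sketch of the single-entry cache idea, not the patch referred to
above; compute_imm_cost is a made-up name for the expensive search.)

/* Sketch only: memoise the most recent immediate query.  */
static HOST_WIDE_INT cached_val;
static int cached_cost = -1;

static int
imm_cost_cached (HOST_WIDE_INT val)
{
  if (cached_cost < 0 || val != cached_val)
    {
      cached_val = val;
      cached_cost = compute_imm_cost (val);	/* hypothetical helper */
    }
  return cached_cost;
}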

Note the many repeated queries indicate a performance issue at a much higher 
level (repeated cost queries on the same unchanged RTL), and solving that 
problem will likely improve buildtime for all targets.

Wilco




Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Tom Tromey
Jeff> In the past this has stalled on issues like how will asynch-exceptions
Jeff> be tested and the like.

It seems to me that either there is some other language which needs this
-- in which case that language ought to have testing for the feature --
or the feature is only used by gcj, in which case it doesn't matter.

Of course is!=ought; but relying on gcj and libjava to provide this
small amount of testing seems like a bad cost/benefit tradeoff.

Tom


Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Jeff Law

On 08/12/2015 10:24 AM, Ian Lance Taylor wrote:

On Wed, Aug 12, 2015 at 9:21 AM, Tom Tromey t...@tromey.com wrote:

Jeff> In the past this has stalled on issues like how will asynch-exceptions
Jeff> be tested and the like.

It seems to me that either there is some other language which needs this
-- in which case that language ought to have testing for the feature --
or the feature is only used by gcj, in which case it doesn't matter.

Of course is!=ought; but relying on gcj and libjava to provide this
small amount of testing seems like a bad cost/benefit tradeoff.


Go does use asynchronous exceptions, and has test cases that rely on
them working.
If you're comfortable with Go at this point and we have mechanisms in 
place to ensure Go only gets built on platforms that support Go, then I 
think we should go forward with replacing GCJ with Go.


Jeff


Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Ian Lance Taylor
On Wed, Aug 12, 2015 at 7:57 AM, Andrew Haley a...@redhat.com wrote:
 On 12/08/15 15:44, Jeff Law wrote:
 My inclination is to replace GCJ with Go, but Ian wasn't comfortable
 with that when I suggested it a couple years ago.

 Because Go wasn't ready for prime time?

I don't remember why I wasn't comfortable with Go a couple of years
ago, but I think it would be OK as a default language today.  I doubt
it runs on as many platforms as gcj, though I would certainly be happy
to see more portability patches.  In particular, it doesn't currently
run on Windows or Darwin.

Ian


Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Ian Lance Taylor
On Wed, Aug 12, 2015 at 9:21 AM, Tom Tromey t...@tromey.com wrote:
 Jeff> In the past this has stalled on issues like how will asynch-exceptions
 Jeff> be tested and the like.

 It seems to me that either there is some other language which needs this
 -- in which case that language ought to have testing for the feature --
 or the feature is only used by gcj, in which case it doesn't matter.

 Of course is!=ought; but relying on gcj and libjava to provide this
 small amount of testing seems like a bad cost/benefit tradeoff.

Go does use asynchronous exceptions, and has test cases that rely on
them working.

Ian


Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Jeff Law

On 08/12/2015 07:33 AM, Mikael Morin wrote:

Le 12/08/2015 13:09, Richard Biener a écrit :

On Wed, Aug 12, 2015 at 1:07 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:

On 2015.08.12 at 13:01 +0200, Richard Biener wrote:

What kind of error does ubsan run into?  That is, for which 'prec'?


See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67042


Ugh.  Stupid C.

My fear is that with so many stmts sext_hwi isn't any longer
considered for
inlining, even if prec is constant.  What's the generated code for its
out-of line
copy with/without the patch?



The difference on x86_64 is:

.cfi_startproc
cmpl$64, %esi
je.L3
-movl$64, %ecx
-subl%esi, %ecx
-salq%cl, %rdi
+leal-1(%rsi), %ecx
+movl$1, %eax
+movq%rax, %rdx
+salq%cl, %rdx
+movl%esi, %ecx
+salq%cl, %rax
+subq$1, %rax
+andq%rax, %rdi
+xorq%rdx, %rdi
movq%rdi, %rax
-sarq%cl, %rax
+subq%rdx, %rax
ret
.p2align 4,,10
.p2align 3

with -O2, tests attached.
Yes it's worse.  I thought it was better to avoid undefined behaviour at
all prices to avoid bad surprises.
Well, this isn't the right test.  You're testing when PREC is an unknown 
and I fully expect in that case the code you're going to get will be 
worse.  There's too many insns for combine to save us in that case.


What's interesting here is what happens when prec is a known value 
(since we're hoping that's the common case via inlining).


When prec has a known value of 8, 16 or 32 we get a nice movs[bwl]q on 
x86_64, which is exactly what we want.


When prec is another known value, say 13 for the sake of argument, we 
get a pair of shifts, which is again exactly what we want.


So when prec is a constant, we're going to get good code.  So the only 
question left is whether or not prec is usually a constant or not in the 
contexts in which this routine is used.
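
(Sketch only: with prec folded to a constant such as 16 after inlining, the
whole of sext_hwi is equivalent to a plain sign extension, which is where the
movs[bwl]q comes from.)

HOST_WIDE_INT
sext16 (HOST_WIDE_INT src)
{
  return (HOST_WIDE_INT) (short) src;	/* movswq on x86_64 */
}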



Jeff


Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation

2015-08-12 Thread Richard Henderson
On 08/12/2015 08:59 AM, Wilco Dijkstra wrote:
 I looked at the statistics of AArch64 immediate generation a while ago. 
 The interesting thing is ~95% of calls are queries, and the same query is on 
 average repeated 10 times in a row. So (a) it is not important to cache the 
 expansions, and (b) the high repetition rate means a single-entry cache
 has a 90% hitrate. We already have a patch for this and could collect stats
 comparing the approaches. If a single-entry cache can provide a similar 
 benefit as caching all immediates then my preference would be to keep things
 simple and just cache the last query.

Interesting.  That's already more detailed investigation than I'd done.  I had
no idea the queries were so clustered.  I assumed that the queries would be
scattered across various passes, and so the various constants across the
function would get checked in sequence.

I would be very interested in seeing those stats when you've done.


r~


[PATCH][RTL-ifcvt] Allow PLUS+immediate expression in noce_try_store_flag_constants

2015-08-12 Thread Kyrill Tkachov

Hi all,

This patch is a sequel to:
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02567.html

It allows if-conversion of expressions of the form:
if (test) x := y + c1; else x := y + c2
where c1 and c2 are a pair of constants that can be optimised using the
existing rules in noce_try_store_flag_constants.

The resulting sequence looks something like:
x := (test [!=,==] 0) + y;
x := x + [c1,c2];

This patch reuses the logic for the combinations of diff and STORE_FLAG signs 
that
is hammered out in https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02567.html.

That means we just extract the constants c1 and c2 from the PLUS rtx
and use their diff as the driver for the logic in that function.

Currently, this patch allows only the case where
diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE
as it is the most beneficial case (from looking at SPEC2006 on aarch64)
and also the least tricky to get right.
In the future I suppose that restriction can be relaxed if needed, but in any 
case
this should be an improvement over the existing situation.

On aarch64 this allows us for code:
int
barsi (int x)
{
  return x > 100 ? x + 4 : x + 3;
}

to generate:
barsi:
cmp w0, 101
cinc w0, w0, ge
add w0, w0, 3
ret

instead of the current:
barsi:
add w1, w0, 4
add w2, w0, 3
cmp w0, 101
csel w0, w2, w1, lt
ret

thus saving one instruction and using two fewer registers.

Bootstrapped and tested on arm, aarch64, x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-08-12  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* ifcvt.c (noce_try_store_flag_constants): Handle PLUS-immediate
expressions in A and B.

2015-08-12  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/cinc_common_1.c: New test.
commit ce0de8d30962bf00339dc3a030401a4be2a7a248
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu Jul 30 15:14:27 2015 +0100

[ifcvt][unfinished] Allow PLUS+immediate expression in noce_try_store_flag_constants

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index f815166..7a44e29 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1198,17 +1198,35 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   HOST_WIDE_INT itrue, ifalse, diff, tmp;
   int normalize;
   bool can_reverse;
-  machine_mode mode;
+  machine_mode mode = GET_MODE (if_info->x);;
+  rtx common = NULL_RTX;
 
   if (!noce_simple_bbs (if_info))
 return FALSE;
 
-  if (CONST_INT_P (if_info->a)
-      && CONST_INT_P (if_info->b))
+  rtx a = if_info->a;
+  rtx b = if_info->b;
+
+  /* Handle cases like x := test ? y + 3 : y + 4.  */
+  if (GET_CODE (a) == PLUS
+      && GET_CODE (b) == PLUS
+      && CONST_INT_P (XEXP (a, 1))
+      && CONST_INT_P (XEXP (b, 1))
+      && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
+      && noce_operand_ok (XEXP (a, 0))
+      && if_info->branch_cost >= 2)
 {
-  mode = GET_MODE (if_info->x);
-  ifalse = INTVAL (if_info->a);
-  itrue = INTVAL (if_info->b);
+  common = XEXP (a, 0);
+
+  a = XEXP (a, 1);
+  b = XEXP (b, 1);
+}
+
+  if (CONST_INT_P (a)
+      && CONST_INT_P (b))
+{
+  ifalse = INTVAL (a);
+  itrue = INTVAL (b);
   bool subtract_flag_p = false;
 
   diff = (unsigned HOST_WIDE_INT) itrue - ifalse;
@@ -1241,6 +1259,11 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	{
 	  reversep = can_reverse;
 	  subtract_flag_p = !can_reverse;
+	  /* If we need to subtract the flag and we have PLUS-immediate
+		 A and B then it is unlikely to be beneficial to play tricks
+		 here.  */
+	  if (subtract_flag_p && common)
+		return FALSE;
 	}
 	  /* test ? 3 : 4
 	 = can_reverse  | 3 + (test == 0)
@@ -1249,6 +1272,11 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	{
 	  reversep = can_reverse;
 	  subtract_flag_p = !can_reverse;
+	  /* If we need to subtract the flag and we have PLUS-immediate
+		 A and B then it is unlikely to be beneficial to play tricks
+		 here.  */
+	  if (subtract_flag_p && common)
+		return FALSE;
 	}
 	  /* test ? 4 : 3
 	 = 4 + (test != 0).  */
@@ -1290,6 +1318,15 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	}
 
   start_sequence ();
+
+  /* If we have x := test ? x + 3 : x + 4 then move the original
+	 x out of the way while we store flags.  */
+  if (common && rtx_equal_p (common, if_info->x))
+	{
+	  common = gen_reg_rtx (mode);
+	  noce_emit_move_insn (common, if_info->x);
+	}
+
   target = noce_emit_store_flag (if_info, if_info->x, reversep, normalize);
   if (! target)
 	{
@@ -1301,13 +1338,27 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	 =   x = 3 + (test == 0);  */
   if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
 	{
+	  /* Add the common part now.  This may allow combine to merge this
+	 with the store flag operation earlier into some sort of conditional
+	 increment/decrement if the target allows it.  */
+	  if 

Re: [C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Jeff Law

On 08/12/2015 09:39 AM, Marek Polacek wrote:

This patch fixes a defect in -Wshift-overflow.  We should only warn
about left-shifting 1 into the sign bit when -Wshift-overflow=2.  But
this doesn't apply only for 1 << 31, but also for 2 << 30, etc.
In C++14, never warn about this.
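
(Illustrative only, assuming 32-bit int; not taken from the new tests:)

int a = 1 << 31;	/* 1 shifted into the sign bit: warn only at -Wshift-overflow=2 */
int b = 2 << 30;	/* the set bit likewise lands in the sign bit, so same treatment */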

Neither existing tests nor documentation require updating, I think.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-08-12  Marek Polacek  pola...@redhat.com

PR c++/55095
* c-common.c (maybe_warn_shift_overflow): Properly handle
left-shifting 1 into the sign bit.

* c-c++-common/Wshift-overflow-6.c: New test.
* c-c++-common/Wshift-overflow-7.c: New test.
* g++.dg/cpp1y/left-shift-2.C: New test.
I didn't realize C++14 fixed this bit of lameness.  I'd read that it was 
under consideration when I was looking at Mikael's sext_hwi patch, but 
didn't follow up on the current status.


OK for the trunk.

jeff


Re: [PATCH, libjava/classpath]: Fix overriding recipe for target 'gjdoc' build warning

2015-08-12 Thread Ian Lance Taylor
On Wed, Aug 12, 2015 at 9:47 AM, Jeff Law l...@redhat.com wrote:

 If you're comfortable with Go at this point and we have mechanisms in place
 to ensure Go only gets built on platforms that support Go, then I think we
 should go forward with replacing GCJ with Go.

We have the mechanism for disabling Go on systems where it will not
work.  However, the list of such systems is undoubtedly incomplete at
this time.

In the top level configure.ac:

# Disable the go frontend on systems where it is known to not work. Please keep
# this in sync with contrib/config-list.mk.
case ${target} in
*-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix*)
unsupported_languages="$unsupported_languages go"
;;
esac

# Disable libgo for some systems where it is known to not work.
# For testing, you can easily override this with --enable-libgo.
if test x$enable_libgo = x; then
case ${target} in
*-*-darwin*)
# PR 46986
noconfigdirs="$noconfigdirs target-libgo"
;;
*-*-cygwin* | *-*-mingw*)
noconfigdirs="$noconfigdirs target-libgo"
;;
*-*-aix*)
noconfigdirs="$noconfigdirs target-libgo"
;;
esac
fi


Re: [PATCH 06/15] rs6000: Use rldiwi in constant construction

2015-08-12 Thread Richard Henderson
On 08/12/2015 07:02 AM, Segher Boessenkool wrote:
 On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
 @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
if (exam_mask (-1, c, sub_budget))
  return true;
  
 +  /* If the two halves are equal, use an insert.  */
 +  if (c >> 32 == test && exam_sub (test, sub_budget))
 +{
 +  opN (VEC_DUPLICATE, 0xu); /* RLDIMI */
 +  return true;
 +}
 
 Does this work for c with the high bit set?  I think you need
 to cast it to unsigned HOST_WIDE_INT first?

Indeed, a sign-extension works better.  It means the base constant will use
LIS+ORIS without trying to create an unsigned version.

If you're talking about ubsan sort of restrictions on shifting signed
constants...  I choose to totally ignore that.  Certainly nowhere else in gcc
has been audited for that, beginning with hwint.h itself.


r~


Re: [PATCH 01/15] rs6000: Split out rs6000_is_valid_and_mask_wide

2015-08-12 Thread Richard Henderson
On 08/12/2015 06:23 AM, Segher Boessenkool wrote:
 On Tue, Aug 11, 2015 at 06:11:30PM -0700, Richard Henderson wrote:
 This allows testing for a mask without having to call GEN_INT.

 Cc: David Edelsohn dje@gmail.com
 ---
  * config/rs6000/rs6000.c (rs6000_is_valid_mask_wide): Split out from...
  (rs6000_is_valid_mask): ... here.
  (rs6000_is_valid_and_mask_wide): Split out from...
  (rs6000_is_valid_and_mask): ... here.
 
 I don't like these _wide names much.

It follows the existing practice within the backend.

  You could overload the shorter
 name, if you really think creating some garbage const_int's is too much
 overhead (it might well be if you use it a lot more in later patches).

At one stage in the development (before I became much leaner with the search
for rotate), it really really mattered.

 -bool
 -rs6000_is_valid_mask (rtx mask, int *b, int *e, machine_mode mode)
 +static bool
 +rs6000_is_valid_mask_wide (unsigned HOST_WIDE_INT val, int *b, int *e, int 
 n)
 
 But why change the mode parameter?  The code was clearer before.

So that we don't have to look up GET_MODE_BITSIZE (mode).

 +static bool
 +rs6000_is_valid_and_mask_wide (unsigned HOST_WIDE_INT val, machine_mode 
 mode)
  {
int nb, ne;
  
 -  if (!rs6000_is_valid_mask (mask, &nb, &ne, mode))
 -return false;
 +  switch (mode)
 +{
 +case DImode:
 +  if (!rs6000_is_valid_mask_wide (val, &nb, &ne, 64))
 +return false;
 +  /* For DImode, we need a rldicl, rldicr, or a rlwinm with
 + mask that does not wrap.  */
 +  return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
  
 -  /* For DImode, we need a rldicl, rldicr, or a rlwinm with mask that
 - does not wrap.  */
 -  if (mode == DImode)
 -return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
 +case SImode:
 +  if (!rs6000_is_valid_mask_wide (val, &nb, &ne, 32))
 +return false;
 +  /* For SImode, rlwinm can do everything.  */
 +  return (nb < 32 && ne < 32);
  
 -  /* For SImode, rlwinm can do everything.  */
 -  if (mode == SImode)
 -return (nb < 32 && ne < 32);
 +default:
 +  return false;
 +}
 +}
  
 -  return false;
 
 You don't need any of these changes then, either.

True, not *needed* per-se, but if you look closer I'm combining conditionals.
I think the replacement here is clearer.

/* Otherwise, fill in the lowest hole; if we can do the result with
   one insn, we can do the whole thing with two.  */
 -  unsigned HOST_WIDE_INT val = INTVAL (c);
 +  unsigned HOST_WIDE_INT val = UINTVAL (c);
 
 Does it matter?

No.


r~



Re: [PATCH] xtensa: add -mauto-litpools option

2015-08-12 Thread augustine.sterl...@gmail.com
On Tue, Aug 11, 2015 at 6:09 PM, Max Filippov jcmvb...@gmail.com wrote:
 With support from the assembler this option allows compiling huge functions,
 where single literal pool at the beginning of a function may not be
 reachable by L32R instructions at its end.

 Currently assembler --auto-litpools option cannot deal with literals
 used from multiple locations separated by more than 256 KBytes of code.
 Don't turn constants into literals, instead use MOVI instruction to load
 them into registers and let the assembler turn them into literals as
 necessary.

 2015-08-11  Max Filippov  jcmvb...@gmail.com
 gcc/
 * config/xtensa/constraints.md (define_constraint Y): New
 constraint.
 * config/xtensa/elf.h (ASM_SPEC): Add m(no-)auto-litpools.
 * config/xtensa/linux.h (ASM_SPEC): Likewise.
 * config/xtensa/predicates.md (move_operand): Match constants
 and symbols in the presence of TARGET_AUTO_LITPOOLS.
 * config/xtensa/xtensa.c (xtensa_valid_move): Don't allow
 immediate references to TLS data.
 (xtensa_emit_move_sequence): Don't force constants to memory in
 the presence of TARGET_AUTO_LITPOOLS.
 (print_operand): Add 'y' format, same as default, but capable of
 printing SF mode constants as well.
 * config/xtensa/xtensa.md (movsi_internal, movhi_internal)
 (movsf_internal): Add movi pattern that loads literal.
 (movsf, movdf): Don't force constants to memory in the presence
 of TARGET_AUTO_LITPOOLS.
 (movdf_internal): Add 'Y' constraint.
 * config/xtensa/xtensa.opt (mauto-litpools): New option.
 * doc/invoke.texi (Xtensa options): Document -mauto-litpools.

If this is OK with the linux people, it is OK with me. As I recall,
they used to have a need to keep literals in page-level groups, but my
memory is hazy.


Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Richard Biener
On August 12, 2015 7:02:04 PM GMT+02:00, Jeff Law l...@redhat.com wrote:
On 08/12/2015 07:33 AM, Mikael Morin wrote:
 Le 12/08/2015 13:09, Richard Biener a écrit :
 On Wed, Aug 12, 2015 at 1:07 PM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:
 On 2015.08.12 at 13:01 +0200, Richard Biener wrote:
 What kind of error does ubsan run into?  That is, for which
'prec'?

 See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67042

 Ugh.  Stupid C.

 My fear is that with so many stmts sext_hwi isn't any longer
 considered for
 inlining, even if prec is constant.  What's the generated code for
its
 out-of line
 copy with/without the patch?


 The difference on x86_64 is:

 .cfi_startproc
 cmpl$64, %esi
 je.L3
 -movl$64, %ecx
 -subl%esi, %ecx
 -salq%cl, %rdi
 +leal-1(%rsi), %ecx
 +movl$1, %eax
 +movq%rax, %rdx
 +salq%cl, %rdx
 +movl%esi, %ecx
 +salq%cl, %rax
 +subq$1, %rax
 +andq%rax, %rdi
 +xorq%rdx, %rdi
 movq%rdi, %rax
 -sarq%cl, %rax
 +subq%rdx, %rax
 ret
 .p2align 4,,10
 .p2align 3

 with -O2, tests attached.
 Yes it's worse.  I thought it was better to avoid undefined behaviour
at
 all prices to avoid bad surprises.
Well, this isn't the right test.  You're testing when PREC is an
unknown 
and I fully expect in that case the code you're going to get will be 
worse.  There's too many insns for combine to save us in that case.

What's interesting here is what happens when prec is a known value 
(since we're hoping that's the common case via inlining).

When prec has a known value of 8, 16 or 32 we get a nice movs[bwl]q on 
x86_64, which is exactly what we want.

When prec is another known value, say 13 for the sake of argument, we 
get a pair of shifts, which is again exactly what we want.

So when prec is a constant, we're going to get good code.  So the only 
question left is whether or not prec is usually a constant or not in
the 
contexts in which this routine is used.

Prec is almost never a constant and is heavily used from wide-int.

We are not exploiting this undefinedness in C so I object to making this so 
much slower.

Can we instead do what we do for abs_hwi and add a checking assert so we can 
move the tests to the callers that misbehave instead?

Thanks,
Richard.


Jeff




Re: [gomp4] Work around expand_GOACC_DIM_SIZE/expand_GOACC_DIM_POS ICEs

2015-08-12 Thread Nathan Sidwell

On 08/12/15 08:46, Nathan Sidwell wrote:

On 08/12/15 06:21, Thomas Schwinge wrote:

Hi!

  OpenACC offloading code for Intel MIC doesn't make

a lot of sense (currently), but it is what's being done, and has caused a
lot of regressions in my testing, so I committed the following workaround
to gomp-4_0-branch in r226804.  A different approach would have been to
invent some machinery to not compile OpenACC offloading code on the Intel
MIC offloading path, but as the latter eventually is to support OpenACC
offloading, too (see


The right solution is probably for the default validated_dims to set the size to
unity, just like the host.  Thereby forcing any backend that wants to provide a
larger size to override that hook.

I.e. remove the #ifndef ACCEL_COMPILER from default_validate_dims in omp-low


for avoidance of doubt, I'll add that to the patch I have in progress.

nathan

--
Nathan Sidwell - Director, Sourcery Services - Mentor Embedded


Re: [gomp4] Work around expand_GOACC_DIM_SIZE/expand_GOACC_DIM_POS ICEs

2015-08-12 Thread Nathan Sidwell

On 08/12/15 06:21, Thomas Schwinge wrote:

Hi!

 OpenACC offloading code for Intel MIC doesn't make

a lot of sense (currently), but it is what's being done, and has caused a
lot of regressions in my testing, so I committed the following workaround
to gomp-4_0-branch in r226804.  A different approach would have been to
invent some machinery to not compile OpenACC offloading code on the Intel
MIC offloading path, but as the latter eventually is to support OpenACC
offloading, too (see


The right solution is probably for the default validated_dims to set the size to 
unity, just like the host.  Thereby forcing any backend that wants to provide a 
larger size to override that hook.


I.e. remove the #ifndef ACCEL_COMPILER from default_validate_dims in omp-low
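
(A rough sketch of what that would mean, borrowing the int-array interface
from the dimension-API patch later in this batch; this is not the committed
change, just the idea.)

static bool
default_goacc_validate_dims_sketch (tree decl ATTRIBUTE_UNUSED, int dims[])
{
  bool changed = false;

  /* Force every dimension to 1, host and accel compiler alike, so a target
     that wants more must override the hook.  */
  for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
    if (dims[ix] != 1)
      {
	dims[ix] = 1;
	changed = true;
      }

  return changed;
}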

nathan
--
Nathan Sidwell


Re: [Aarch64][3/3] Add tests for atomic compare-and-swap instructions.

2015-08-12 Thread James Greenhalgh
On Wed, Aug 12, 2015 at 11:16:41AM +0100, Matthew Wahab wrote:
 ARMv8.1 adds instructions for atomic compare-and-swap with optional
 memory ordering specifiers. This patch adds tests for the
 compare-and-swap instructions as two files. The first is support code to
 run the test with a range of types and memory models, the second is the
 test for the CAS instruction.
 
 Tested the series for aarch64-none-linux-gnu with native bootstrap and make
 check and for aarch64-none-elf with cross-compiled check-gcc. Also tested
 aarch64-none-elf with cross-compiled check-gcc on an emulator that supports
 ARMv8.1.
 
 Ok for trunk?
 Matthew

OK, but please write a follow-up to fix the pre-existing tests.

Thanks,
James

 
 gcc/testsuite
 2015-08-12  Matthew Wahab  matthew.wa...@arm.com
 
   * gcc.target/aarch64/atomic-inst-cas.c: New.
   * gcc.target/aarch64/atomic-inst-ops.inc: New.
 



Re: [PATCH] PR target/67127: [ARM] Avoiding odd-number ldrd/strd in movdi introduced a regression on armeb-linux-gnueabihf

2015-08-12 Thread Yvan Roux
On 11 August 2015 at 12:28, Ramana Radhakrishnan
ramana.radhakrish...@foss.arm.com wrote:

 Yes in big-endian DI mode value are stored into VFP registers, and
 here register 16 is the first of them s0.  Just in case you want to do
 more test, the issue can be seen with a oneline testcase:

 __attribute__((__vector_size__(2 * sizeof(int)))) int fn1() {}

 Yep we may well have DImode values in the VFP register bank.


 I observe that FIRST_VIRTUAL_REGISTER is 104, whereas LAST_ARM_REG is 15. So
 it might be that the pattern should check against the latter instead of the
 former - as arm_hard_regno_mode_ok does...

 yes, when checking that the operand register number is lower than or
 equal to LAST_ARM_REGNUM the infinite loop is avoided.  I haven't
 run a full validation so far, but it has the same effect as
 checking that the changing mode is authorized.  If you think that this
 check makes more sense, I can rerun a full validation.



 It sounds to me like checking against LAST_ARM_REGNUM is the better approach. 
 The additional test in the movdi expander was to prevent illegitimate ldrd / 
 strd instructions from being generated.

 Thus, OK if there are no regressions - Please give it some time to bake on 
 trunk before backporting to FSF 5 and / or do some additional validation 
 (including the regression testsuite) before doing the backport.

No regressions on trunk for both armeb-linux-gnueabihf and
arm-linux-gnueabihf, thus I've committed it as r226811. I'll validate
the backport to FSF 5 before committing it.

Thanks
Yvan



 regards
 Ramana



 Thanks,
 Yvan


 yes


 --Alan



Re: [PATCH 01/15] rs6000: Split out rs6000_is_valid_and_mask_wide

2015-08-12 Thread Segher Boessenkool
On Tue, Aug 11, 2015 at 06:11:30PM -0700, Richard Henderson wrote:
 This allows testing for a mask without having to call GEN_INT.
 
 Cc: David Edelsohn dje@gmail.com
 ---
   * config/rs6000/rs6000.c (rs6000_is_valid_mask_wide): Split out from...
   (rs6000_is_valid_mask): ... here.
   (rs6000_is_valid_and_mask_wide): Split out from...
   (rs6000_is_valid_and_mask): ... here.

I don't like these _wide names much.  You could overload the shorter
name, if you really think creating some garbage const_int's is too much
overhead (it might well be if you use it a lot more in later patches).

The original functions really want rtx's since they are used like
predicates (so should look and behave like one); rs6000_is_valid_mask
itself is different (and a lousy name; suggestions welcome).

 -bool
 -rs6000_is_valid_mask (rtx mask, int *b, int *e, machine_mode mode)
 +static bool
 +rs6000_is_valid_mask_wide (unsigned HOST_WIDE_INT val, int *b, int *e, int n)

But why change the mode parameter?  The code was clearer before.

  {
 -  unsigned HOST_WIDE_INT val = INTVAL (mask);
unsigned HOST_WIDE_INT bit;
int nb, ne;
 -  int n = GET_MODE_PRECISION (mode);
  
 -  if (mode != DImode && mode != SImode)
 -return false;
 -
 -  if (INTVAL (mask) >= 0)
 +  if ((HOST_WIDE_INT)val >= 0)
^ missing space

  {
bit = val & -val;
ne = exact_log2 (bit);
 @@ -16430,27 +16427,54 @@ rs6000_is_valid_mask (rtx mask, int *b, int *e, 
 machine_mode mode)
return true;
  }
  
 +bool
 +rs6000_is_valid_mask (rtx mask, int *b, int *e, machine_mode mode)
 +{
 +  int n;
 +
 +  if (mode == DImode)
 +n = 64;
 +  else if (mode == SImode)
 +n = 32;
 +  else
 +return false;
 +
 +  unsigned HOST_WIDE_INT val = INTVAL (mask);
 +  return rs6000_is_valid_mask_wide (val, b, e, n);
 +}
 +
  /* Return whether MASK (a CONST_INT) is a valid mask for any rlwinm, rldicl,
 or rldicr instruction, to implement an AND with it in mode MODE.  */
  
 -bool
 -rs6000_is_valid_and_mask (rtx mask, machine_mode mode)
 +static bool
 +rs6000_is_valid_and_mask_wide (unsigned HOST_WIDE_INT val, machine_mode mode)
  {
int nb, ne;
  
 -  if (!rs6000_is_valid_mask (mask, &nb, &ne, mode))
 -return false;
 +  switch (mode)
 +{
 +case DImode:
 +  if (!rs6000_is_valid_mask_wide (val, &nb, &ne, 64))
 + return false;
 +  /* For DImode, we need a rldicl, rldicr, or a rlwinm with
 +  mask that does not wrap.  */
 +  return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
  
 -  /* For DImode, we need a rldicl, rldicr, or a rlwinm with mask that
 - does not wrap.  */
 -  if (mode == DImode)
 -return (ne == 0 || nb == 63 || (nb < 32 && ne <= nb));
 +case SImode:
 +  if (!rs6000_is_valid_mask_wide (val, &nb, &ne, 32))
 + return false;
 +  /* For SImode, rlwinm can do everything.  */
 +  return (nb < 32 && ne < 32);
  
 -  /* For SImode, rlwinm can do everything.  */
 -  if (mode == SImode)
 -return (nb < 32 && ne < 32);
 +default:
 +  return false;
 +}
 +}
  
 -  return false;

You don't need any of these changes then, either.

 +bool
 +rs6000_is_valid_and_mask (rtx mask, machine_mode mode)
 +{
 +  return rs6000_is_valid_and_mask_wide (UINTVAL (mask), mode);
  }
  
  /* Return the instruction template for an AND with mask in mode MODE, with
 @@ -16739,12 +16763,12 @@ rs6000_is_valid_2insn_and (rtx c, machine_mode mode)
  
/* Otherwise, fill in the lowest hole; if we can do the result with
   one insn, we can do the whole thing with two.  */
 -  unsigned HOST_WIDE_INT val = INTVAL (c);
 +  unsigned HOST_WIDE_INT val = UINTVAL (c);

Does it matter?


Segher


[gomp4] dimension API

2015-08-12 Thread Nathan Sidwell
I've committed this patch to gomp4.  It reworks the oacc function attribute
dimension handling.  Rather than pass the TREE_LIST to the various hooks, I
convert it to a regular C array of ints.  That makes life simpler for the
consumers.  They return a 'changed' flag to indicate whether the attribute
should be rewritten.  That rewriting is done in a way that doesn't presume the
attribute is unshared (Cesar, your workaround should no longer be necessary).


Also put in the change I mentioned earlier this morning about the default
validate dims hook setting the dimensions to 1 on accelerators too.  I'll revert
Thomas's patch shortly.


nathan
2015-08-12  Nathan Sidwell  nat...@codesourcery.com

	* target.def (validate_dims): Adjust API.
	* targhooks.h (default_goacc_validate_dims): Adjust.
	* omp-low.c (replace_oacc_fn_attrib): New function.
	(set_oacc_fn_attrib): Use it.
	(oacc_xform_dim): Dims is array of ints.
	(execute_oacc_transform): Create int array of dims, adjust uses.
	(default_goacc_validate_dims): Adjust API.  Force to w everywhere.
	* doc/tm.texi: Rebuild.
	* config/nvptx/nvptx.c (nvptx_validate_dims): Adjust API.

Index: gcc/config/nvptx/nvptx.c
===
--- gcc/config/nvptx/nvptx.c	(revision 226808)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -3538,71 +3538,46 @@ nvptx_expand_builtin (tree exp, rtx targ
   return d-expander (exp, target, mode, ignore);
 }
 
-/* Validate compute dimensions, fill in defaults.  */
-
-static tree
-nvptx_validate_dims (tree decl, tree dims)
-{
-  tree adims[GOMP_DIM_MAX];
-  unsigned ix;
-  tree *pos_ptr;
-
-  for (ix = 0, pos_ptr = dims; ix != GOMP_DIM_MAX;
-   ix++, pos_ptr = TREE_CHAIN (*pos_ptr))
-{
-  if (!*pos_ptr)
-	*pos_ptr = tree_cons (NULL_TREE, NULL_TREE, NULL_TREE);
-  
-  adims[ix] = TREE_VALUE (*pos_ptr);
-}
-
-  /* Define vector size for known hardware.  */
+/* Define vector size for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
 
+/* Validate compute dimensions, fill in non-unity defaults.  */
+
+static bool
+nvptx_validate_dims (tree decl, int dims[])
+{
+  bool changed = false;
+
   /* If the worker size is not 1, the vector size must be 32.  If
  the vector  size is not 1, it must be 32.  */
-  if ((adims[GOMP_DIM_WORKER]
-       && TREE_INT_CST_LOW (adims[GOMP_DIM_WORKER]) != 1)
-  || (adims[GOMP_DIM_VECTOR]
-	   && TREE_INT_CST_LOW (adims[GOMP_DIM_VECTOR]) != 1))
+  if ((dims[GOMP_DIM_WORKER] > 1 || dims[GOMP_DIM_WORKER] == 0)
+  || (dims[GOMP_DIM_VECTOR] > 1 || dims[GOMP_DIM_VECTOR] == 0))
 {
-  if (!adims[GOMP_DIM_VECTOR]
-	  || TREE_INT_CST_LOW (adims[GOMP_DIM_VECTOR]) != PTX_VECTOR_LENGTH)
+  if (dims[GOMP_DIM_VECTOR] != PTX_VECTOR_LENGTH)
 	{
-	  tree use = build_int_cst (integer_type_node, PTX_VECTOR_LENGTH);
-	  if (adims[GOMP_DIM_VECTOR])
+	  if (dims[GOMP_DIM_VECTOR] >= 0)
 	warning_at (DECL_SOURCE_LOCATION (decl), 0,
-			TREE_INT_CST_LOW (adims[GOMP_DIM_VECTOR])
-			? "using vector_length (%E), ignoring %E"
-			: "using vector_length (%E), ignoring runtime setting",
-			use, adims[GOMP_DIM_VECTOR]);
-	  adims[GOMP_DIM_VECTOR] = use;
+			dims[GOMP_DIM_VECTOR]
+			? "using vector_length (%d), ignoring %d"
+			: "using vector_length (%d), ignoring runtime setting",
+			PTX_VECTOR_LENGTH, dims[GOMP_DIM_VECTOR]);
+	  dims[GOMP_DIM_VECTOR] = PTX_VECTOR_LENGTH;
+	  changed = true;
 	}
 }
 
   /* Check the num workers is not too large.  */
-  if (adims[GOMP_DIM_WORKER]
-      && TREE_INT_CST_LOW (adims[GOMP_DIM_WORKER]) > PTX_WORKER_LENGTH)
+  if (dims[GOMP_DIM_WORKER] > PTX_WORKER_LENGTH)
 {
-  tree use = build_int_cst (integer_type_node, PTX_WORKER_LENGTH);
   warning_at (DECL_SOURCE_LOCATION (decl), 0,
-		  "using num_workers (%E), ignoring %E",
-		  use, adims[GOMP_DIM_WORKER]);
-  adims[GOMP_DIM_WORKER] = use;
+		  "using num_workers (%d), ignoring %d",
+		  PTX_WORKER_LENGTH, dims[GOMP_DIM_WORKER]);
+  dims[GOMP_DIM_WORKER] = PTX_WORKER_LENGTH;
+  changed = true;
 }
 
-  /* Set defaults.  */
-  for (ix = 0; ix != GOMP_DIM_MAX; ix++)
-if (!adims[ix])
-  adims[ix] = integer_one_node;
-
-  /* Write results.  */
-  tree pos;
-  for (ix = 0, pos = dims; ix != GOMP_DIM_MAX; ix++, pos = TREE_CHAIN (pos))
-TREE_VALUE (pos) = adims[ix];
-
-  return dims;
+  return changed;
 }
 
 /* Return maximum dimension size, or zero for unbounded.  */
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi	(revision 226808)
+++ gcc/doc/tm.texi	(working copy)
@@ -5740,11 +5740,12 @@ usable.  In that case, the smaller the n
 to use it.
 @end deftypefn
 
-@deftypefn {Target Hook} tree TARGET_GOACC_VALIDATE_DIMS (tree, @var{tree})
+@deftypefn {Target Hook} bool TARGET_GOACC_VALIDATE_DIMS (tree, int @var{[]})
 This hook should check the launch dimensions provided.  It should fill
-in default values and verify non-defaults.  The 

Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Mikael Morin

Le 12/08/2015 13:09, Richard Biener a écrit :

On Wed, Aug 12, 2015 at 1:07 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:

On 2015.08.12 at 13:01 +0200, Richard Biener wrote:

What kind of error does ubsan run into?  That is, for which 'prec'?


See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67042


Ugh.  Stupid C.

My fear is that with so many stmts sext_hwi isn't any longer considered for
inlining, even if prec is constant.  What's the generated code for its
out-of line
copy with/without the patch?



The difference on x86_64 is:

.cfi_startproc
cmpl$64, %esi
je  .L3
-   movl$64, %ecx
-   subl%esi, %ecx
-   salq%cl, %rdi
+   leal-1(%rsi), %ecx
+   movl$1, %eax
+   movq%rax, %rdx
+   salq%cl, %rdx
+   movl%esi, %ecx
+   salq%cl, %rax
+   subq$1, %rax
+   andq%rax, %rdi
+   xorq%rdx, %rdi
movq%rdi, %rax
-   sarq%cl, %rax
+   subq%rdx, %rax
ret
.p2align 4,,10
.p2align 3

with -O2, tests attached.
Yes it's worse.  I thought it was better to avoid undefined behaviour at 
all prices to avoid bad surprises.


Mikael





typedef signed long long HOST_WIDE_INT;

#define HOST_BITS_PER_WIDE_INT (8 * sizeof(HOST_WIDE_INT))

#define HOST_WIDE_INT_C(X) X ## LL
#define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
#define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)
#define HOST_WIDE_INT_1U HOST_WIDE_INT_UC (1)
#define HOST_WIDE_INT_M1 HOST_WIDE_INT_C (-1)
#define HOST_WIDE_INT_M1U HOST_WIDE_INT_UC (-1)

#define gcc_checking_assert(test)

HOST_WIDE_INT
sext_hwi (HOST_WIDE_INT src, unsigned int prec)
{
  if (prec == HOST_BITS_PER_WIDE_INT)
return src;
  else
{
  gcc_checking_assert (prec < HOST_BITS_PER_WIDE_INT);
  int shift = HOST_BITS_PER_WIDE_INT - prec;
  return (src << shift) >> shift;
}
}





typedef signed long long HOST_WIDE_INT;

#define HOST_BITS_PER_WIDE_INT (8 * sizeof(HOST_WIDE_INT))

#define HOST_WIDE_INT_C(X) X ## LL
#define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
#define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)
#define HOST_WIDE_INT_1U HOST_WIDE_INT_UC (1)
#define HOST_WIDE_INT_M1 HOST_WIDE_INT_C (-1)
#define HOST_WIDE_INT_M1U HOST_WIDE_INT_UC (-1)

#define gcc_checking_assert(test)

HOST_WIDE_INT
sext_hwi (HOST_WIDE_INT src, unsigned int prec)
{
  if (prec == HOST_BITS_PER_WIDE_INT)
return src;
  else
{
  gcc_checking_assert (prec < HOST_BITS_PER_WIDE_INT);
  HOST_WIDE_INT sign_mask = HOST_WIDE_INT_1 << (prec - 1);
  HOST_WIDE_INT value_mask = (HOST_WIDE_INT_1U << prec) - HOST_WIDE_INT_1U;
  return (((src  value_mask) ^ sign_mask) - sign_mask);
}
}





Re: [PATCH 06/15] rs6000: Use rldiwi in constant construction

2015-08-12 Thread Segher Boessenkool
On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
 @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
if (exam_mask (-1, c, sub_budget))
  return true;
  
 +  /* If the two halves are equal, use an insert.  */
 +  if (c >> 32 == test && exam_sub (test, sub_budget))
 +{
 +  opN (VEC_DUPLICATE, 0xu); /* RLDIMI */
 +  return true;
 +}

Does this work for c with the high bit set?  I think you need
to cast it to unsigned HOST_WIDE_INT first?


Segher


[PATCH, PR67092, PR67098 ] Remove --with-host-libstdcxx

2015-08-12 Thread Tom de Vries

Hi,

this patch removes configure option --with-host-libstdcxx.

[ As suggested here ( 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67092#c13 ):

...
I think we can no longer reliably support host libstdc++ as includes are 
not picked up from its location and GCC is C++ now.


I suggest to remove that entirely.
...
]

The option was originally introduced to support linking with a static 
version of ppl, but we no longer support using ppl.


The behaviour of --with-host-libstdcxx is implemented in terms of other 
configure options, so we're not losing any functionality.



Furthermore, the patch adds the missing documentation of the default 
behaviour of --with-stage1-ldflags.



Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom
Remove --with-host-libstdcxx

2015-08-12  Tom de Vries  t...@codesourcery.com

	PR other/67092
	PR other/67098
	* doc/install.texi: Remove --with-host-libstdcxx item.  Update
	--with-stage1-libs, --with-boot-ldflags and --with-boot-libs items
	accordingly.  Mention default for --with-stage1-ldflags.
	* configure.ac: Remove --with-host-libstdcxx support.
	* configure: Regenerate.
---
 configure| 24 +++-
 configure.ac | 19 +++
 gcc/doc/install.texi | 23 +++
 3 files changed, 13 insertions(+), 53 deletions(-)

diff --git a/configure b/configure
index 79257fd..eca5e6f 100755
--- a/configure
+++ b/configure
@@ -772,7 +772,6 @@ with_gmp_dir
 with_gmp
 with_gmp_include
 with_gmp_lib
-with_host_libstdcxx
 with_stage1_libs
 with_stage1_ldflags
 with_boot_libs
@@ -1540,8 +1539,6 @@ Optional Packages:
   --with-gmp-lib=PATH/lib
   --with-gmp-include=PATH specify directory for installed GMP include files
   --with-gmp-lib=PATH specify directory for the installed GMP library
-  --with-host-libstdcxx=L use linker arguments L to link with libstdc++ when
-  linking with PPL
   --with-stage1-libs=LIBS libraries for stage1
   --with-stage1-ldflags=FLAGS
   linker flags for stage1
@@ -5849,20 +5846,6 @@ fi
 
 
 
-# Allow host libstdc++ to be specified for static linking with PPL.
-
-# Check whether --with-host-libstdcxx was given.
-if test ${with_host_libstdcxx+set} = set; then :
-  withval=$with_host_libstdcxx;
-fi
-
-
-case $with_host_libstdcxx in
-  no|yes)
-as_fn_error -with-host-libstdcxx needs an argument $LINENO 5
-;;
-esac
-
 # Libraries to use for stage1 or when not bootstrapping.
 
 # Check whether --with-stage1-libs was given.
@@ -5873,7 +5856,7 @@ if test ${with_stage1_libs+set} = set; then :
stage1_libs=$withval
  fi
 else
-  stage1_libs=$with_host_libstdcxx
+  stage1_libs=
 fi
 
 
@@ -5899,8 +5882,7 @@ fi
 
 
 
-# Libraries to use for stage2 and later builds.  This defaults to the
-# argument passed to --with-host-libstdcxx.
+# Libraries to use for stage2 and later builds.
 
 # Check whether --with-boot-libs was given.
 if test ${with_boot_libs+set} = set; then :
@@ -5910,7 +5892,7 @@ if test ${with_boot_libs+set} = set; then :
poststage1_libs=$withval
  fi
 else
-  poststage1_libs=$with_host_libstdcxx
+  poststage1_libs=
 fi
 
 
diff --git a/configure.ac b/configure.ac
index 452fc05..9241261 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1697,18 +1697,6 @@ AC_SUBST(extra_mpc_gmp_configure_flags)
 AC_SUBST(extra_mpc_mpfr_configure_flags)
 AC_SUBST(extra_isl_gmp_configure_flags)
 
-# Allow host libstdc++ to be specified for static linking with PPL.
-AC_ARG_WITH(host-libstdcxx,
-[AS_HELP_STRING([--with-host-libstdcxx=L],
-		[use linker arguments L to link with libstdc++
-		 when linking with PPL])])
-
-case $with_host_libstdcxx in
-  no|yes)
-AC_MSG_ERROR([-with-host-libstdcxx needs an argument])
-;;
-esac
-
 # Libraries to use for stage1 or when not bootstrapping.
 AC_ARG_WITH(stage1-libs,
 [AS_HELP_STRING([--with-stage1-libs=LIBS], [libraries for stage1])],
@@ -1717,7 +1705,7 @@ AC_ARG_WITH(stage1-libs,
  else
stage1_libs=$withval
  fi],
-[stage1_libs=$with_host_libstdcxx])
+[stage1_libs=])
 AC_SUBST(stage1_libs)
 
 # Linker flags to use for stage1 or when not bootstrapping.
@@ -1737,8 +1725,7 @@ AC_ARG_WITH(stage1-ldflags,
  fi])
 AC_SUBST(stage1_ldflags)
 
-# Libraries to use for stage2 and later builds.  This defaults to the
-# argument passed to --with-host-libstdcxx.
+# Libraries to use for stage2 and later builds.
 AC_ARG_WITH(boot-libs,
 [AS_HELP_STRING([--with-boot-libs=LIBS], [libraries for stage2 and later])],
 [if test $withval = no -o $withval = yes; then
@@ -1746,7 +1733,7 @@ AC_ARG_WITH(boot-libs,
  else
poststage1_libs=$withval
  fi],
-[poststage1_libs=$with_host_libstdcxx])
+[poststage1_libs=])
 AC_SUBST(poststage1_libs)
 
 # Linker flags to use for stage2 and later builds.
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0e3093f..360b066 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1853,36 +1853,27 @@ include and lib options 

Re: [Aarch64][target/PR 67143] Use correct constraints on operands for atomic operations.

2015-08-12 Thread Matthew Wahab

On 11/08/15 15:07, James Greenhalgh wrote:

On Tue, Aug 11, 2015 at 02:05:37PM +0100, Matthew Wahab wrote:


This patch reworks the atomic operation patterns to select the
appropriate constraint for the operation. The logical operations take
the constraints specified by the current lconst_atomic mode iterator
while the arithmetic operations (plus, sub) take constraint I.


I'm worried this still gives us a mismatch between constraints and
predicates. The relevant predicates here are:

   (define_predicate "aarch64_plus_operand"
     (ior (match_operand 0 "register_operand")
	  (match_operand 0 "aarch64_plus_immediate")))

   (define_predicate "aarch64_plus_immediate"
     (and (match_code "const_int")
	  (ior (match_test "aarch64_uimm12_shift (INTVAL (op))")
	       (match_test "aarch64_uimm12_shift (-INTVAL (op))"))))

But our constraint only permits:

   (define_constraint "I"
    "A constant that can be used with an ADD operation."
    (and (match_code "const_int")
	 (match_test "aarch64_uimm12_shift (ival)")))

Does this mean we are now loading constants we don't need to in to
registers? I don't think we could cause this to ICE - but are we
generating less good code than we would like?


Updated the patch to make the constraints for the arithmetic operations IJ,
which preserves the existing behaviour. Also added two cases to the
gcc.target/aarch64/atomic-op-imm.c test to check the behaviour with large
negative numbers.
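
Something along these lines -- the function names come from the ChangeLog
below, but the bodies and the immediates are only my sketch of what such a
test case looks like:

void
atomic_fetch_add_negative_RELAXED (int *p)
{
  __atomic_fetch_add (p, -4096, __ATOMIC_RELAXED);
}

void
atomic_fetch_sub_negative_ACQUIRE (int *p)
{
  __atomic_fetch_sub (p, -4096, __ATOMIC_ACQUIRE);
}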

Tested for aarch64-none-eabi with cross-compiled check-gcc and for
aarch64-none-linux-gnu with native bootstrap and make check.

Ok?
Matthew

gcc/
2015-08-12  Matthew Wahab  matthew.wa...@arm.com

PR target/67143
* config/aarch64/atomics.md (atomic_<optab><mode>): Replace
'lconst_atomic' with 'const_atomic'.
(atomic_fetch_<optab><mode>): Likewise.
(atomic_<optab>_fetch<mode>): Likewise.
* config/aarch64/iterators.md (lconst_atomic): Move below
'const_atomic'.
(const_atomic): New.

gcc/testsuite/
2015-08-12  Matthew Wahab  matthew.wa...@arm.com
Matthias Klose  d...@debian.org

PR target/67143
* gcc.target/aarch64/atomic-op-imm.c
(atomic_fetch_add_negative_RELAXED): New.
(atomic_fetch_sub_negative_ACQUIRE): New.
* gcc.target/aarch64/pr67143.c: New

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 1a38ac0..6e6be99 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -119,7 +119,7 @@
   [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
     (unspec_volatile:ALLI
       [(atomic_op:ALLI (match_dup 0)
-	(match_operand:ALLI 1 "atomic_op_operand" "r<lconst_atomic>"))
+	(match_operand:ALLI 1 "atomic_op_operand" "r<const_atomic>"))
	(match_operand:SI 2 "const_int_operand")]		;; model
       UNSPECV_ATOMIC_OP))
 (clobber (reg:CC CC_REGNUM))
@@ -164,7 +164,7 @@
 (set (match_dup 1)
     (unspec_volatile:ALLI
       [(atomic_op:ALLI (match_dup 1)
-	(match_operand:ALLI 2 "atomic_op_operand" "r<lconst_atomic>"))
+	(match_operand:ALLI 2 "atomic_op_operand" "r<const_atomic>"))
	(match_operand:SI 3 "const_int_operand")]		;; model
       UNSPECV_ATOMIC_OP))
 (clobber (reg:CC CC_REGNUM))
@@ -209,7 +209,7 @@
   [(set (match_operand:ALLI 0 "register_operand" "=r")
     (atomic_op:ALLI
       (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q")
-  (match_operand:ALLI 2 "atomic_op_operand" "r<lconst_atomic>")))
+  (match_operand:ALLI 2 "atomic_op_operand" "r<const_atomic>")))
(set (match_dup 1)
 (unspec_volatile:ALLI
   [(match_dup 1) (match_dup 2)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 5d7966d..b8a45d1 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -345,9 +345,6 @@
 ;; Attribute to describe constants acceptable in logical operations
 (define_mode_attr lconst [(SI "K") (DI "L")])
 
-;; Attribute to describe constants acceptable in atomic logical operations
-(define_mode_attr lconst_atomic [(QI "K") (HI "K") (SI "K") (DI "L")])
 
 ;; Map a mode to a specific constraint character.
 (define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
 
@@ -843,6 +840,16 @@
	(plus "aarch64_plus_operand")
	(minus "aarch64_plus_operand")])
 
+;; Constants acceptable for atomic operations.
+;; This definition must appear in this file before the iterators it refers to.
+(define_code_attr const_atomic
+ [(plus "IJ") (minus "IJ")
+  (xor "<lconst_atomic>") (ior "<lconst_atomic>")
+  (and "<lconst_atomic>")])
+
+;; Attribute to describe constants acceptable in atomic logical operations
+(define_mode_attr lconst_atomic [(QI "K") (HI "K") (SI "K") (DI "L")])
+
 ;; ---
 ;; Int Iterators.
 ;; ---
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
index 6c6f7e1..47d7a96 100644
--- 

Re: [PATCH 04/15] rs6000: Implement set_const_data infrastructure

2015-08-12 Thread Segher Boessenkool
Hi Richard,

You wanted us to read this file...

On Tue, Aug 11, 2015 at 06:11:33PM -0700, Richard Henderson wrote:
 +   -- The fallback generation for the most complex word_mode constants.
 +  The receipe built will be the full MAX_COST insns, as we will
^-- typo.


Segher


Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread Richard Sandiford
Jeff Law l...@redhat.com writes:
 On 08/10/2015 06:05 AM, tbsaunde+...@tbsaunde.org wrote:
 From: Trevor Saunders tbsaunde+...@tbsaunde.org

 Hi,

 In many places gcc puts classes in the anon namespace so the compiler can 
 tell
 they do not get inherited from to enable better devirtualization.  However
 debugging code in the anon namespace can be a pain, and the same thing can be
 accomplished more directly by marking the classes as final.  When
 bootstrapping
 stage 3 should always be built in C++14 mode now, and of course will always 
 be
 newer than gcc 4.7, so these classes will always be marked as final there.
 AIUI cross compilers are supposed to be built with recent gcc, which I would
 tend to think implies newer than 4.7, so they should also be built with these
 classes marked as final.  I believe that means in all important cases
 this works just as well as the anon namespace.

 bootstrapped + regtested on x86_64-linux-gnu, ok?

 Trev


 gcc/ChangeLog:

 2015-08-10  Trevor Saunders  tbsau...@tbsaunde.org

  * compare-elim.c, dce.c, dse.c, gimple-ssa-isolate-paths.c,
  gimple-ssa-strength-reduction.c, graphite.c, init-regs.c,
  ipa-pure-const.c, ipa-visibility.c, ipa.c, mode-switching.c,
  omp-low.c, reorg.c, sanopt.c, trans-mem.c, tree-eh.c,
  tree-if-conv.c, tree-ssa-copyrename.c, tree-ssa-dce.c,
  tree-ssa-dom.c, tree-ssa-dse.c, tree-ssa-forwprop.c,
  tree-ssa-sink.c, tree-ssanames.c, tree-stdarg.c, tree-tailcall.c,
  tree-vect-generic.c, tree.c, ubsan.c, var-tracking.c,
  vtable-verify.c, web.c: Use GCC_FINAL instead of the anonymous
 namespace.
 OK.

I was hoping someone else was going to speak up since I seem
to have been posting a few negative messages recently, but I think
this is really a step in the wrong direction.  I think the code
was using anonymous namespaces in exactly the way they were
intended to be used.

We don't want to forbid the use of static and instead force every
file-local function and variable into the global namespace.  Why should
classes be any different?  Anonymous classes are a commonly-used feature,
so if there's a specific problem with debugging them, I think it would
be better to fix gdb (or whatever) rather than work around the problem
like this.
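
For concreteness, a minimal self-contained sketch of the two idioms being
compared; the placement of GCC_FINAL and its fallback definition here are my
own assumptions, and opt_pass is a simplified stand-in rather than GCC's real
class:

#if __cplusplus >= 201103L
# define GCC_FINAL final
#else
# define GCC_FINAL
#endif

struct opt_pass
{
  virtual ~opt_pass () {}
  virtual unsigned execute () = 0;
};

namespace   /* idiom 1: file-local type via the anonymous namespace */
{
struct pass_a : opt_pass
{
  unsigned execute () { return 0; }
};
}

struct pass_b GCC_FINAL : opt_pass   /* idiom 2: devirtualization hint via 'final' */
{
  unsigned execute () { return 1; }
};

int
main ()
{
  pass_a a;
  pass_b b;
  return a.execute () + b.execute () == 1 ? 0 : 1;
}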

Thanks,
Richard



Re: [PATCH 2/2] replace several uses of the anon namespace with GCC_FINAL

2015-08-12 Thread Jeff Law

On 08/12/2015 12:57 PM, Richard Sandiford wrote:

Jeff Law l...@redhat.com writes:

On 08/10/2015 06:05 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

In many places gcc puts classes in the anon namespace so the compiler can tell
they do not get inherited from to enable better devirtualization.  However
debugging code in the anon namespace can be a pain, and the same thing can be
accomplished more directly by marking the classes as final.  When
bootstrapping
stage 3 should always be built in C++14 mode now, and of course will always be
newer than gcc 4.7, so these classes will always be marked as final there.
AIUI cross compilers are supposed to be built with recent gcc, which I would
tend to think implies newer than 4.7, so they should also be built with these
classes marked as final.  I believe that means in all important cases
this works just as well as the anon namespace.

bootstrapped + regtested on x86_64-linux-gnu, ok?

Trev


gcc/ChangeLog:

2015-08-10  Trevor Saunders  tbsau...@tbsaunde.org

* compare-elim.c, dce.c, dse.c, gimple-ssa-isolate-paths.c,
gimple-ssa-strength-reduction.c, graphite.c, init-regs.c,
ipa-pure-const.c, ipa-visibility.c, ipa.c, mode-switching.c,
omp-low.c, reorg.c, sanopt.c, trans-mem.c, tree-eh.c,
tree-if-conv.c, tree-ssa-copyrename.c, tree-ssa-dce.c,
tree-ssa-dom.c, tree-ssa-dse.c, tree-ssa-forwprop.c,
tree-ssa-sink.c, tree-ssanames.c, tree-stdarg.c, tree-tailcall.c,
tree-vect-generic.c, tree.c, ubsan.c, var-tracking.c,
vtable-verify.c, web.c: Use GCC_FINAL instead of the anonymous
namespace.

OK.


I was hoping someone else was going to speak up since I seem
to have been posting a few negative messages recently, but I think
this is really a step in the wrong direction.  I think the code
was using anonymous namespaces in exactly the way they were
intended to be used.

No need to worry about seeming to be negative.


The problem is you can't get to stuff in the anonymous namespace easily 
in the debugger.  There was talk of fixing that, but I don't think it 
ever went forward on the gdb side.


If gdb were to get fixed so that debugging this stuff was easier, then 
I'd fully support putting things back into the anonymous namespace.


jeff



Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Mike Stump
On Aug 12, 2015, at 11:07 AM, Jeff Law l...@redhat.com wrote:
 On 08/12/2015 11:12 AM, Richard Biener wrote:
 
 Prec is almost never a constant and is heavily used from wide-int.
 
 We are not exploiting this undefinedness in C so I object to making this so 
 much slower.
 
 Can we instead do what we do for abs_hwi and add a checking assert so we can 
 move the tests to the callers that misbehave instead?
 Given that ISO C++ is moving away from making shifting 1 into the sign bit 
 undefined behaviour, maybe we should make UBSan less strict in its warning.  
 That may eliminate the need for Mikael's patch.

Pedantically, then, we have to move to the later language standard (if this 
was standardese).  If it was DRese, then we’re safe.  :-)



Re: Fix reload1.c warning for some targets

2015-08-12 Thread Jeff Law

On 08/05/2015 08:18 AM, Richard Sandiford wrote:

Building some targets results in a warning about orig_dup[i] potentially
being used uninitialised.  I think the warning is fair, since it isn't
obvious that the recog_data-based loop bound remains unchanged between:

   for (i = 0; i < recog_data.n_dups; i++)
 orig_dup[i] = *recog_data.dup_loc[i];

and:

   for (i = 0; i < recog_data.n_dups; i++)
 *recog_data.dup_loc[i] = orig_dup[i];

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard

gcc/
* reload1.c (elimination_costs_in_insn): Make it obvious to the
compiler that the n_dups and n_operands loop bounds are invariant.

So thinking more about this, I think the best way forward is to:

  1. Create a new BZ with the false positive extracted from c#4.

  2. Install your patch and close 55035.

I'll take care of #1, you can handle #2.


jeff



C++ PATCH for c++/67104 (wrong handling of array and constexpr)

2015-08-12 Thread Jason Merrill
cxx_eval_array_reference was assuming that the CONSTRUCTOR for an array 
has one entry per array element, in order.  But 
cxx_eval_store_expression doesn't try to create such a CONSTRUCTOR, and 
other places use RANGE_EXPRs, so we need to be prepared to handle these 
cases.
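
(For illustration, a reduced sketch of the situation -- my own example,
simpler than and not a substitute for the testcase included below; it relies
on the C++14 constexpr rules:)

constexpr int
pick (int i)
{
  int a[100] {};   /* value-initialized array */
  a[65] = 1;       /* the store leaves a sparse, out-of-position entry */
  return a[i];
}
static_assert (pick (65) == 1, "");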


Tested x86_64-pc-linux-gnu, applying to trunk and 5.
commit 1aa0b7f99a8bbea646c9bf4600f32200b7400ace
Author: Jason Merrill ja...@redhat.com
Date:   Wed Aug 12 17:56:21 2015 +0100

	PR c++/67104
	* constexpr.c (cxx_eval_array_reference): Handle sparse
	CONSTRUCTORs.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 218faec..b6788c7 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1697,7 +1697,38 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   VERIFY_CONSTANT (ary);
   gcc_unreachable ();
 }
-  if (compare_tree_int (index, len) >= 0)
+
+  i = tree_to_shwi (index);
+  bool found = true;
+  if (TREE_CODE (ary) == CONSTRUCTOR && len
+      && (TREE_CODE (CONSTRUCTOR_ELT (ary, len-1)->index) == RANGE_EXPR
+	  || compare_tree_int (CONSTRUCTOR_ELT (ary, len-1)->index, len-1)))
+{
+  /* The last element doesn't match its position in the array; this must be
+	 a sparse array from cxx_eval_store_expression.  So iterate.  */
+  found = false;
+      vec<constructor_elt, va_gc> *v = CONSTRUCTOR_ELTS (ary);
+      constructor_elt *e;
+      for (unsigned ix = 0; vec_safe_iterate (v, ix, &e); ++ix)
+	{
+	  if (TREE_CODE (e->index) == RANGE_EXPR)
+	    {
+	      tree lo = TREE_OPERAND (e->index, 0);
+	      tree hi = TREE_OPERAND (e->index, 1);
+	      if (tree_int_cst_le (lo, index) && tree_int_cst_le (index, hi))
+		found = true;
+	    }
+	  else if (tree_int_cst_equal (e->index, index))
+	found = true;
+	  if (found)
+	{
+	  i = ix;
+	  break;
+	}
+	}
+}
+
+  if (i >= len || !found)
 {
   if (tree_int_cst_lt (index, array_type_nelts_top (TREE_TYPE (ary))))
 	{
@@ -1714,14 +1745,14 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   *non_constant_p = true;
   return t;
 }
-  else if (tree_int_cst_lt (index, integer_zero_node))
-  else if (i < 0)
 {
   if (!ctx->quiet)
 	error ("negative array subscript");
   *non_constant_p = true;
   return t;
 }
-  i = tree_to_shwi (index);
+
   if (TREE_CODE (ary) == CONSTRUCTOR)
 return (*CONSTRUCTOR_ELTS (ary))[i].value;
   else if (elem_nchars == 1)
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-array1.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-array1.C
new file mode 100644
index 000..efe4617
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-array1.C
@@ -0,0 +1,20 @@
+// PR c++/67104
+// { dg-do compile { target c++14 } }
+
+template <typename T, int N> struct array
+{
+  constexpr T &operator[](int index) { return data[index]; }
+  constexpr T operator[](int index) const { return data[index]; }
+  T data[N];
+};
+
+constexpr array<long unsigned, 1001>
+make_bottle_count ()
+{
+  array<long unsigned, 1001> a{};
+  a[65] = 1;
+  return a;
+}
+
+constexpr auto bottle_count = make_bottle_count ();
+static_assert (bottle_count[65], "");


[gomp4] declare directive

2015-08-12 Thread James Norris


Hi,

The attached patch is a revision of the functionality required to
handle the declare directive. The motivation for the change was that
the original code was deemed too fragile and difficult to maintain.
The new code is now smaller, in terms of line count, and hopefully,
easier to understand.  Given the significant amount of changes, I've
included a commentary on entire functions, rather than comments
about the specific changes. Hopefully, this will help in the
review process.

gimple.c : gimplify_oacc_declare ()

The first part of this function deals with handling
the entry clauses. After scanning the clauses, we
spin through the list, adding the appropriate
ones to the context variable list. Also if any
'oacc declare' attributes are around, toss them.

The next section deals with the return clauses.
Again scan them, but after scanning them toss the
context that was created as a result of the scan.
We don't need the context as the context that was
created with the entry clauses will suffice. After
creation of the gimple statement, save it away so
that it can be used at exit time.

gimple.c : gimplify_body ()

This is the place where the 'exit' clauses are
used and inserted via a gimple statement. This
is also the place where the omp context is
'unwound'.

fortran/trans-decl.c: finish_oacc_declare ()

This function is called to handle the declare
directives. The information that was contained
in the directives during parsing of the variable
section is hanging off the namespace (ns).

The primary function here is to create the two
sets of derived clauses and associate them with
a newly created statement, i.e, gfc_code. This
new statement is then inserted at the beginning
of the statement chain associated with 'ns'.

c/c-parser.c c_parser_oacc_declare ()

The initial section of the function is doing
syntax checking. If no errors are found and
the variable is 'global', then an attribute
is added to the variable. This is followed by
the creation of the derived clauses.

In the final section, when a global is at hand,
a constructor is created. The constructor will
pass on information to the runtime. On the other
hand if the variable is local, then a statement
will be inserted and created that will contain
both sets of derived clauses.

cp/parser.c: cp_parser_oacc_declare ()

The initial section of the function is doing
syntax checking. If no errors are found and
the variable is 'global', then an attribute
is added to the variable. This is followed by
the creation of the derived clauses.

In the final section, when a global is at hand,
a constructor is created. The constructor will
pass on information to the runtime. On the other
hand if the variable is local, then a statement
will be inserted and created that will contain
both sets of derived clauses.
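
By way of illustration (my own sketch, not code from the patch), the two
situations described above correspond to something like the following; the
variable and function names are hypothetical:

float coeffs[1024];
#pragma acc declare create(coeffs)     /* file-scope ("global"): handled via a constructor */

void
scale (float *x, int n)
{
  float factor = 2.0f;
#pragma acc declare copyin(factor)     /* block-scope ("local"): entry/exit clauses */
  for (int i = 0; i < n; i++)
    x[i] *= factor;
}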

I'll commit this to gomp-4_0-branch sometime during the week of 17 August.

Thanks!
Jim
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index b452235..3f3e8c0 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1452,319 +1452,6 @@ c_parser_external_declaration (c_parser *parser)
 }
 }
 
-static tree
-check_oacc_vars_1 (tree *tp, int *, void *l)
-{
-  if (TREE_CODE (*tp) == VAR_DECL && TREE_PUBLIC (*tp))
-{
-  location_t loc = DECL_SOURCE_LOCATION (*tp);
-  tree attrs;
-      attrs = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (*tp));
-  if (attrs)
-	{
-	  tree t;
-
-	  for (t = TREE_VALUE (attrs); t; t = TREE_CHAIN (t))
-	{
-	  loc = EXPR_LOCATION ((tree) l);
-
-	  if (OMP_CLAUSE_MAP_KIND (TREE_VALUE (t)) == GOMP_MAP_LINK)
-		{
-		  error_at (loc, "%<link%> clause cannot be used with %qE",
-			*tp);
-		  break;
-		}
-	}
-	}
-  else
-	error_at (loc, "no %<#pragma acc declare%> for %qE", *tp);
-}
-  return NULL_TREE;
-}
-
-static tree
-check_oacc_vars (tree *tp, int *, void *)
-{
-  if (TREE_CODE (*tp) == STATEMENT_LIST)
-{
-  tree_stmt_iterator i;
-
-      for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (&i))
-	{
-	  tree t = tsi_stmt (i);
-	  walk_tree_without_duplicates (&t, check_oacc_vars_1, t);
-	}
-}
-
-  return NULL_TREE;
-}
-
-static struct oacc_return
-{
-  tree_stmt_iterator iter;
-  tree stmt;
-  int op;
-  struct oacc_return *next;
-} *oacc_returns;
-
-static tree
-find_oacc_return (tree *tp, int *, void *)
-{
-  if (TREE_CODE (*tp) == STATEMENT_LIST)
-{
-  tree_stmt_iterator i;
-
-      for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (&i))
-	{
-	  tree t;
-	  struct oacc_return *r;
-
-	  t = tsi_stmt (i);
-
-	  if (TREE_CODE (t) == RETURN_EXPR)
-	{
-	  r = XNEW (struct 

Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Richard Biener
On August 12, 2015 8:07:13 PM GMT+02:00, Jeff Law l...@redhat.com wrote:
On 08/12/2015 11:12 AM, Richard Biener wrote:


 Prec is almost never a constant and is heavily used from wide-int.

 We are not exploiting this undefinedness in C so I object to making
this so much slower.

 Can we instead do what we do for abs_hwi and add a checking assert so
we can move the tests to the callers that misbehave instead?
Given that ISO C++ is moving away from making shifting 1 into the sign 
bit undefined behaviour, maybe we should make UBSan less strict in its 
warning.  That may eliminate the need for Mikael's patch.

We can also use a logical left shift followed by an arithmetic right shift. Or 
is the latter invoking undefined behaviour as well in some cases we hit?

Richard.

jeff




Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Jeff Law

On 08/12/2015 12:32 PM, Richard Biener wrote:

On August 12, 2015 8:07:13 PM GMT+02:00, Jeff Law l...@redhat.com wrote:

On 08/12/2015 11:12 AM, Richard Biener wrote:



Prec is almost never a constant and is heavily used from wide-int.

We are not exploiting this undefinedness in C so I object to making

this so much slower.


Can we instead do what we do for abs_hwi and add a checking assert so

we can move the tests to the callers that misbehave instead?
Given that ISO C++ is moving away from making shifting 1 into the sign
bit undefined behaviour, maybe we should make UBSan less strict in its
warning.  That may eliminate the need for Mikael's patch.


We can also use a logical left shift followed by an arithmetic right shift. Or 
is the latter invoking undefined behaviour as well in some cases we hit?
Hmm, why aren't we using logicals for the left shift to begin with? 
That's the problem area.  I don't think the right shifts are an issue at 
all.


It's strange that when I was researching this, consistently I saw folks 
suggesting the LEFT-SHIFT followed by RIGHT-SHIFT, then it'd get shot 
down as UB, then folks went to either Mikael's approach or another that 
is very similar.  Nobody suggested twiddling the types to get a 
left-logical-shift, then doing a right-arithmetic-shift.
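
For reference, this is the kind of thing I understand by that -- a sketch
only, with stand-in type names, and relying on the conversion back to the
signed type and the arithmetic right shift being implementation-defined
rather than undefined:

typedef signed long long HWI;      /* stand-in for HOST_WIDE_INT */
typedef unsigned long long UHWI;   /* stand-in for unsigned HOST_WIDE_INT */
#define HWI_BITS ((int) (8 * sizeof (HWI)))

static inline HWI
sext_hwi_alt (HWI src, unsigned int prec)
{
  if (prec == (unsigned int) HWI_BITS)
    return src;
  int shift = HWI_BITS - (int) prec;
  /* Left shift done in the unsigned type (never undefined); right shift
     done arithmetically in the signed type (sign-extends on GCC).  */
  return (HWI) ((UHWI) src << shift) >> shift;
}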


Jeff



Go patch committed: Don't make GC symbol for invalid array type

2015-08-12 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to not make a GC
symbol for an invalid array type.  This avoids a compiler crash and
fixes https://golang.org/issue/11539 .  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 226795)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-55175f7ee0db2c1e68423216d7744be80071ed6c
+5fc38e74d132cd6f4e7b56e6bcf9fe57031ab203
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 226510)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -6423,7 +6423,10 @@ Array_type::array_gc_symbol(Gogo* gogo,
   unsigned long bound;
   if (!this->length_->numeric_constant_value(&nc)
       || nc.to_unsigned_long(&bound) == Numeric_constant::NC_UL_NOTINT)
-go_assert(saw_errors());
+{
+  go_assert(saw_errors());
+  return;
+}
 
   Btype* pbtype = gogo->backend()->pointer_type(gogo->backend()->void_type());
   int64_t pwidth = gogo->backend()->type_size(pbtype);


C++ PATCH for c++/67108 (ICE with constexpr and tree dumps)

2015-08-12 Thread Jason Merrill
We can't do tsubsting once cgraph starts throwing away front end 
information; use the at_eof flag to communicate that we've reached that 
point.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 61a9d0354261705979003d15ebd7c97605d6dc2e
Author: Jason Merrill ja...@redhat.com
Date:   Wed Aug 12 15:54:06 2015 +0100

	PR c++/67108
	* decl2.c (c_parse_final_cleanups): Set at_eof to 2 at end.
	* error.c (dump_template_bindings): Don't tsubst in that case.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 78fd4af..ab6b3ec 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
 extern GTY(()) vec<tree, va_gc> *local_classes;
 #endif /* !defined(NO_DOLLAR_IN_LABEL) || !defined(NO_DOT_IN_LABEL) */
 
 
-/* Nonzero if we're done parsing and into end-of-file activities.  */
+/* Nonzero if we're done parsing and into end-of-file activities.
+   Two if we're done with front-end processing.  */
 
 extern int at_eof;
 
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 068d79c..8e7a453 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4846,6 +4846,9 @@ c_parse_final_cleanups (void)
 
   timevar_stop (TV_PHASE_DEFERRED);
   timevar_start (TV_PHASE_PARSING);
+
+  /* Indicate that we're done with front end processing.  */
+  at_eof = 2;
 }
 
 /* Perform any post compilation-proper cleanups for the C++ front-end.
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index ae3e092..faf8744 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -339,6 +339,11 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
       && !DECL_LANG_SPECIFIC (current_function_decl))
 return;
 
+  /* Don't try to do this once cgraph starts throwing away front-end
+ information.  */
+  if (at_eof >= 2)
+return;
+
   FOR_EACH_VEC_SAFE_ELT (typenames, i, t)
 {
   if (need_semicolon)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-targ3.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-targ3.C
new file mode 100644
index 000..d1e4482
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-targ3.C
@@ -0,0 +1,40 @@
+// PR c++/67108
+// { dg-do compile { target c++11 } }
+
+template  typename, typename  struct is_same;
+template  typename T  struct is_same T, T 
+{
+  enum
+  {
+value = true
+  }
+   ;
+constexpr bool operator () ()
+  {
+return value;
+  }
+}
+ ;
+template  bool, typename = void struct enable_if;
+template  typename T  struct enable_if true, T 
+{
+  typedef T type;
+}
+ ;
+struct A;
+template  typename, typename = void struct F;
+template  typename X  struct F X, typename enable_if  is_same  X, A 
+{
+}
+  () ::type 
+{
+  template  typename MakeDependent  F (MakeDependent)
+  {
+  }
+}
+;
+
+int main ()
+{
+  F  A  (1);
+}


Re: [C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Jeff Law

On 08/12/2015 11:28 AM, Toon Moene wrote:

On 08/12/2015 05:39 PM, Marek Polacek wrote:


This patch fixes a defect in -Wshift-overflow.  We should only warn
about left-shifting 1 into the sign bit when -Wshift-overflow=2.  But
this doesn't apply only for 1 << 31, but also for 2 << 30, etc.
In C++14, never warn about this.
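
For concreteness (assuming 32-bit int; the examples are mine, not from the
patch), the shifts in question are the ones that move a 1 into the sign bit:

  int a = 1 << 31;   /* shifts a 1 into the sign bit: warn only at level 2 */
  int b = 2 << 30;   /* likewise: the 1-bit of 2 lands in bit 31 */
  int c = 1 << 30;   /* sign bit untouched: no warning */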


And then there's this:

https://gcc.gnu.org/ml/gcc-testresults/2015-08/msg01036.html

[ Yes, that's at run time, not compile time ... ]

Hoping some of those are fixed by the sext_hwi changes from Mikael.

jeff






Re: [C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Marek Polacek
On Wed, Aug 12, 2015 at 11:33:21AM -0600, Jeff Law wrote:
 On 08/12/2015 11:28 AM, Toon Moene wrote:
 https://gcc.gnu.org/ml/gcc-testresults/2015-08/msg01036.html
 
 [ Yes, that's at run time, not compile time ... ]
 Hoping some of those are fixed by the sext_hwi changes from Mikael.

I hope too ;).  But they have nothing to do with the -Wshift-overflow IMHO.

Marek


Re: [C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Jeff Law

On 08/12/2015 11:40 AM, Marek Polacek wrote:

On Wed, Aug 12, 2015 at 11:33:21AM -0600, Jeff Law wrote:

On 08/12/2015 11:28 AM, Toon Moene wrote:

https://gcc.gnu.org/ml/gcc-testresults/2015-08/msg01036.html

[ Yes, that's at run time, not compile time ... ]

Hoping some of those are fixed by the sext_hwi changes from Mikael.


I hope too ;).  But they have nothing to do with the -Wshift-overflow IMHO.
Right.  Toon was just pointing out that we're getting the left-shift 
errors from ubsan all over the place.  Those are totally independent of 
your -Wshift-overflow stuff.


Jeff


Re: [PATCH] xtensa: add -mauto-litpools option

2015-08-12 Thread Max Filippov
On Wed, Aug 12, 2015 at 7:35 PM, augustine.sterl...@gmail.com
augustine.sterl...@gmail.com wrote:
 On Tue, Aug 11, 2015 at 6:09 PM, Max Filippov jcmvb...@gmail.com wrote:
 With support from assembler this option allows compiling huge functions,
 where single literal pool at the beginning of a function may not be
 reachable by L32R instructions at its end.

 Currently assembler --auto-litpools option cannot deal with literals
 used from multiple locations separated by more than 256 KBytes of code.
 Don't turn constants into literals, instead use MOVI instruction to load
 them into registers and let the assembler turn them into literals as
 necessary.

 If this is OK with the linux people, it is OK with me. As I recall,
 they used to have a need to keep literals in page-level groups, but my
 memory is hazy.

Text-section-literals remain available, and without the auto-litpools option
code generation doesn't change at all.  Even with the auto-litpools option,
literals will tend to pool at the beginning of functions, so code generation
shouldn't change much.

I've applied the patch to trunk.

-- 
Thanks.
-- Max


Re: [Patch] sext_hwi: Avoid left shift of negative value undefined behaviour

2015-08-12 Thread Jeff Law

On 08/12/2015 11:12 AM, Richard Biener wrote:



Prec is almost never a constant and is heavily used from wide-int.

 We are not exploiting this undefinedness in C so I object to making this so 
much slower.

Can we instead do what we do for abs_hwi and add a checking assert so we can 
move the tests to the callers that misbehave instead?
Given that ISO C++ is moving away from making shifting 1 into the sign 
bit undefined behaviour, maybe we should make UBSan less strict in its 
warning.  That may eliminate the need for Mikael's patch.


jeff



C++ PATCH for c++/67161 (ICE with template-id)

2015-08-12 Thread Jason Merrill
When we're dealing with a template-id, all the arguments are 
non-default, so don't try to GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT.


Tested x86_64-pc-linux-gnu, applying to trunk and 5.
commit 15818891060dc29f108ebe5929b1b85250f143a7
Author: Jason Merrill ja...@redhat.com
Date:   Wed Aug 12 15:28:20 2015 +0100

	PR c++/67161
	* error.c (dump_decl) [TEMPLATE_ID_EXPR]: Pass
	TFF_NO_OMIT_DEFAULT_TEMPLATE_ARGUMENTS.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 4f85751..ae3e092 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1212,7 +1212,8 @@ dump_decl (cxx_pretty_printer *pp, tree t, int flags)
 	if (args == error_mark_node)
 	  pp_string (pp, M_("<template arguments error>"));
 	else if (args)
-	  dump_template_argument_list (pp, args, flags);
+	  dump_template_argument_list
+	(pp, args, flags|TFF_NO_OMIT_DEFAULT_TEMPLATE_ARGUMENTS);
   	pp_cxx_end_template_argument_list (pp);
   }
   break;
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ44.C b/gcc/testsuite/g++.dg/cpp1y/var-templ44.C
new file mode 100644
index 000..2fc21a5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ44.C
@@ -0,0 +1,29 @@
+// PR c++/67161
+// { dg-do compile { target c++14 } }
+
+template typename _Tp struct integral_constant {
+  static constexpr _Tp value = 0;
+};
+template bool, typename, typename struct conditional;
+template typename... struct __or_;
+template typename _B1, typename _B2
+struct __or__B1, _B2 : conditional1, _B1, _B2::type {};
+template typename... struct __and_;
+template typename struct __not_ : integral_constantbool {};
+template typename struct __is_void_helper : integral_constantbool {};
+template typename struct is_void : __is_void_helperint {};
+template bool, typename _Iftrue, typename struct conditional {
+  typedef _Iftrue type;
+};
+template bool _Cond, typename _Iftrue, typename _Iffalse
+using conditional_t = typename conditional_Cond, _Iftrue, _Iffalse::type;
+template typename... using common_type_t = int;
+template typename, int struct array {};
+template typename _Tp constexpr int is_void_v = is_void_Tp::value;
+template typename _Dest = void, typename... _Types
+constexpr auto make_array()
+- arrayconditional_tis_void_v_Dest, common_type_t, _Dest,
+ sizeof...(_Types) {
+  static_assert(__or___not_is_void_Dest, __and_::value, ); // { dg-error static assert }
+}
+auto d = make_array();

