date:20151123

Re: Devirtualization causing undefined symbol references at link?

2015-11-23 Thread Steven Noonan

On Tue, Nov 17, 2015 at 1:09 AM, Markus Trippelsdorf wrote:
> On 2015.11.16 at 14:18 -0800, Steven Noonan wrote:
>> Hi folks,
>>
>> (I'm not subscribed to the list, so please CC me on all responses.)
>>
>> This is using GCC 5.2 on Linux x86_64. On a project at work I've found
>> that one of our shared libraries refuses to link because of some
>> symbol references it shouldn't be making. If I add "-fno-devirtualize
>> -fno-devirtualize-speculatively" to the compile flags, the issue goes
>> away and everything links/runs fine. The issue does *not* appear on
>> GCC 4.8 (which is used by our current production toolchain).
>>
>> First of all, does anyone have any ideas off the top of their head why
>> devirtualization would break like this?
>>
>> Second, I'm looking for any ideas on how to gather meaningful data to
>> submit a useful bug report for this issue. The best idea I've come up
>> with so far is to preprocess one of the sources with the incorrect
>> references and use 'delta' to reduce it to a minimal preprocessed
>> source file that references one of these incorrect symbols.
>> Unfortunately this is a sluggish process because such a minimal test
>> case would need to compile correctly to an object file -- so "delta"
>> is reducing it very slowly. So far I'm down from 11MB preprocessed
>> source to 1.1MB preprocessed source after running delta a few times.
>
> These undefined references are normally user errors. For example, when
> you define an inline function, you need to link with the symbols it
> uses.
>
> markus@x4 /tmp % cat main.ii
> struct A {
>   void foo();
> };
> struct B {
>   A v;
>   virtual void bar() { v.foo(); }
> };
> struct C {
>   B *w;
>   void Test() {
>     if (!w)
>       return;
>     while (1)
>       w->bar();
>   }
> };
> C a;
> int main() { a.Test(); }
>
> markus@x4 /tmp % g++ -fno-devirtualize -O2 -Wl,--no-undefined main.ii
> markus@x4 /tmp % g++ -O2 -Wl,--no-undefined main.ii
> /tmp/ccEvh2dL.o:main.ii:function B::bar(): error: undefined reference to 'A::foo()'
> /tmp/ccEvh2dL.o:main.ii:function main: error: undefined reference to 'A::foo()'
> collect2: error: ld returned 1 exit status
>
> Instead of using delta you could try creduce instead. It is normally
> much quicker:
>
> https://github.com/csmith-project/creduce
>

creduce did make a much smaller test case, and it's actually sort of
readable. I'm not sure that I selected for the right criteria in my
test script though. It appears to exhibit the negative behavior we're
observing at least.

---
namespace panorama {
class A {
public:
  virtual int *AccessIUIStyle() = 0;
};
class CUIPanel : A {
  int *AccessIUIStyle() { return AccessStyle(); }
  int *AccessStyle() const;
};
class B {
  float GetSplitterPosition();
  A *m_pIUIPanel;
};
}
using namespace panorama;
float B::GetSplitterPosition() {
  m_pIUIPanel->AccessIUIStyle();
  return 0.0f;
}
---

The test case is very different from the code it came from (e.g.
Access*Style functions weren't returning int pointers before, and the
results were actually used at the GetSplitterPosition call site). But
I only selected for the undefined symbols in the 'nm' output. Might
need to refine the test script a bit more, but I would need to really
think about how...

Bad:
$ gcc -O3 -c debugger1.ii; nm debugger1.o
 T _ZN8panorama1B19GetSplitterPositionEv
 W _ZN8panorama8CUIPanel14AccessIUIStyleEv
 U _ZNK8panorama8CUIPanel11AccessStyleEv

Good:
$ gcc -O3 -fno-devirtualize -fno-devirtualize-speculatively -c debugger1.ii; nm debugger1.o
 T _ZN8panorama1B19GetSplitterPositionEv
$ clang -O3 -c debugger1.ii; nm debugger1.o
 T _ZN8panorama1B19GetSplitterPositionEv

$ gcc --version
gcc (GCC) 5.2.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ clang --version
clang version 3.7.0 (tags/RELEASE_370/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix

[PATCH] Don't lower VEC_PERM_EXPR if it can be expanded using vec_shr optab (PR target/68483)

2015-11-23 Thread Jakub Jelinek

Hi!

The patches that removed VEC_RSHIFT_EXPR regressed the first of these
testcases on i?86/-msse2, because can_vec_perm_p returns false for that,
and indeed as can_vec_perm_p is given only the mode and mask indices,
there is nothing it can do about it.  The former VEC_RSHIFT_EXPR
is a special VEC_PERM_EXPR with zero (bitwise, so not -0.0) as second
argument and we can use vec_shr in that case.  The expander knows that, but
veclower hasn't been taught about that, which is what this patch does.

The patch also fixes up the shift_amt_for_vec_perm_mask function: if the
first index is >= nelt, then it is certainly not a vector shift but an
all-zeros result (we should have folded it).  And when first is < nelt,
it doesn't make sense to mask the expected index: even for first == nelt - 1,
first + nelt - 1 is <= 2 * nelt - 1.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5.3?

2015-11-23  Jakub Jelinek  

PR target/68483
* tree-vect-generic.c (lower_vec_perm): If VEC_PERM_EXPR
is valid vec_shr pattern, don't lower it even if can_vec_perm_p
returns false.
* optabs.c (shift_amt_for_vec_perm_mask): Return NULL_RTX
whenever first is nelt or above.  Don't mask expected with
2 * nelt - 1.

* gcc.target/i386/pr68483-1.c: New test.
* gcc.target/i386/pr68483-2.c: New test.

--- gcc/tree-vect-generic.c.jj  2015-11-23 13:29:41.959236201 +0100
+++ gcc/tree-vect-generic.c 2015-11-23 14:13:10.378094173 +0100
@@ -1272,6 +1272,30 @@ lower_vec_perm (gimple_stmt_iterator *gs
  update_stmt (stmt);
  return;
}
+  /* Also detect vec_shr pattern - VEC_PERM_EXPR with zero
+vector as VEC1 and a right element shift MASK.  */
+  if (optab_handler (vec_shr_optab, TYPE_MODE (vect_type))
+ != CODE_FOR_nothing
+ && TREE_CODE (vec1) == VECTOR_CST
+ && initializer_zerop (vec1)
+ && sel_int[0]
+ && sel_int[0] < elements)
+   {
+ for (i = 1; i < elements; ++i)
+   {
+ unsigned int expected = i + sel_int[0];
+ /* Indices into the second vector are all equivalent.  */
+ if (MIN (elements, (unsigned) sel_int[i])
+ != MIN (elements, expected))
+   break;
+   }
+ if (i == elements)
+   {
+ gimple_assign_set_rhs3 (stmt, mask);
+ update_stmt (stmt);
+ return;
+   }
+   }
 }
   else if (can_vec_perm_p (TYPE_MODE (vect_type), true, NULL))
 return;
--- gcc/optabs.c.jj 2015-11-23 13:29:41.706239800 +0100
+++ gcc/optabs.c	2015-11-23 13:33:14.857205132 +0100
@@ -5232,12 +5232,12 @@ shift_amt_for_vec_perm_mask (rtx sel)
 return NULL_RTX;
 
   first = INTVAL (CONST_VECTOR_ELT (sel, 0));
-  if (first >= 2*nelt)
+  if (first >= nelt)
 return NULL_RTX;
   for (i = 1; i < nelt; i++)
 {
   int idx = INTVAL (CONST_VECTOR_ELT (sel, i));
-  unsigned int expected = (i + first) & (2 * nelt - 1);
+  unsigned int expected = i + first;
   /* Indices into the second vector are all equivalent.  */
   if (idx < 0 || (MIN (nelt, (unsigned) idx) != MIN (nelt, expected)))
return NULL_RTX;
--- gcc/testsuite/gcc.target/i386/pr68483-1.c.jj	2015-11-23 14:27:54.213534756 +0100
+++ gcc/testsuite/gcc.target/i386/pr68483-1.c	2015-11-23 14:33:57.810362424 +0100
@@ -0,0 +1,22 @@
+/* PR target/68483 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2 -mno-sse3" } */
+
+void
+test (int *input, int *out, unsigned x1, unsigned x2)
+{
+  unsigned i, j;
+  unsigned end = x1;
+
+  for (i = j = 0; i < 1000; i++)
+{
+  int sum = 0;
+  end += x2;
+  for (; j < end; j++)
+   sum += input[j];
+  out[i] = sum;
+}
+}
+
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(8,|, 8)" { target ia32 } } } */
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" { target ia32 } } } */
--- gcc/testsuite/gcc.target/i386/pr68483-2.c.jj	2015-11-23 14:33:22.436865628 +0100
+++ gcc/testsuite/gcc.target/i386/pr68483-2.c	2015-11-23 14:34:33.716851638 +0100
@@ -0,0 +1,15 @@
+/* PR target/68483 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-sse3" } */
+
+typedef int V __attribute__((vector_size (16)));
+
+void
+foo (V *a, V *b)
+{
+  V c = { 0, 0, 0, 0 };
+  V d = { 1, 2, 3, 4 };
+  *a = __builtin_shuffle (*b, c, d);
+}
+
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" } } */

Jakub

Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-23 Thread Jason Merrill


On 11/23/2015 12:07 PM, Marek Polacek wrote:
> On Mon, Nov 23, 2015 at 05:57:54PM +0100, Jakub Jelinek wrote:
>> On Mon, Nov 23, 2015 at 11:53:40AM -0500, David Malcolm wrote:
>>> Does the following look like the kind of thing you had in mind?  (just
>>> the tree.def part for now).   Presumably usable for both lvalues and
>>> rvalues, where the thing it wraps is what's important.  It merely exists
>>> to add an EXPR_LOCATION, for a usage of the wrapped thing.
>>
>> Yes, but please see with Jason, Richard and perhaps others if they are ok
>> with that too before spending too much time in that direction.
>> All occurrences of it would have to be folded away during the gimplification
>> at latest, this shouldn't be something we use in the middle-end.
>
> I'd expect LOCATION_EXPR be defined in c-family/c-common.def, not tree.def.
> And I'd think it shouldn't survive genericizing, thus never leak into the ME.

Makes sense.

Jason

[Bug tree-optimization/68493] [6 Regression] [graphite] ICE in copy_loop_phi_args

2015-11-23 Thread spop at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68493

Sebastian Pop  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Sebastian Pop  ---
fixed in r230772.

[Bug middle-end/68279] ICE: in create_pw_aff_from_tree, at graphite-sese-to-poly.c:836

2015-11-23 Thread spop at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279

Sebastian Pop  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Sebastian Pop  ---
Fixed in r230771

[Bug target/67808] LRA ICEs on simple double to long double conversion test case

2015-11-23 Thread meissner at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67808

--- Comment #4 from Michael Meissner  ---
Author: meissner
Date: Mon Nov 23 19:25:32 2015
New Revision: 230769

URL: https://gcc.gnu.org/viewcvs?rev=230769&root=gcc&view=rev
Log:
[gcc]
2015-11-23  Michael Meissner  

Backport from mainline
2015-10-05  Michael Meissner  
Peter Bergner  

PR target/67808
* config/rs6000/rs6000.md (extenddftf2): In the expander, only
allow registers, but provide insns for the combiner to create for
loads from memory. Separate VSX code from non-VSX code. For
non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename
extenddftf2_internal to extenddftf2_fprs. Reorder constraints
so that registers come before memory operations. Drop support from
converting DFmode to TFmode, if the DFmode value is in a GPR
register.
(extenddftf2_fprs): Likewise.
(extenddftf2_internal): Likewise.
(extenddftf2_vsx): Likewise.
(extendsftf2): In the expander, only allow registers, but provide
insns for the combiner to create for stores and loads.

[gcc/testsuite]
2015-11-23  Michael Meissner  

2015-10-05  Michael Meissner  
Peter Bergner 

PR target/67808
* gcc.target/powerpc/pr67808.c: New test.


Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/powerpc/pr67808.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/rs6000/rs6000.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/59828] Broken assembly on ppc* with two -mcpu= options

2015-11-23 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59828

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |segher at gcc dot gnu.org
         Resolution|---                         |FIXED
   Target Milestone|5.3                         |6.0

--- Comment #5 from Segher Boessenkool  ---
Fixed; no backport to 5 planned.

[Bug sanitizer/59302] tsan: Unexpected mmap in InternalAllocator!

2015-11-23 Thread Joost.VandeVondele at mat dot ethz.ch

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59302

Joost VandeVondele  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Joost VandeVondele ---
I suppose this is fixed by now, haven't seen it again.

Re: [PATCH] New version of libmpx with new memmove wrapper

2015-11-23 Thread Aleksandra Tsvetkova

gcc/testsuite/ChangeLog
+2015-10-27  Tsvetkova Alexandra  
+
+ * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.

libmpx/ChangeLog
+2015-10-28  Tsvetkova Alexandra  
+
+ * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
+ * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
+ * libmpx/Makefile.in: Regenerate.
+ * mpxrt/Makefile.in: Regenerate.
+ * libmpxwrap/Makefile.in: Regenerate.
+ * mpxrt/libtool-version: New version.
+ * libmpxwrap/libtool-version: Likewise.
+ * mpxrt/libmpx.map: Add new version and a new symbol.
+ * mpxrt/mpxrt.h: New file.
+ * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+(REG_IP_IDX): Moved to mpxrt.h.
+(REX_PREFIX): Moved to mpxrt.h.
+(XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+(MPX_L1_SIZE): Moved to mpxrt.h.
+ * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove
+ to make it faster.
+ New types: mpx_pointer for extraction of indexes from pointer
+   mpx_bt_entry represents a cell in bounds table.
+ New functions: alloc_bt for allocating bounds table
+   get_bt to get address of bounds table
+   copy_if_possible and copy_if_possible_from_end move elements
+   of bounds table if we can
+   move_bounds moves bounds just like memmove


All fixed except for:

>>+static inline void
>>+alloc_bt (void *ptr)
>>+{
>>+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
>>+}
>
>This should be marked as bnd_legacy.

It will not work.

> +void *
> +__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
> +{
> +  if (n == 0)
> +return dst;
> +
> +  __bnd_chk_ptr_bounds (dst, n);
> +  __bnd_chk_ptr_bounds (src, n);
> +
> +  memmove (dst, src, n);
> +  move_bounds (dst, src, n);
> +  return dst;
>  }
>
> You completely remove old algorithm which should be faster on small
> sizes. __mpx_wrapper_memmove should become a dispatcher between old
> and new implementations depending on target (32-bit or 64-bit) and N.
> Since old version performs both data and bounds copy, BD check should
> be moved into __mpx_wrapper_memmove to never call
> it when MPX is disabled.

Even though the old algorithm is faster on small sizes, it should not be used
alongside the new one, because the new one supports unaligned pointers and the
old one does not. Mixing the two behaviors may cause more problems.

Thanks,
Alexandra.
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include 
+#include 
+#include 
+#include 
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt index are bt_indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt index_end is bt index of the last element of src.
+   pointers inside determines if array being copied includes pointers
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+  int bd_index_end, int src_bt_index_end, int pointers_inside,
+  int src_align, int dst_align)
+{
+  const int n =
+src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+{
+  return 0;
+}
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+   1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+   num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += src_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+{
+  for (int i = 0; i < n; i++)
+src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+}
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+{
+

[Bug target/67071] GCC misses an optimization to load vector constants

2015-11-23 Thread dje at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67071

David Edelsohn  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |dje at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #3 from David Edelsohn  ---
Fixed.

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread David Malcolm

On Mon, 2015-11-23 at 10:25 -0700, Jeff Law wrote:
> On 11/23/2015 04:13 AM, Joseph Myers wrote:
> > On Sun, 22 Nov 2015, David Malcolm wrote:
> >
> >> Is there (or could there be) a precanned dg- directive to ask if ObjC is
> >> available?
> >
> > I don't think so.  Normal practice is that each language's tests are in
> > appropriate directories for that language, with runtest never called with
> > a --tool option for that language if it wasn't built.
> Right.  Which argues that we really want to create a new test directory 
> for objc plugin tests.

Attached is a revised version of the patch which creates an
objc.dg/plugin subdirectory, and builds the plugin that way (directly
reusing the plugin src from the gcc.dg subdir).

Successfully bootstrapped on x86_64-pc-linux-gnu; adds 16
PASS results to objc.sum.

OK for trunk?

From f09c48b2ac55b2f9b5c3688e76fb4b91c3325fbb Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Fri, 20 Nov 2015 11:12:47 -0500
Subject: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

gcc/c/ChangeLog:
	PR objc/68438
	* c-parser.c (c_parser_postfix_expression): Set up source ranges
	for various Objective-C constructs: Class.name syntax,
	@selector(), @protocol, @encode(), and [] message syntax.

gcc/testsuite/ChangeLog:
	PR objc/68438
	* objc.dg/plugin/diagnostic-test-expressions-1.m: New test file.
	* objc.dg/plugin/plugin.exp: New file, based on
	gcc.dg/plugin/plugin.exp.
---
 gcc/c/c-parser.c   | 17 +++-
 .../objc.dg/plugin/diagnostic-test-expressions-1.m | 94 ++
 gcc/testsuite/objc.dg/plugin/plugin.exp| 90 +
 3 files changed, 198 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
 create mode 100644 gcc/testsuite/objc.dg/plugin/plugin.exp

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7b10764..18e9957 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7338,10 +7338,13 @@ c_parser_postfix_expression (c_parser *parser)
 		expr.value = error_mark_node;
 		break;
 	  }
-	component = c_parser_peek_token (parser)->value;
+	c_token *component_tok = c_parser_peek_token (parser);
+	component = component_tok->value;
+	location_t end_loc = component_tok->get_finish ();
 	c_parser_consume_token (parser);
 	expr.value = objc_build_class_component_ref (class_name, 
 			 component);
+	set_c_expr_source_range (&expr, loc, end_loc);
 	break;
 	  }
 	default:
@@ -7816,9 +7819,11 @@ c_parser_postfix_expression (c_parser *parser)
 	}
 	  {
 	tree sel = c_parser_objc_selector_arg (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_selector_expr (loc, sel);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_PROTOCOL:
@@ -7839,9 +7844,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  {
 	tree id = c_parser_peek_token (parser)->value;
 	c_parser_consume_token (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_protocol_expr (id);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_ENCODE:
@@ -7860,11 +7867,13 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
 	  break;
 	}
-	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
- "expected %<)%>");
 	  {
+	location_t close_loc = c_parser_peek_token (parser)->location;
+	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+ "expected %<)%>");
 	tree type = groktypename (t1, NULL, NULL);
 	expr.value = objc_build_encode_expr (type);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_GENERIC:
@@ -7907,9 +7916,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_consume_token (parser);
 	  receiver = c_parser_objc_receiver (parser);
 	  args = c_parser_objc_message_args (parser);
+	  location_t close_loc = c_parser_peek_token (parser)->location;
 	  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
  "expected %<]%>");
 	  expr.value = objc_build_message_expr (receiver, args);
+	  set_c_expr_source_range (&expr, loc, close_loc);
 	  break;
 	}
   /* Else fall through to report error.  */
diff --git a/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m b/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
new file mode 100644
index 000..ed7aca3
--- /dev/null
+++ b/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This file is similar to diagnostic-test-expressions-1.c
+   (see the

Re: [PATCH 5/6] Fix parser memory leak in cilk_simd_fn_info

2015-11-23 Thread Jeff Law


On 11/23/2015 06:48 AM, marxin wrote:
> gcc/cp/ChangeLog:
>
> 2015-11-23  Martin Liska
>
> 	* parser.c (cp_parser_late_parsing_cilk_simd_fn_info):
> 	Release tokens.

There's a vec of objects in cilk_simd_fn_info, so unless that vec is
copied elsewhere, we definitely want to release them before we blow away
parser->cilk_simd_fn_info.  AFAICT the vec is never copied elsewhere.  So...

OK for the trunk.

jeff

[Bug c++/55077] implement and enable by default -Wliteral-conversion

2015-11-23 Thread dcb314 at hotmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55077

--- Comment #6 from David Binderman  ---
(In reply to Manuel López-Ibáñez from comment #5)
> Created attachment 33637 [details]
> untested patch
> 
> Untested patch. Bonus points if we show the value before and after
> conversion like clang does.

I tried out the patch by building a test compiler and
then building the current Linux kernel. 

Trivial problem with doubled tokens (&& &&), but the main problem is that the
patch doesn't fix the problem I described of double and float literals
being converted into integral types.

Also, plenty of false positives for integer literals converted into smaller
integral types, for example integer into short or integer into char.

For example, the patch warns for this reasonable code:

 charVariable = 0x80;

As is, I would not use the patch, but maybe with some further work,
it might be suitable for future use.

[Bug go/68496] [libgo] reflect test fails on Linux x86-64

2015-11-23 Thread ian at airs dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68496

--- Comment #1 from Ian Lance Taylor  ---
I can not recreate this problem.  It works fine for me.

The stack trace is incomplete for some reason so I don't know what is going
wrong.

If you cd into x86_64-pc-linux-gnu/libgo, you can run
make GOTESTFLAGS="--keep" reflect/check
Presumably that will fail.  It will leave behind a gotest directory.  In
that directory you will find an a.out executable.  Running that executable runs
the test (you may have to set LD_LIBRARY_PATH so that it finds libgo.so).  Try
running that executable under gdb and see if you can get a better backtrace.

Re: [PATCH] PR c/68473: sanitize source range-printing within certain macro expansions

2015-11-23 Thread David Malcolm

On Mon, 2015-11-23 at 18:59 +0100, Bernd Schmidt wrote:
> On 11/23/2015 06:52 PM, David Malcolm wrote:
> > This patch fixes PR c/68473 by bulletproofing the new
> > diagnostic_show_locus implementation against ranges that finish before
> > they start (which can happen when using the C preprocessor), falling
> > back to simply printing a caret.
> 
> Hmm, wouldn't it be better to avoid such a situation? Can you describe a 
> bit more how exactly the macro expansion caused such a situation?

The issue is here:

 1  /* { dg-options "-fdiagnostics-show-caret -mno-fp-ret-in-387" } */
 2
 3  extern long double fminl (long double __x, long double __y);
 4
 5  #define TEST_EQ(FUNC) do { \
 6    if ((long)FUNC##l(xl,xl) != (long)xl) \
 7      return; \
 8    } while (0)
 9
10  void
11  foo (long double xl)
12  {
13    TEST_EQ (fmin); /* { dg-error "x87 register return with x87 disabled" } */
14  }
15
16  /* { dg-begin-multiline-output "" }
17     TEST_EQ (fmin);
18              ^
19     { dg-end-multiline-output "" } */
20
21  /* { dg-begin-multiline-output "" }
22     if ((long)FUNC##l(xl,xl) != (long)xl) \
23               ^~~~
24     { dg-end-multiline-output "" } */

An error is emitted whilst expanding the macro at line 13, at
input_location.

This is at the expansion of this function call:

   fminl (xl, xl)

Normally we'd emit a source range like this for a function call:

   fminl (xl, xl)
   ^~

However, once we fully resolve locations, the "fmin" part of "fminl"
appears at line 13 here:

13TEST_EQ (fmin);
   ^~~~

giving the location of the caret, and start of the range, whereas the
rest of the call is spelled here:

 6if ((long)FUNC##l(xl,xl) != (long)xl) \
   ~~~

where the close paren gives the end of the range.

It would be wrong to try to print the whole range (anything might be
between lines 6 and 13).

In theory we could attempt to handle this kind of thing by
looking at the macro expansions, and to print something like:

13TEST_EQ (fmin);
   ^~~~
 6if ((long)FUNC##l(xl,xl) != (long)xl) \
  

or whatnot, but that strikes me as error-prone at this stage.


The patch instead detects such a situation, and tries to handle things
gracefully by falling back to simply printing a caret, without any
underlines:

pr68473-1.c: In function ‘foo’:
pr68473-1.c:13:12: error: x87 register return with x87 disabled
   TEST_EQ (fmin);
^

pr68473-1.c:6:13: note: in definition of macro ‘TEST_EQ’
   if ((long)FUNC##l(xl,xl) != (long)xl) \
 ^~~~


Dave

[Bug c++/67550] [5/6 regression] Initialization of local struct array with elements of global array yields zeros instead of initializer values

2015-11-23 Thread jwyatt at feralinteractive dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67550

--- Comment #5 from Jason Wyatt  ---
When parsing the initialisation of const TestStruct var:
  store_init_value ends up calling split_nonconstant_init, so that only the
constant part of the initialisation of var is stored in DECL_INITIAL(t).

Then when parsing the initialisation of the array, maybe_constant_init
eventually calls through to decl_constant_value(t), i.e. constant_value_1(t,
false, true). That then uses DECL_INITIAL(t) as if it were the whole
initialisation, rather than just the constant part, hence all the non constant
parts aren't initialised correctly.

I've got no idea how this is supposed to work though, presumably at some point
in the chain of calls it's supposed to realise this is not a constant?

[Bug go/68496] [libgo] reflect test fails on Linux x86-64

2015-11-23 Thread ismail at i10z dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68496

--- Comment #2 from İsmail Dönmez  ---
(In reply to Ian Lance Taylor from comment #1)
> I can not recreate this problem.  It works fine for me.
> 
> The stack trace is incomplete for some reason so I don't know what is going
> wrong.
> 
> If you cd into x86_64-pc-linux-gnu/libgo, you can run
> make GOTESTFLAGS="--keep" reflect/check
> Presumably that will fail.  It will leave behind a gotest directory.  In
> that directory you will find an a.out executable.  Running that executable
> runs the test (you may have to set LD_LIBRARY_PATH so that it finds
> libgo.so).  Try running that executable under gdb and see if you can get a
> better backtrace.

Try export MALLOC_CHECK_=2 before testing. Backtrace shows an invalid free:

(gdb) bt
#0  0x7f8d36c51d38 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x7f8d36c5318a in __GI_abort () at abort.c:78
#2  0x7f8d36c960b0 in malloc_printerr (action=, str=0x7f8d36d840fa "free(): invalid pointer", ptr=, ar_ptr=) at malloc.c:5004
#3  0x00433298 in reflect.call.N13_reflect.Value (pointer=pointer@entry=0xc2080c56a0, op=..., param=...) at value.go:450
#4  0x004328d4 in reflect.Call.N13_reflect.Value (pointer=pointer@entry=0x7f8d33da0c40, in=...) at value.go:300
#5  0x0044ec3a in reflect_test.TestCallWithStruct (t=) at all_test.go:1492
#6  0x7f8d37e6f71c in testing.tRunner (test=0xc20807a318, param=) at ../../../libgo/go/testing/testing.go:455
#7  testing.$thunk15 (__go_thunk_parameter=) at ../../../libgo/go/testing/testing.go:560
#8  0x7f8d37d8fa9c in kickoff () at ../../../libgo/runtime/proc.c:235
#9  0x7f8d3812a135 in __morestack () at ../../../libgcc/config/i386/morestack.S:544
#10 0x7f8d36c62900 in ?? () from /lib64/libc.so.6
#11 0x in ?? ()

Re: update zlib to 1.2.8

2015-11-23 Thread Joel Brobecker

> In GCC zlib is only used for libjava; for binutils and gdb it is used when
> building without --with-system-zlib.  This just updates zlib from 1.2.7 to
> 1.2.8 (released in 2013).  Applies cleanly, libjava still builds and doesn't
> show any regressions in the testsuite.  Ok to apply (even if we already are
> in stage3)?

> +2015-11-23  Matthias Klose  
> +
> +   * Imported zlib 1.2.8; merged local changes.

Should not be a problem for GDB, since we're not near branching time.

Out of curiosity, what prompted this update? Just to be in sync with
the latest? Or was there an actual bug that you hit which 1.2.8 fixes?

-- 
Joel

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Jan Hubicka

BTW for the LTO type merging issues one could probably just drop those types
and all derivations to alias set 0. But indeed rewriting them to pointers would
be better, especially for ABI compatibility.

The Ada ICE I get is:
Continuing.
+===GNAT BUG DETECTED==+
| 6.0.0 20151122 (experimental) (x86_64-pc-linux-gnu) Assert_Failure atree.adb:6776|
| Error detected at system.ads:107:4   |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.|
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).

../../gcc/ada/system.ads
../../gcc/ada/a-except.adb
../../gcc/ada/a-except.ads
../../gcc/ada/ada.ads
../../gcc/ada/s-parame.ads
../../gcc/ada/s-stalib.ads
../../gcc/ada/a-unccon.ads
../../gcc/ada/s-traent.ads
../../gcc/ada/s-excdeb.ads
../../gcc/ada/s-soflin.ads
../../gcc/ada/s-stache.ads
../../gcc/ada/s-stoele.ads

compilation abandoned

(gdb) bt
#0  atree__unchecked_access__set_flag96.part.697.lto_priv.6676 () at 
../../gcc/ada/atree.adb:6776
#1  0x01711774 in atree__unchecked_access__set_flag96 (n=, val=) at ../../gcc/ada/atree.adb:6774
#2  0x0126a95c in einfo.set_warnings_off (v=, id=0) at 
../../gcc/ada/einfo.adb:6435
#3  sem_prag.analyze_pragma () at ../../gcc/ada/sem_prag.adb:22879
#4  0x00989893 in sem.analyze (n=12466) at ../../gcc/ada/sem.adb:456
#5  0x00cac089 in sem_ch3.analyze_declarations (l=-8775) at 
../../gcc/ada/sem_ch3.adb:2323
#6  0x0134e4d5 in sem_ch7.analyze_package_specification () at 
../../gcc/ada/sem_ch7.adb:1395
#7  0x009898ab in sem.analyze (n=12078) at ../../gcc/ada/sem.adb:450
#8  0x013517d8 in sem_ch7.analyze_package_declaration (n=12875) at 
../../gcc/ada/sem_ch7.adb:1006
#9  0x00989e89 in sem.analyze (n=n@entry=12875) at 
../../gcc/ada/sem.adb:441
#10 0x00998d6d in sem_ch10.analyze_compilation_unit (n=n@entry=12067) 
at ../../gcc/ada/sem_ch10.adb:892
#11 0x00989947 in sem.analyze (n=n@entry=12067) at 
../../gcc/ada/sem.adb:174
#12 0x0099760f in sem.semantics.do_analyze () at 
../../gcc/ada/sem.adb:1337
#13 sem.semantics () at ../../gcc/ada/sem.adb:1517
#14 0x00998039 in sem_ch10.analyze_with_clause (n=n@entry=2286) at 
../../gcc/ada/sem_ch10.adb:2540
#15 0x00989a7f in sem.analyze (n=n@entry=2286) at 
../../gcc/ada/sem.adb:601
#16 0x00991e67 in sem_ch10.analyze_context (n=n@entry=2284) at 
../../gcc/ada/sem_ch10.adb:1371
#17 0x00998cb0 in sem_ch10.analyze_compilation_unit (n=n@entry=2284) at 
../../gcc/ada/sem_ch10.adb:686
#18 0x00989947 in sem.analyze (n=n@entry=2284) at 
../../gcc/ada/sem.adb:174
#19 0x0099760f in sem.semantics.do_analyze () at 
../../gcc/ada/sem.adb:1337
#20 sem.semantics () at ../../gcc/ada/sem.adb:1517
#21 0x0090e5f9 in frontend () at ../../gcc/ada/frontend.adb:408
#22 0x0146de0a in _ada_gnat1drv () at ../../gcc/ada/gnat1drv.adb:1029
#23 0x006f579e in gnat_parse_file() [clone .lto_priv.5151] () at 
../../gcc/ada/gcc-interface/misc.c:121
#24 0x016f723c in compile_file () at ../../gcc/toplev.c:464
#25 0x0068996e in do_compile () at ../../gcc/toplev.c:1951
#26 toplev::main (this=this@entry=0x7fffe850, argc=argc@entry=39, 
argv=argv@entry=0x7fffe958) at ../../gcc/toplev.c:2058
#27 0x00688e29 in main (argc=39, argv=0x7fffe958) at 
../../gcc/main.c:39

If you have any clue how to debug it further, I would be happy to try.
That atree code is a real software engineering treat, BTW.

Honza

[Bug target/36358] -mvrsave / -mno-vrsave ignored

2015-11-23 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36358

Segher Boessenkool  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||segher at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #3 from Segher Boessenkool  ---
This is long fixed.  Thanks for the report.

Re: Devirtualization causing undefined symbol references at link?

2015-11-23 Thread Markus Trippelsdorf

On 2015.11.23 at 11:11 -0800, Steven Noonan wrote:
> On Tue, Nov 17, 2015 at 1:09 AM, Markus Trippelsdorf
>  wrote:
> > On 2015.11.16 at 14:18 -0800, Steven Noonan wrote:
> >> Hi folks,
> >>
> >> (I'm not subscribed to the list, so please CC me on all responses.)
> >>
> >> This is using GCC 5.2 on Linux x86_64. On a project at work I've found
> >> that one of our shared libraries refuses to link because of some
> >> symbol references it shouldn't be making. If I add "-fno-devirtualize
> >> -fno-devirtualize-speculatively" to the compile flags, the issue goes
> >> away and everything links/runs fine. The issue does *not* appear on
> >> GCC 4.8 (which is used by our current production toolchain).
> >>
> >> First of all, does anyone have any ideas off the top of their head why
> >> devirtualization would break like this?
> >>
> >> Second, I'm looking for any ideas on how to gather meaningful data to
> >> submit a useful bug report for this issue. The best idea I've come up
> >> with so far is to preprocess one of the sources with the incorrect
> >> references and use 'delta' to reduce it to a minimal preprocessed
> >> source file that references one of these incorrect symbols.
> >> Unfortunately this is a sluggish process because such a minimal test
> >> case would need to compile correctly to an object file -- so "delta"
> >> is reducing it very slowly. So far I'm down from 11MB preprocessed
> >> source to 1.1MB preprocessed source after running delta a few times.
> >
> > These undefined references are normally user errors. For example, when
> > you define an inline function, you need to link with the symbols it
> > uses.
> >
> > markus@x4 /tmp % cat main.ii
> > struct A {
> >   void foo();
> > };
> > struct B {
> >   A v;
> >   virtual void bar() { v.foo(); }
> > };
> > struct C {
> >   B *w;
> >   void Test() {
> > if (!w)
> >   return;
> > while (1)
> >   w->bar();
> >   }
> > };
> > C a;
> > int main() { a.Test(); }
> >
> > markus@x4 /tmp % g++ -fno-devirtualize -O2 -Wl,--no-undefined main.ii
> > markus@x4 /tmp % g++ -O2 -Wl,--no-undefined main.ii
> > /tmp/ccEvh2dL.o:main.ii:function B::bar(): error: undefined reference to 
> > 'A::foo()'
> > /tmp/ccEvh2dL.o:main.ii:function main: error: undefined reference to 
> > 'A::foo()'
> > collect2: error: ld returned 1 exit status
> >
> > Instead of using delta you could try creduce instead. It is normally
> > much quicker:
> >
> > https://github.com/csmith-project/creduce
> >
> 
> creduce did make a much smaller test case, and it's actually sort of
> readable. I'm not sure that I selected for the right criteria in my
> test script though. It appears to exhibit the negative behavior we're
> observing at least.
> 
> ---
> namespace panorama {
> class A {
> public:
>   virtual int *AccessIUIStyle() = 0;
> };
> class CUIPanel : A {
>   int *AccessIUIStyle() { return AccessStyle(); }
>   int *AccessStyle() const;
> };
> class B {
>   float GetSplitterPosition();
>   A *m_pIUIPanel;
> };
> }
> using namespace panorama;
> float B::GetSplitterPosition() {
>   m_pIUIPanel->AccessIUIStyle();
>   return 0.0f;
> }

Yes. It is the same issue that I've pointed out in my example above.
You need to either link with the object file that provides the
_ZNK8panorama8CUIPanel11AccessStyleEv symbol, or move the definition of
panorama::CUIPanel::AccessIUIStyle() to the file that defines
panorama::CUIPanel::AccessStyle().
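
For illustration only, here is a minimal sketch of the second option, reusing the names from the reduced test case. It assumes one translation unit plays the role of "the file that defines AccessStyle()"; the `g_style` variable and the `CallThroughBase` helper are made up for the example and are not part of the original code.

```cpp
#include <cassert>

namespace panorama {
class A {
public:
  virtual ~A() {}
  virtual int *AccessIUIStyle() = 0;
};

class CUIPanel : public A {
public:
  // Declared only. There is no inline body here, so including this class
  // definition does not drag a reference to AccessStyle() into every
  // translation unit that uses the class.
  int *AccessIUIStyle();
  int *AccessStyle() const;
};

// Hypothetical stand-in for whatever the real style data is.
static int g_style = 42;

// Both definitions live in the same file, so any object file that provides
// AccessIUIStyle() also provides AccessStyle().
int *CUIPanel::AccessStyle() const { return &g_style; }
int *CUIPanel::AccessIUIStyle() { return AccessStyle(); }

// Hypothetical caller, playing the role of B::GetSplitterPosition().
int CallThroughBase(A *panel) { return *panel->AccessIUIStyle(); }
}  // namespace panorama
```

With this layout the only place that mentions AccessStyle() is the object file that also defines it, which is the property the advice above is after.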

-- 
Markus

[Bug middle-end/68314] [6 Regression] Invalid read in build_pbb_minimal_scattering_polyhedrons (graphite-sese-to-poly.c:148)

2015-11-23 Thread spop at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68314

--- Comment #2 from Sebastian Pop  ---
This patch exposes the problem without valgrind:

diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 2054fad..b932dae 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -143,6 +143,9 @@ build_pbb_minimal_scattering_polyhedrons (isl_aff
*static_sched, poly_bb_p pbb,
  /* False for loop dimension.  */
  sequence_and_loop_dims[i + j] = false;
}
+
+  gcc_assert (nb_sequence_dim > j);
+
   /* Fake loops make things shifted by one.  */
   if (sequence_dims && sequence_dims[j] == i)
sequence_and_loop_dims[i + j] = true;

[Bug c++/68476] microblaze: compilation of btSoftBody.cpp doesn't terminate with optimisation

2015-11-23 Thread arnout at mind dot be

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68476

--- Comment #2 from Arnout Vandecappelle  ---
Created attachment 36813
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36813&action=edit
Preprocessed source file that exposes the bug

Attached preprocessed source file.

Compilation output (with gcc 4.9.3; I also tried with 5.2.0 with the same
result):

$
../../../../host/opt/ext-toolchain/bin/microblazeel-buildroot-linux-gnu-g++.br_real
  -O1 -g -c -v btSoftBody.ii
Using built-in specs.
COLLECT_GCC=../../../../host/opt/ext-toolchain/bin/microblazeel-buildroot-linux-gnu-g++.br_real
Target: microblazeel-buildroot-linux-gnu
Configured with: ./configure
--prefix=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr
--sysconfdir=/opt/br-microblaze-full-2015.08-647-gc356fb2/etc --enable-static
--target=microblazeel-buildroot-linux-gnu
--with-sysroot=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr/microblazeel-buildroot-linux-gnu/sysroot
--disable-__cxa_atexit --with-gnu-ld --disable-libssp --disable-multilib
--with-gmp=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr
--with-mpfr=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr
--with-pkgversion='Buildroot 2015.11-git-00647-gc356fb2'
--with-bugurl=http://bugs.buildroot.net/ --disable-libquadmath --enable-tls
--disable-libmudflap --enable-threads
--with-mpc=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr --without-isl
--without-cloog --disable-decimal-float --enable-languages=c,c++
--with-build-time-tools=/opt/br-microblaze-full-2015.08-647-gc356fb2/usr/microblazeel-buildroot-linux-gnu/bin
--enable-shared --disable-libgomp
Thread model: posix
gcc version 4.9.3 (Buildroot 2015.11-git-00647-gc356fb2) 
COLLECT_GCC_OPTIONS='-O1' '-g' '-c' '-v' '-shared-libgcc'

/gentoo/home2/arnout/br-out/bullet/host/opt/ext-toolchain/bin/../libexec/gcc/microblazeel-buildroot-linux-gnu/4.9.3/cc1plus
-fpreprocessed btSoftBody.ii -quiet -dumpbase btSoftBody.ii -auxbase btSoftBody
-g -O1 -version -o /tmp/ccHn1vcs.s
GNU C++ (Buildroot 2015.11-git-00647-gc356fb2) version 4.9.3
(microblazeel-buildroot-linux-gnu)
compiled by GNU C version 4.4.5, GMP version 6.0.0, MPFR version 3.1.3,
MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++ (Buildroot 2015.11-git-00647-gc356fb2) version 4.9.3
(microblazeel-buildroot-linux-gnu)
compiled by GNU C version 4.4.5, GMP version 6.0.0, MPFR version 3.1.3,
MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: a3940dfdf166fdf4d5b4711c8bdc0b77

[RFA] [PATCH] Fix invalid redundant extension elimination for rl78 port

2015-11-23 Thread Jeff Law



The core analysis was from Nick.  Essentially:


(insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn  45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
[...]
(insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn  88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))

  (This is on the RL78 target where HImode values occupy two hard
  registers and QImode values only one.  The bug however is generic, not
  RL78 specific).

  The REE pass transforms this into:

(insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn  45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18))))
[...]
(insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn  88 deleted)

  Note how the new set at insn 45 clobbers the value loaded by insn 44
  into r11.  Thus when we use the value in insn 71, we're using the
  wrong value.
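
  To make the overlap concrete, here is a small sketch (not GCC code) that models hard-register occupancy on an RL78-like target. The helper names `hard_regno_nregs`, `occupies` and `must_punt` are hypothetical, and the real HARD_REGNO_NREGS macro also takes the register number; `must_punt` is only modelled on the condition in the ree.c hunk of this message.

```cpp
#include <cassert>

// Hypothetical simplification of HARD_REGNO_NREGS for a target with 8-bit
// hard registers: QImode needs one register, HImode needs two.
enum Mode { QImode, HImode };
int hard_regno_nregs(Mode m) { return m == QImode ? 1 : 2; }

// True if a value of mode `m` held in hard register `regno` also occupies
// hard register `other`.
bool occupies(int regno, Mode m, int other) {
  return other >= regno && other < regno + hard_regno_nregs(m);
}

// Modelled on the new check: give up when widening changes the number of
// hard registers and the source and destination of the extension are the
// same hard register.
bool must_punt(int dest_regno, Mode dest_mode, int src_regno, Mode src_mode) {
  return hard_regno_nregs(dest_mode) != hard_regno_nregs(src_mode)
         && dest_regno == src_regno;
}
```

  In these terms, `occupies(10, HImode, 11)` is true, which is why the widened set in insn 45 overwrites r11, and `must_punt(10, HImode, 10, QImode)` describes insn 88.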


Nick had a more complex patch which tried to determine if the additional 
hard registers were used/set.  But the implementation was flawed in that 
it assumed the use succeeded the def in the linear insn chain, which is 
an invalid assumption in general.  For this to work what we'd really 
have to do is note all the blocks through which there's a path from the 
def to the use, then check for uses/sets within all those blocks.


Given this scenario is quite rare, it doesn't seem worth the effort. 
Even with an abort in the codepath, I can't get it to trigger during 
normal x86_64 or rl78 builds.  It only triggers on the rl78 with -O1 -free.


As I mentioned in a prior message on the subject, this is only a problem 
when the source/dest of the extension are the same.  When the 
source/dest of the extension are different, we only optimize when the 
original set and extension are in the same block and we verify that all 
affected registers are not set/used between the original set and the 
extension.

Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested 
execute.exp on rl78 with no regressions.


I didn't include a distinct testcase as these are covered by pr42833 and 
strct-stdarg-1.c -- but only when those are run with -O1 -free.  I can 
certainly add a -free test for those tests if folks want.


I took this opportunity to also remove a block of #if 0'd code that I 
had in place for this situation, but had been unable to trigger.  I 
prefer Nick's location for the test.


Ok for the trunk?



Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e560746..29ed4e4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2015-11-18  Nick Clifton  
+   Jeff Law  
+
+   * ree.c (add_removable_extension): Avoid mis-optimizing cases where
+   the source/dest of the target extension require a different number of
+   hard registers.
+   (combine_set_extension): Remove #if 0 code.
+
 2015-11-20  Jim Wilson  
 
* tree-vect-data-refs.c (compare_tree): Call STRIP_NOPS.
diff --git a/gcc/ree.c b/gcc/ree.c
index b8436f2..f3b79e0 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -332,16 +332,6 @@ combine_set_extension (ext_cand *cand, rtx_insn 
*curr_insn, rtx *orig_set)
   else
 new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
-#if 0
-  /* Rethinking test.  Temporarily disabled.  */
-  /* We're going to be widening the result of DEF_INSN, ensure that doing so
- doesn't change the number of hard registers needed for the result.  */
-  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
-  != HARD_REGNO_NREGS (REGNO (SET_DEST (*orig_set)),
-  GET_MODE (SET_DEST (*orig_set
-   return false;
-#endif
-
   /* Merge constants by directly moving the constant into the register under
  some conditions.  Recall that RTL constants are sign-extended.  */
   if (GET_CODE (orig_src) == CONST_INT
@@ -1080,6 +1070,18 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
  }
  }
 
+  /* Fourth, if the extended version occupies more registers than the
+original and the source of the extension is the same hard register
+as the destination of the extension, then we can not eliminate
+the extension without deep analysis, so just punt.
+
+We allow this when the registers are different because the
+code in combine_reaching_defs will handle that case correctly.  */
+  if ((HARD_REGNO_NREGS (REGNO (dest), mode)
+  != HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
+ && REGNO (dest) == REGNO (reg))
+   return;
+
   /* Then add the candidate to the list and insert the reaching definitions
  into the definition map.  */
   ext_cand e = {expr, code, mode, insn};

[ptx] Fix sso tests

2015-11-23 Thread Nathan Sidwell

The gcc.dg/sso tests gratuitously fail on PTX because they use IO facilities 
that don't exist there.  This patch changes the dumping to use the putchar 
function call (and not a macro), and not use fputs.


With this they all pass.
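
As a rough illustration of the idea (written in C++ rather than the C of dump.h, and not taken from the patch), the sketch below emits a string one character at a time through a function pointer and names the putchar function in parentheses, which would keep a function-like macro of that name from expanding. The helpers `put_chars` and `capture` are hypothetical and exist only so the behaviour can be checked without real output.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Emit a NUL-terminated string one character at a time through `emit`,
// returning the number of characters written.
int put_chars(const char *s, int (*emit)(int)) {
  int i;
  for (i = 0; s[i]; i++)
    emit(s[i]);
  return i;
}

// The parenthesized name is not followed by '(', so a function-like macro
// with the same name would not expand here; the real function is used.
int put(const char *s) { return put_chars(s, (std::putchar)); }

// Test double that records characters instead of printing them.
static std::string captured;
int capture(int c) {
  captured.push_back(static_cast<char>(c));
  return c;
}
```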

I'm not quite sure where the maintainer boundaries lie for this kind of fix. 
Any objections?


nathan
2015-11-23  Nathan Sidwell  

	* gcc.dg/sso/dump.h: Force IO to be putchar function call on nvptx.

Index: gcc/testsuite/gcc.dg/sso/dump.h
===
--- gcc/testsuite/gcc.dg/sso/dump.h	(revision 230718)
+++ gcc/testsuite/gcc.dg/sso/dump.h	(working copy)
@@ -1,3 +1,9 @@
+#ifdef __nvptx__
+/* Force function call.  NVPTX's IO is extremely limited.  */
+#undef putchar
+#define putchar (putchar)
+#endif
+
 void dump (void *p, unsigned int len)
 {
   const char digits[17] = "0123456789abcdef";
@@ -14,7 +20,13 @@ void dump (void *p, unsigned int len)
 
 void put (const char s[])
 {
+#ifdef  __nvptx__
+  int i;
+  for (i = 0; s[i]; i++)
+putchar (s[i]);
+#else
   fputs (s, stdout);
+#endif
 }
 
 void new_line (void)

[Bug target/33236] -mminimal-toc register should be psedu-register

2015-11-23 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33236

Segher Boessenkool  changed:

   What|Removed |Added

   Last reconfirmed|2007-08-30 21:15:01 |2015-11-23
 CC||segher at gcc dot gnu.org

--- Comment #4 from Segher Boessenkool  ---
Still happens.

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-23 Thread jvdelisle at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #24 from Jerry DeLisle  ---
(In reply to Jerry DeLisle from comment #16)
> For what its worth:
> 
> $ gfc pr51119.f90 -lblas -fno-external-blas -Ofast -march=native 
> $ ./a.out 
>  Time, MATMUL:21.2483196   21.25444964601 1.5055670945599979
> 
>  Time, dgemm:33.2441711   33.24308728902  .96260614189671445
> 

Running a sample matrix multiply program on this same platform using the
default OpenCL (Mesa on Fedora 22) the machine is achieving:

64 x 64  2.76 Gflops
1000 x 1000  14.10
2000 x 2000  24.4

Re: [PATCH 1/6] Fix memory leak in cilk

2015-11-23 Thread Trevor Saunders

> diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
> index e75e20c..1167b2b 100644
> --- a/gcc/c-family/cilk.c
> +++ b/gcc/c-family/cilk.c
> @@ -844,6 +844,7 @@ gimplify_cilk_spawn (tree *spawn_p)
>   call2, build_empty_stmt (EXPR_LOCATION (call1)));
>append_to_statement_list (spawn_expr, spawn_p);
>  
> +  free (arg_array);

seems like arg_array could just be made an auto_vec, but I guess this is
fine for now and someone can hopefully remember to clean that up later.
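
To illustrate the difference being suggested, here is a sketch outside GCC that uses std::vector as a stand-in for auto_vec (an assumption made for the example; the function names are hypothetical too). The manual version needs an explicit free on every exit path, while the scoped version releases its storage when the object goes out of scope.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Manual style: every exit path has to remember the free().
int sum_manual(int n) {
  int *args = static_cast<int *>(std::malloc(sizeof(int) * n));
  if (!args)
    return -1;
  int total = 0;
  for (int i = 0; i < n; i++) {
    args[i] = i;
    total += args[i];
  }
  std::free(args);  // forgetting this line is the leak
  return total;
}

// Scoped style: std::vector stands in for auto_vec here; the storage is
// released automatically when `args` goes out of scope.
int sum_scoped(int n) {
  std::vector<int> args(n);
  int total = 0;
  for (int i = 0; i < n; i++) {
    args[i] = i;
    total += args[i];
  }
  return total;
}
```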

Trev

[Bug rtl-optimization/68194] [6 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu

2015-11-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68194

--- Comment #7 from Jeffrey A. Law  ---
*** Bug 68328 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/68328] [4.9/5/6 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu

2015-11-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68328

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||law at redhat dot com
 Resolution|--- |DUPLICATE

--- Comment #10 from Jeffrey A. Law  ---
Same issue as 68194.

*** This bug has been marked as a duplicate of bug 68194 ***

[Bug rtl-optimization/68185] [6 Regression] wrong code at -O3 on x86_64-linux-gnu (in 64-bit mode)

2015-11-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68185

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||law at redhat dot com
 Resolution|--- |DUPLICATE

--- Comment #4 from Jeffrey A. Law  ---
Same issue as 68194.

*** This bug has been marked as a duplicate of bug 68194 ***

Re: update zlib to 1.2.8

2015-11-23 Thread Matthias Klose


On 23.11.2015 19:13, Joel Brobecker wrote:
>> In GCC zlib is only used for libjava; for binutils and gdb it is used when
>> building without --with-system-zlib.  This just updates zlib from 1.2.7 to
>> 1.2.8 (released in 2013).  Applies cleanly, libjava still builds and doesn't
>> show any regressions in the testsuite.  Ok to apply (even if we already are
>> in stage3)?
>
>> +2015-11-23  Matthias Klose  
>> +
>> +   * Imported zlib 1.2.8; merged local changes.
>
> Should not be a problem for GDB, since we're not near branching time.
>
> Out of curiosity, what prompted this update? Just to be in sync with
> the latest? Or was there an actual bug that you hit which 1.2.8 fixes?

No, just a packaging issue with somebody mentioning a static binutils build. 
That's when I saw the outdated version.

Now updated in the GCC VCS.

Matthias

[Bug rtl-optimization/68194] [6 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu

2015-11-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68194

--- Comment #8 from Jeffrey A. Law  ---
*** Bug 68185 has been marked as a duplicate of this bug. ***

Re: basic asm and memory clobbers

2015-11-23 Thread Jeff Law


On 11/23/2015 03:04 AM, Andrew Haley wrote:
> On 21/11/15 12:56, David Wohlferd wrote:
>> So, what now?
>>
>> While I'd like to take the big step and start kicking out warnings for
>> non-top-level right now, that may be too bold for phase 3.  A more
>> modest step for v6 would just provide a way to find them (maybe
>> something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the
>> current behavior as well as the upcoming change.
>
> Warnings would be good.
>
> My warning still holds: there are modes of compilation on some
> machines where you can't clobber all registers without causing reload
> failures.  This is why Jeff didn't fix this in 1999.  So, if we really
> do want to clobber "all" registers in basic asm it'll take a lot of
> work.

Exactly.  In retrospect, I probably should have generated more tests for 
those conditions back in '99.  Essentially they'd document a class of 
problems we'd like to fix over time.


I know some have been addressed in various forms, but it hasn't been 
systematic.


My recommendation here is to:

  1. Note in the docs what the behaviour should be.  This guides where
  we want to go from an implementation standpoint.  I think it'd be fine
  to *suggest* only using old style asms at the toplevel, but I'm less
  convinced that mandating that restriction is wise.

  2. As we come across failures to adhere to the desired behaviour,
  fix or document them as known inconsistencies.  If we find that some
  are inherently un-fixable, then we'll need to tighten the docs around
  those.


The more I think about it, I'm just not keen on forcing all those 
old-style asms to change.


jeff

Re: [PATCH 1/2] Libsanitizer merge from upstream r253555.

2015-11-23 Thread Jakub Jelinek

On Mon, Nov 23, 2015 at 10:46:33AM +0300, Maxim Ostapenko wrote:
> Index: libsanitizer/configure.ac
> ===
> --- libsanitizer/configure.ac (revision 230597)
> +++ libsanitizer/configure.ac (working copy)
> @@ -136,6 +136,12 @@
>  esac
>  AM_CONDITIONAL(USING_MAC_INTERPOSE, $MAC_INTERPOSE)
>  
> +case "$target" in
> +  aarch64-*-linux*) tsan_aarch64=true ;;
> +  *) tsan_aarch64=false ;;
> +esac
> +AM_CONDITIONAL(TSAN_AARCH64, $tsan_aarch64)
> +

I don't understand the purpose of the above.

> Index: libsanitizer/configure.tgt
> ===
> --- libsanitizer/configure.tgt(revision 230597)
> +++ libsanitizer/configure.tgt(working copy)
> @@ -37,6 +37,8 @@
>aarch64*-*-linux*)
>   if test x$ac_cv_sizeof_void_p = x8; then
>   TSAN_SUPPORTED=yes
> + LSAN_SUPPORTED=yes
> + TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_aarch64.lo
>   fi
>   ;;
>x86_64-*-darwin[1]* | i?86-*-darwin[1]*)

You already have this.

> Index: libsanitizer/tsan/Makefile.am
> ===
> --- libsanitizer/tsan/Makefile.am (revision 230597)
> +++ libsanitizer/tsan/Makefile.am (working copy)
> @@ -21,6 +21,8 @@
>   tsan_interface_atomic.cc \
>   tsan_interface.cc \
>   tsan_interface_java.cc \
> + tsan_libdispatch_mac.cc \
> + tsan_malloc_mac.cc \
>   tsan_md5.cc \
>   tsan_mman.cc \
>   tsan_mutex.cc \
> @@ -28,6 +30,7 @@
>   tsan_new_delete.cc \
>   tsan_platform_linux.cc \
>   tsan_platform_mac.cc \
> + tsan_platform_posix.cc \
>   tsan_platform_windows.cc \
>   tsan_report.cc \
>   tsan_rtl.cc \
> @@ -41,7 +44,11 @@
>   tsan_sync.cc 
>  
>  libtsan_la_SOURCES = $(tsan_files)
> +if TSAN_AARCH64
> +EXTRA_libtsan_la_SOURCES = tsan_rtl_aarch64.S
> +else
>  EXTRA_libtsan_la_SOURCES = tsan_rtl_amd64.S
> +endif

And if I understand automake manual, you can list in there both
EXTRA_libtsan_la_SOURCES = tsan_rtl_amd64.S tsan_rtl_aarch64.S
unconditionally, and what object actually gets linked in is picked from the
$(TSAN_TARGET_DEPENDENT_OBJECTS) (and similarly dependencies).

Otherwise LGTM.

Jakub

[Bug c++/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread glisse at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #3 from Marc Glisse  ---
(In reply to Vladimir Sedach from comment #2)
> It is not just about "long long".

It isn't about long long at all, it is about whether your code is valid. In
your latest example, you are casting an int* to a float*, that's pretty much
the definition of an aliasing violation.

The types __m128 etc are documented as allowing aliasing, but I don't think
that extends to other operands of the intrinsics.
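
As a general illustration (not taken from the bug report), one common way to reinterpret bits without casting between unrelated pointer types is a byte copy. The sketch below assumes a 32-bit IEEE 754 float, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 32-bit integer as a float without casting
// pointers between unrelated types. Assumes float is 32 bits wide.
float bits_to_float(std::uint32_t bits) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "float must be 32 bits");
  float f;
  std::memcpy(&f, &bits, sizeof f);  // byte copy: no aliasing violation
  return f;
}

// The pattern this reply warns about, shown only for contrast:
//   float f = *(float *)&bits;  // reads a uint32_t object as a float
```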

Re: [Patch] S/390: Fix symbol ref alignment

2015-11-23 Thread Andreas Krebbel

On 10/23/2015 02:12 PM, Robin Dapp wrote:
> gcc/testsuite/ChangeLog:
> 
> 2015-10-23  Robin Dapp  
> 
> * gcc.target/s390/load-relative-check.c: New test to check
> generation of load relative instructions.
> 
> 
> gcc/ChangeLog:
> 
> 2015-10-23  Robin Dapp  
> 
> * config/s390/s390.h: Add new symref flags, _NOTALIGN2 etc.
> * config/s390/s390.c (s390_check_symref_alignment): Use new
> symref flags, early abort on wrong alignment
> (s390_secondary_reload): Use new symref flags.
> (s390_encode_section_info): Likewise.
> * config/s390/predicates.md: Likewise.

Applied. Thanks!

-Andreas-

[Bug libstdc++/68479] Dynamic loading multiple shared libraries with identical static libstdc++ breaks streams

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68479

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-11-23
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  I don't think this is intended to work when you re-export the
libstdc++ symbols.  Not sure if there is an easy way to make them hidden
with --begin-group/--end-group and a linker flag.  So you'd need to provide
a version script to the link of x.so and y.so.

[Bug c++/68476] microblaze: compilation of btSoftBody.cpp doesn't terminate with optimisation

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68476

Richard Biener  changed:

   What|Removed |Added

 Target||microblaze
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2015-11-23
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Waiting for at least a preprocessed source file and the compile output with -v
appended.

[Bug libfortran/51119] MATMUL slow for large matrices

2015-11-23 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #21 from Thomas Koenig  ---

> Hidden behind a -fexternal-blas-n switch might be an option. Including GPUs
> seems even a tad more tricky. We have a paper on GPU (small) matrix
> multiplication, http://dbcsr.cp2k.org/_media/gpu_book_chapter_submitted.pdf

Quite interesting what can be done with GPUs...

> . BTW, another interesting project is the libxsmm library more aimed at
> small (<128) matrices see : https://github.com/hfp/libxsmm . Not sure if
> this info is useful in this context, but it might provide inspiration.

I assume that for small matrices bordering on the silly
(say, a matrix multiplication with dimensions of (1,2) and (2,1))
the inline code will be faster if the code is compiled with the
right options, due to function call overhead.  I also assume that
libxsmm will become faster quite soon for bigger sizes.

Do you have an idea where the crossover is?

Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when copmputing VF

2015-11-23 Thread Richard Biener

On Fri, Nov 20, 2015 at 4:10 PM, Ilya Enkovich  wrote:
> On 20 Nov 14:31, Ilya Enkovich wrote:
>> 2015-11-20 14:28 GMT+03:00 Richard Biener :
>> > On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich  
>> > wrote:
>> >> 2015-11-18 16:44 GMT+03:00 Richard Biener :
>> >>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich  
>> >>> wrote:
>> >>>> Hi,
>> >>>>
>> >>>> When we compute vectypes we skip non-relevant phi nodes.  But we 
>> >>>> process non-relevant alive statements and thus may need vectype of 
>> >>>> non-relevant live phi node to compute mask vectype.  This patch enables 
>> >>>> vectype computation for live phi nodes.  Bootstrapped and regtested on 
>> >>>> x86_64-unknown-linux-gnu.  OK for trunk?
>> >>>
>> >>> Hmm.  What breaks if you instead skip all !relevant stmts and not
>> >>> compute vectype for live but not relevant ones?  We won't ever
>> >>> "vectorize" !relevant ones, that is, we don't need their vector type.
>> >>
>> >> I tried it and got regression in SLP.  It expected non-null vectype
>> >> for non-relevant but live statement. Regression was in
>> >> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90
>> >
>> > Because somebody put a vector type check before
>> >
>> >   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>> > return false;
>> >
>> > @@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
>> >tree mask_type;
>> >tree mask;
>> >
>> > +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>> > +return false;
>> > +
>> >if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>> >  return false;
>> >
>> > @@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
>> >  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>> >
>> >gcc_assert (ncopies >= 1);
>> > -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>> > -return false;
>> >
>> >if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
>> >&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>> >
>> > fixes this particular fallout for me.
>>
>> I'll try it.
>
> With this fix it works fine, thanks!  Bootstrapped and regtested on 
> x86_64-unknown-linux-gnu.  OK for trunk?

Ok.

Thanks,
Richard.

> Ilya
> --
> gcc/
>
> 2015-11-20  Ilya Enkovich  
> Richard Biener  
>
> * tree-vect-loop.c (vect_determine_vectorization_factor): Don't
> compute vectype for non-relevant mask producers.
> * gcc/tree-vect-stmts.c (vectorizable_comparison): Check stmt
> relevance earlier.
>
> gcc/testsuite/
>
> 2015-11-20  Ilya Enkovich  
>
> * gcc.dg/pr68327.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/pr68327.c b/gcc/testsuite/gcc.dg/pr68327.c
> new file mode 100644
> index 0000000..c3e6a94
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr68327.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +int a, d;
> +char b, c;
> +
> +void
> +fn1 ()
> +{
> +  int i = 0;
> +  for (; i < 1; i++)
> +d = 1;
> +  for (; b; b++)
> +a = 1 && (d & b);
> +}
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 80937ec..592372d 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -439,7 +439,8 @@ vect_determine_vectorization_factor (loop_vec_info 
> loop_vinfo)
>  compute a factor.  */
>   if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
> {
> - mask_producers.safe_push (stmt_info);
> + if (STMT_VINFO_RELEVANT_P (stmt_info))
> +   mask_producers.safe_push (stmt_info);
>   bool_result = true;
>
>   if (gimple_code (stmt) == GIMPLE_ASSIGN
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 0f64aaf..3723b26 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -7546,6 +7546,9 @@ vectorizable_comparison (gimple *stmt, 
> gimple_stmt_iterator *gsi,
>tree mask_type;
>tree mask;
>
> +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> +return false;
> +
>if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>  return false;
>
> @@ -7558,9 +7561,6 @@ vectorizable_comparison (gimple *stmt, 
> gimple_stmt_iterator *gsi,
>  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>
>gcc_assert (ncopies >= 1);
> -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> -return false;
> -
>if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
>&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>&& reduc_def))

Re: update zlib to 1.2.8

2015-11-23 Thread Andrew Haley

On 23/11/15 04:37, Matthias Klose wrote:
> In GCC zlib is only used for libjava; for binutils and gdb it is used when 
> building without --with-system-zlib.  This just updates zlib from 1.2.7 to 
> 1.2.8 
> (released in 2013).  Applies cleanly, libjava still builds and doesn't show 
> any 
> regressions in the testsuite.  Ok to apply (even if we already are in stage3)?

Fine by me; GDB assent is more important.

Andrew.

Re: basic asm and memory clobbers

2015-11-23 Thread Joseph Myers

Note that basic asm is part of the standard C++ syntax.  "An asm 
declaration has the form
asm-definition:
asm ( string-literal ) ;
The asm declaration is conditionally-supported; its meaning is 
implementation-defined. [ Note: Typically
it is used to pass information through the implementation to an assembler. 
— end note ]" (7.4 [dcl.asm]).

-- 
Joseph S. Myers
jos...@codesourcery.com

[Bug tree-optimization/68327] [6 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vect_is_simple_use, at tree-vect-stmts.c:8562

2015-11-23 Thread ienkovich at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68327

Ilya Enkovich  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Ilya Enkovich  ---
Fixed

[Bug target/68483] [5/6 Regression] gcc 5.2: suboptimal code compared to 4.9

2015-11-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68483

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
On i?86 this regressed with r217509, aka part of VEC_RSHIFT_EXPR removal.
Guess we'll need to have a look at the i?86 vec perm handling.

[Bug target/68483] [5/6 Regression] gcc 5.2: suboptimal code compared to 4.9

2015-11-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68483

--- Comment #4 from Jakub Jelinek  ---
Ah, no, the problem is not on the backend side, but during veclower2 pass.
Before that pass, after the replacement of the v>>64 or v>>32 shifts, we have:
  vect_sum_15.12_58 = VEC_PERM_EXPR ;
  vect_sum_15.12_59 = vect_sum_15.12_58 + vect_sum_15.10_57;
  vect_sum_15.12_60 = VEC_PERM_EXPR ;
  vect_sum_15.12_61 = vect_sum_15.12_60 + vect_sum_15.12_59;
but veclower2 for some reason decides to lower the latter VEC_PERM_EXPR into:
  _32 = BIT_FIELD_REF ;
  _17 = BIT_FIELD_REF ;
  _23 = BIT_FIELD_REF ;
  vect_sum_15.12_60 = {_32, _17, _23, 0};
The first VEC_PERM_EXPR is kept and generates efficient code.  If I manually
disable in the debugger the lowering, the code regression is gone.
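
For readers unfamiliar with the pattern, the reduction epilogue discussed above can be modelled in plain C. This is only a scalar sketch, assuming a 4-lane vector; `shift_in_zeros` and `reduce_add` are hypothetical names, not anything the vectorizer emits. Each step permutes the vector against a zero vector (the whole-vector shift) and adds the result back, so lane 0 ends up holding the sum.

```c
#include <stddef.h>

#define NLANES 4

/* Hypothetical helper: models a permute of V with a zero vector that
   moves lane I+COUNT into lane I and fills the vacated lanes with 0.  */
static void
shift_in_zeros (const unsigned v[NLANES], size_t count, unsigned out[NLANES])
{
  for (size_t i = 0; i < NLANES; i++)
    out[i] = (i + count < NLANES) ? v[i + count] : 0u;
}

/* Log-step reduction: shift by 2 lanes and add, then by 1 lane and add.  */
static unsigned
reduce_add (const unsigned v[NLANES])
{
  unsigned acc[NLANES], tmp[NLANES];

  for (size_t i = 0; i < NLANES; i++)
    acc[i] = v[i];

  for (size_t count = NLANES / 2; count > 0; count /= 2)
    {
      shift_in_zeros (acc, count, tmp);
      for (size_t i = 0; i < NLANES; i++)
        acc[i] += tmp[i];
    }
  return acc[0];
}
```

The one-lane shift in this model produces {v1, v2, v3, 0}, which has the same shape as the element-wise constructor that veclower2 generates for the second permute.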

[Bug c/63303] Pointer subtraction is broken when using -fsanitize=undefined

2015-11-23 Thread fw at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63303

--- Comment #14 from Florian Weimer  ---
(In reply to Szabolcs Nagy from comment #13)
> if gcc treats p-q as (ssize_t)p-(ssize_t)q and makes
> optimization decisions based on signed int range then
> that's broken and leads to wrong code gen.

Thanks for the test case.  I think the remedy proposed so far (glibc should
block allocations sized half of the address space and larger) is insufficient.
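
For reference, the remedy mentioned above (which this comment considers insufficient on its own) amounts to a size check in the allocator. A minimal sketch, assuming a hypothetical wrapper named `checked_malloc`; this is not glibc code:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuse any request larger than PTRDIFF_MAX,
   so that p - q within a single allocated object fits in ptrdiff_t.  */
static void *
checked_malloc (size_t size)
{
  if (size > (size_t) PTRDIFF_MAX)
    return NULL;
  return malloc (size);
}
```

Such a check only bounds the size of one object; it says nothing about where in the address space the object is placed.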

Re: [PATCH] Mark by_ref mem_ref in build_receiver_ref as non-trapping

2015-11-23 Thread Richard Biener

On Mon, Nov 23, 2015 at 9:45 AM, Jakub Jelinek  wrote:
> On Sat, Nov 21, 2015 at 07:34:02PM +0100, Tom de Vries wrote:
>> Mark by_ref mem_ref in build_receiver_ref as non-trapping
>>
>> 2015-11-21  Tom de Vries  
>>
>>   * omp-low.c (build_receiver_ref): Mark by_ref mem_ref as non-trapping.
>
> This is ok.

Are you sure this is properly re-set by inlining via

  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
 remapped a parameter as the property might be valid only
 for the parameter itself.  */
  if (TREE_THIS_NOTRAP (old)
  && (!is_parm (TREE_OPERAND (old, 0))
  || (!id->transform_parameter && is_parm (ptr))))
TREE_THIS_NOTRAP (*tp) = 1;

?  Or is this never hoistable to a place where TREE_THIS_NOTRAP is not true
even after inlining?  (I presume this is not directly a load via the
static chain pointer?)

>>
>> ---
>>  gcc/omp-low.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
>> index 830db75..78f2853 100644
>> --- a/gcc/omp-low.c
>> +++ b/gcc/omp-low.c
>> @@ -1249,7 +1249,10 @@ build_receiver_ref (tree var, bool by_ref, 
>> omp_context *ctx)
>>TREE_THIS_NOTRAP (x) = 1;
>>x = omp_build_component_ref (x, field);
>>if (by_ref)
>> -x = build_simple_mem_ref (x);
>> +{
>> +  x = build_simple_mem_ref (x);
>> +  TREE_THIS_NOTRAP (x) = 1;
>> +}
>>
>>return x;
>>  }
>
>
> Jakub

Re: [PATCH, gcc5 backport] Fix PR ipa/65908

2015-11-23 Thread Richard Biener

On Mon, Nov 23, 2015 at 10:21 AM, Martin Liška  wrote:
> Hi.
>
> At the end of last week, Richi asked me to back port aforementioned PR.
> The patch contains two parts: first one is the patch that was applied to trunk
> and the second one is a hunk that implements param_used_p (coming from 
> r222374).
>
> Patch can bootstrap and survives regression tests on x86_64-linux-gnu.
>
> Ready for 5 branch?

Ok.

Richard.

> Thanks,
> Martin

[Bug c/67999] Wrong optimization of pointer comparisons

2015-11-23 Thread fw at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999

Florian Weimer  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=63303

--- Comment #26 from Florian Weimer  ---
(In reply to Florian Weimer from comment #12)
> (In reply to Daniel Micay from comment #10)
> > (In reply to Florian Weimer from comment #7)
> > > If this is not a GCC bug and it is the responsibility of allocators not to
> > > produce huge objects, do we also have to make sure that no object crosses
> > > the boundary between 0x7fff_ and 0x8000_?  If pointers are treated
> > > as de-facto signed, this is where signed overflow would occur.
> > 
> > No, that's fine.
> 
> Is this based on your reading of the standard, the GCC sources, or both? 
> (It is unusual to see people making such definite statements about
> middle-end/back-end behavior, that's why I have to ask.)

As I suspected, the claim that this is fine seems to be incorrect; see bug 63303
comment 13.
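
To make the boundary condition from comment #7 concrete, here is a small sketch that tests whether an address range straddles the point where the top address bit flips. It works on integer representations of addresses only; `ADDR_SIGN_BIT` and `crosses_sign_boundary` are names invented for this illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Top bit of an address when it is viewed as an integer.  */
#define ADDR_SIGN_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

/* Return true if the LEN bytes starting at address START begin below
   the sign-bit boundary and end at or above it.  */
static bool
crosses_sign_boundary (uintptr_t start, size_t len)
{
  if (len == 0)
    return false;
  uintptr_t last = start + (len - 1);
  if (last < start)	/* The range wraps around the address space.  */
    return false;
  return (start & ADDR_SIGN_BIT) == 0 && (last & ADDR_SIGN_BIT) != 0;
}
```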

Re: [PATCH] Mark by_ref mem_ref in build_receiver_ref as non-trapping

2015-11-23 Thread Jakub Jelinek

On Mon, Nov 23, 2015 at 11:39:26AM +0100, Richard Biener wrote:
> On Mon, Nov 23, 2015 at 9:45 AM, Jakub Jelinek  wrote:
> > On Sat, Nov 21, 2015 at 07:34:02PM +0100, Tom de Vries wrote:
> >> Mark by_ref mem_ref in build_receiver_ref as non-trapping
> >>
> >> 2015-11-21  Tom de Vries  
> >>
> >>   * omp-low.c (build_receiver_ref): Mark by_ref mem_ref as 
> >> non-trapping.
> >
> > This is ok.
> 
> Are you sure this is properly re-set by inlining via
> 
>   /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
>  remapped a parameter as the property might be valid only
>  for the parameter itself.  */
>   if (TREE_THIS_NOTRAP (old)
>   && (!is_parm (TREE_OPERAND (old, 0))
>   || (!id->transform_parameter && is_parm (ptr))))
> TREE_THIS_NOTRAP (*tp) = 1;
> 
> ?  Or is this never hoistable to a place where TREE_THIS_NOTRAP is not true
> even after inlining?  (I presume this is not directly a load via the
> static chain pointer?)

I don't think inlining is ever around here, this is inside of the outlined
bodies of the OpenMP constructs, those are the *.omp_fn* artificial
functions called from libgomp, and is used in cases where
  .omp_data_i->field
is not the field itself, but pointer to the original variable.  The caller
of the libgomp functions that in the end invoke the .omp_fn* functions
guarantees that the field in that case is initialized to an address of the
original variables, is not NULL or some invalid pointer.

Jakub
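
The situation described above can be pictured with a hand-written C model. This is a sketch under assumptions, not compiler output: `struct omp_data_s`, `outlined_body` and `run_region` are stand-ins for the generated data block, the *.omp_fn* function and the caller that sets things up before going through libgomp.

```c
/* Hand-written stand-in for the compiler-generated data block.  The
   field holds the address of the original variable, not a copy.  */
struct omp_data_s
{
  int *sum;
};

/* Stand-in for an outlined *.omp_fn* body.  The dereference is the
   by-reference access that the patch marks as non-trapping.  */
static void
outlined_body (struct omp_data_s *omp_data_i)
{
  *omp_data_i->sum += 1;
}

/* Stand-in for the caller: it fills in the field with a valid address
   before the outlined body runs.  */
static int
run_region (int initial)
{
  int sum = initial;
  struct omp_data_s data = { &sum };
  outlined_body (&data);
  return sum;
}
```

Because the caller always stores the address of a live variable into the field, the load through it in the outlined body cannot fault in this model.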

[Ada] Introduce a Frontend_Exceptions flag in system.ads

2015-11-23 Thread Olivier Hainque

Hello,

The Ada compiler supports different sorts of exception schemes today. The two
most commonly used are what we commonly refer to as "frontend-sjlj", and
"gcc-zcx". The former is entirely managed by the front-end (gigi included),
relying on builtin_setjmp / builtin_longjmp pairs. The latter exposes the
exception related constructs to the middle-end, most often configured for
table based unwinding. We refer to table based schemes as "zero cost" with
respect to what happens in absence of propagation, hence the "zcx" abbrev
to denote "zero cost exceptions".

We can configure compilers to use the sjlj eh model as well but the front-end
internals aren't really prepared for this and this leads to bugs in some
circumstances.

The frontend perception of the EH scheme in use is currently controlled by the
"ZCX_By_Default" flag in system.ads. Very roughly, True is taken to denote
"gcc-zcx" and False conveys "frontend-sjlj". To allow proper support of
"gcc-sjlj", we introduce a "Frontend_Exceptions" flag and adjust the compiler
and runtimes accordingly.

This patch contains the front-end + Makefile part and a couple of adjustments
to the "tools", compensating for changes that were done preventively before,
when we had a different scheme in mind.

The compiler part involves a few general steps:
 - Adjust the possible values of the Exception_Scheme variable (opt)
 - Reflect this update in fe.h for gigi's consumption
 - Handle the new flag throughout (targparm, lib-writ, ali, bcheck, gnat1drv)
 - Adjust the abort_defer/abort_undefer call expansions to
   trigger for ZCX instead of back-end eh (exp_ch9, exp_ch11, exp_sel).
 - Adjust gigi to use the new mechanism values and helpers.
 
Then all the system.ads files will be updated with a correct value of the
Frontend_Exceptions flags.
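
As a rough illustration of how the two system.ads flags could combine into the three mechanisms, here is a sketch in C rather than Ada. The names are invented for the illustration, and the choice that Frontend_Exceptions takes precedence when both flags are True is an assumption, not something taken from the patch.

```c
#include <stdbool.h>

/* Model of the three Exception_Mechanism values.  */
enum exception_mechanism
{
  FRONT_END_SJLJ,
  BACK_END_SJLJ,
  BACK_END_ZCX
};

/* Hypothetical selection logic.  Assumption: Frontend_Exceptions wins
   if both flags are set.  */
static enum exception_mechanism
select_mechanism (bool frontend_exceptions, bool zcx_by_default)
{
  if (frontend_exceptions)
    return FRONT_END_SJLJ;
  return zcx_by_default ? BACK_END_ZCX : BACK_END_SJLJ;
}
```

With this mapping, the new "gcc-sjlj" configuration is the one where both flags are False.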

Bootstrapped and regression tested on x86_64-linux-gnu. Committing to trunk.

Olivier

* opt.ads (Exception_Mechanism): Now three values: Front_End_SJLJ,
Back_End_SJLJ and Back_End_ZCX.
(Back_End_Exceptions, Front_End_Exceptions, ZCX_Exceptions,
SJLJ_Exceptions): New functions, reflecting properties of the current
Exception_Mechanism.
* opt.adb: Implement the new functions.
* fe.h: Bind the new Exception_Mechanism and helper functions for gigi.

* exp_ch11.adb (Expand_At_End_Handler): Replace test on mechanism by
use of property helper and update comments.
(Expand_Exception_Handlers): Replace tests on mechanism by use of
helper. Restrict Abort_Defer to ZCX specifically.
* exp_ch9.adb (Expand_N_Asynchronous_Select): Replace tests on
mechanism by calls to helper functions. Abort_Undefer for ZCX only,
paired with Expand_Exception_Handlers.
* exp_sel.adb (Build_Abort_Block_Handler): Replace tests on mechanism
by calls to helper functions. Abort_Undefer for ZCX only, paired with
Expand_Exception_Handlers.

* lib-writ.ads (P line documentation): Add entry for "FX",
representative of unit compiled with Frontend_Exceptions True.
* lib-writ.adb (Output_Main_Program_Line): Add "FX" on P line if
compiled with Frontend_Exceptions True.

* ali.ads (ALIs_Record): Add a Frontend_Exceptions component, to reflect
whether the ALI file contained an "FX" indication on the P line.
(Frontend_Exceptions_Specified): New boolean, to keep track of whether
at least an FX ALI file is in the closure.
* ali.adb (Scan_ALI): Handle "FX" on the P line.
(Initialize_ALI): Initialize Frontend_Exceptions_Specified to False.

* targparm.ads: Update description of exception schemes.
(Frontend_Exceptions_On_Target): New flag, reflect Frontend_Exceptions
set to True in system.ads, or not set at all.
* targparm.adb (Targparm_Tags): Add FEX to convey Frontend_Exceptions.
Rename ZCD to ZCX for consistency.
(FEX_Str, Targparm_Str, Get_Target_Parameters): Adjust accordingly.

* gnat1drv.adb (Adjust_Global_Switches): Adjust Exception_Mechanism
setting, now from combination of Frontend_Exceptions and ZCX_By_Default.

* bcheck.adb (Check_Consistent_Zero_Cost_Exception_Handling): Rename
as ...
(Check_Consistent_Exception_Handling): Check consistency of both
ZCX_By_Default and Frontend_Exceptions.
(Check_Configuration_Consistency): Check_Consistent_Exception_Handling
if either flag was set at least once.

* make.adb (Check): Remove processing of a possible -fsjlj coming from
lang-specs.h.
* gnatlink.adb (Gnatlink): Likewise.

gcc-interface/

* decl.c (gnat_to_gnu_entity, case E_Variable): Use eh property helper
to test for back-end exceptions. Adjust mechanism name when testing for
front-end sjlj.
(case E_Procedure): Likewise.
* trans.c

[Bug rtl-optimization/68128] A huge regression in Parboil v2.5 OpenMP CUTCP test (2.5 times lower performance)

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68128

--- Comment #6 from Richard Biener  ---
.omp_data_i = _NOALIAS.0+64
PARM_NOALIAS.0+64 = 
PARM_NOALIAS.64+192 = 
...
_35 = *.omp_data_i
pg_36 = _35 + UNKNOWN
pg_63 = pg_36

.omp_data_i_12(D), points-to vars: { D.1985 } (nonlocal)
pg_63 = { NONLOCAL }

and we end up with

  :
  # iftmp.0_7 = PHI 
  _47 = iftmp.0_7 + _46;
  *pg_63 = _47;
  i_49 = i_62 + 1;
  # PT = nonlocal
  pg_50 = pg_63 + 4;
  _51 = MEM[(struct .omp_data_s.1 &).omp_data_i_12(D) clique 1 base
1].gridspacing;
  dx_52 = _51 + dx_64;
  if (ib_28 >= i_49)
goto ;
  else
goto ;

where we consider the load of gridspacing to alias *pg_63.  That is because
of the not implemented ??? in

/* Mark "other" loads and stores as belonging to CLIQUE and with
   base zero.  */

static bool
visit_loadstore (gimple *, tree base, tree ref, void *clique_)
{
  unsigned short clique = (uintptr_t)clique_;
  if (TREE_CODE (base) == MEM_REF
  || TREE_CODE (base) == TARGET_MEM_REF)
{
  tree ptr = TREE_OPERAND (base, 0);
  if (TREE_CODE (ptr) == SSA_NAME
  && ! SSA_NAME_IS_DEFAULT_DEF (ptr))
{
  /* ???  We need to make sure 'ptr' doesn't include any of
 the restrict tags we added bases for in its points-to set.  */
  return false;
}

which would need to look at ptr's points-to solution and intersect that
with a bitmap we'd need to form out of the restrict tags used for the
respective clique (we only use a single one at the moment, thus a
single bit test is enough if you consider properly pt_anything for ptr).

It's not a complicated fix I think so if you have time to play with it...
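
The test described above could look roughly like the following toy model. It does not use GCC's pt_solution or bitmap types; `struct pt_solution_model` and `may_point_to_restrict_tag` are hypothetical, and a 64-bit word stands in for the points-to bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy points-to solution: a bitmap of variable ids plus an
   "anything" flag.  */
struct pt_solution_model
{
  bool anything;
  uint64_t vars;
};

/* Return true if PT may include the restrict tag with id TAG_ID
   (0..63).  A solution that points to anything must be treated as
   including the tag.  */
static bool
may_point_to_restrict_tag (const struct pt_solution_model *pt,
			   unsigned tag_id)
{
  if (pt->anything)
    return true;
  return (pt->vars & ((uint64_t) 1 << tag_id)) != 0;
}
```

In this model, a pointer would only be assigned to the clique with base zero when the function returns false for the clique's tag.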

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-11-23 Thread Richard Biener

On Fri, 20 Nov 2015, Ilya Verbin wrote:

> On Wed, Dec 10, 2014 at 01:48:21 +0300, Ilya Verbin wrote:
> > On 09 Dec 14:59, Richard Biener wrote:
> > > On Mon, 8 Dec 2014, Ilya Verbin wrote:
> > > > Unfortunately, this fix was not general enough.
> > > > There might be cases when mixed object files get into lto-wrapper, ie 
> > > > some of
> > > > them contain only LTO sections, some contain only offload sections, and 
> > > > some
> > > > contain both.  But when lto-wrapper will pass all these files to 
> > > > recompilation,
> > > > the compiler might crash (it depends on the order of input files), 
> > > > since in
> > > > read_cgraph_and_symbols it expects that *all* input files contain IR 
> > > > section of
> > > > given type.
> > > > This patch splits input objects from argv into lto_argv and 
> > > > offload_argv, so
> > > > that all files in arrays contain corresponding IR.
> > > > Similarly, in lto-plugin, it was bad idea to add objects, which contain 
> > > > offload
> > > > IR without LTO, to claimed_files, since this may corrupt a resolution 
> > > > file.
> > > > 
> > > > Tested on various combinations of files with/without -flto and 
> > > > with/without
> > > > offload, using trunk ld and gold, also tested on ld without plugin 
> > > > support.
> > > > Bootstrap and make check passed on x86_64-linux and i686-linux.  Ok for 
> > > > trunk?
> > > 
> > > Did you check that bootstrap-lto still works?  Ok if so.
> > 
> > Yes, bootstrap-lto passed.
> > Committed revision 218543.
> 
> I don't know how I missed this a year ago, but mixing of LTO objects with
> offloading-without-LTO objects still doesn't work :(
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463 filed about that.
> Any thoughts how to fix this?

Don't claim files you don't handle.

Richard.
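
A sketch of what the splitting and claiming rules from the quoted description might look like, in a simplified model. Nothing here is lto-plugin or lto-wrapper code: `struct object_info`, `should_claim` and `split_objects` are hypothetical, and "handled" is assumed to mean "contains LTO IR".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one input object.  */
struct object_info
{
  const char *name;
  bool has_lto;
  bool has_offload;
};

/* Assumption: only objects carrying LTO IR are claimed.  */
static bool
should_claim (const struct object_info *obj)
{
  return obj->has_lto;
}

/* Split the inputs so that each output array only lists objects that
   contain the corresponding kind of IR.  Each output array must have
   room for N entries.  */
static void
split_objects (const struct object_info *objs, size_t n,
	       const char **lto_argv, size_t *lto_count,
	       const char **offload_argv, size_t *offload_count)
{
  *lto_count = 0;
  *offload_count = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (objs[i].has_lto)
	lto_argv[(*lto_count)++] = objs[i].name;
      if (objs[i].has_offload)
	offload_argv[(*offload_count)++] = objs[i].name;
    }
}
```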

[Bug target/68497] New: ICE: in output_387_binary_op, at config/i386/i386.c:17689 with -fno-checking

2015-11-23 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68497

Bug ID: 68497
   Summary: ICE: in output_387_binary_op, at
config/i386/i386.c:17689 with -fno-checking
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---

Created attachment 36806
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36806=edit
reduced testcase

Compiler output:
$ gcc -fno-checking testcase.c 
testcase.c: In function 'foo':
testcase.c:5:1: internal compiler error: in output_387_binary_op, at
config/i386/i386.c:17689
 }
 ^

0xecd4d6 output_387_binary_op(rtx_def*, rtx_def**)
/repo/gcc-trunk/gcc/config/i386/i386.c:17689
0x85693b final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/repo/gcc-trunk/gcc/final.c:2947
0x8584c2 final(rtx_insn*, _IO_FILE*, int)
/repo/gcc-trunk/gcc/final.c:2044
0x858e69 rest_of_handle_final
/repo/gcc-trunk/gcc/final.c:4435
0x858e69 execute
/repo/gcc-trunk/gcc/final.c:4510
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

$ gcc -v   
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-230738-checking-yes-rtl-df-nographite/bin/../libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-checking=yes,rtl,df --without-cloog --without-ppl --without-isl
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-230738-checking-yes-rtl-df-nographite
Thread model: posix
gcc version 6.0.0 20151123 (experimental) (GCC) 

The failing assertion is:
17686:  && (STACK_TOP_P (operands[1]) || STACK_TOP_P (operands[2])))
17687:; /* ok */
17688:  else
17689:gcc_checking_assert (is_sse);
17690:
17691:  switch (GET_CODE (operands[3]))
17692:{


The above if(flag_checking ... ) is false for -fno-checking, but
gcc_checking_assert() doesn't care about -fno-checking.
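
The interaction can be reproduced in a small stand-alone model. This is a sketch with invented names (`MODEL_CHECKING_BUILD`, `flag_checking_model`, `MODEL_CHECKING_ASSERT`), and it counts failures instead of aborting; it is not the code from i386.c.

```c
#include <stdbool.h>

/* Build-time switch, standing in for a checking-enabled build.  */
#ifndef MODEL_CHECKING_BUILD
#define MODEL_CHECKING_BUILD 1
#endif

/* Run-time switch, standing in for -fchecking / -fno-checking.  */
static bool flag_checking_model = true;

/* Number of assertions that would have fired.  */
static unsigned failed_checks;

#if MODEL_CHECKING_BUILD
#define MODEL_CHECKING_ASSERT(expr) \
  ((void) ((expr) ? 0 : ++failed_checks))
#else
#define MODEL_CHECKING_ASSERT(expr) ((void) 0)
#endif

/* Same shape as the reported code: the run-time flag guards the "ok"
   branch, but the assertion in the else branch only depends on the
   build-time switch.  */
static void
check_operands (bool operands_ok, bool is_sse)
{
  if (flag_checking_model && operands_ok)
    ; /* ok */
  else
    MODEL_CHECKING_ASSERT (is_sse);
}
```

With the run-time flag off, valid non-SSE operands fall through to the else branch and trip the assertion, which matches the behaviour reported above.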


Tested revisions:
r230738 - ICE

[gomp4] Adjust Fortran OACC async lib test

2015-11-23 Thread Chung-Lin Tang

Hi Thomas,
this fix adds more acc_wait's to libgomp.oacc-fortran/lib-1[13].f90.

For lib-12.f90, it's sort of a fix before we can resolve the issue
of intended semantics for "wait+async".

As for lib-13.f90, I believe these added acc_wait calls seem
reasonable, since we can't immediately assume the async-launched parallels
already completed there.

Does this seem reasonable?

Thanks,
Chung-Lin

* testsuite/libgomp.oacc-fortran/lib-12.f90 (main): Add acc_wait()
after async parallel construct.
* testsuite/libgomp.oacc-fortran/lib-13.f90 (main): Add acc_wait()
calls after parallel construct launches.
Index: libgomp.oacc-fortran/lib-12.f90
===
--- libgomp.oacc-fortran/lib-12.f90 (revision 230719)
+++ libgomp.oacc-fortran/lib-12.f90 (working copy)
@@ -15,6 +15,8 @@ program main
 end do
   !$acc end parallel
 
+  call acc_wait (0)
+
   call acc_wait_async (0, 1)
 
   if (acc_async_test (0) .neqv. .TRUE.) call abort
Index: libgomp.oacc-fortran/lib-13.f90
===
--- libgomp.oacc-fortran/lib-13.f90 (revision 230719)
+++ libgomp.oacc-fortran/lib-13.f90 (working copy)
@@ -21,6 +21,9 @@ program main
 end do
   !$acc end data
 
+  call acc_wait (1)
+  call acc_wait (2)
+
   if (acc_async_test (1) .neqv. .TRUE.) call abort
   if (acc_async_test (2) .neqv. .TRUE.) call abort

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Arnaud Charlet

> > So there is indeed no point in trying to fix one or two cases, and we should
> > instead instruct LTO somehow to treat System.Address as compatible with
> > void* otherwise we'll run into endless troubles on that since using
> > System.Address as void* is very common practice in Ada code.
> 
> Maybe we could apply this special treatment only to the void_ptr subtype of
> Interfaces.C.Extensions and require its use when interfacing with C.

No, Interfaces.C.Extensions is non portable, so almost no Ada code out there
is using it. As I said, existing Ada code is using System.Address all the time,
so requiring any code change in this area is just a non starter. We'd
rather require that people don't use LTO with Ada rather than tell them to
use Interfaces.C.Extensions, that would be more constructive :-)

Arno

Re: [PATCH] Don't reapply loops flags if unnecessary in loop_optimizer_init

2015-11-23 Thread Tom de Vries


On 23/11/15 11:29, Richard Biener wrote:
> On Mon, 23 Nov 2015, Tom de Vries wrote:
>
>> [ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ]
>>
>> On 20/11/15 11:37, Richard Biener wrote:
>>> I'd rather make loop_optimizer_init do nothing
>>> if requested flags are already set and no fixup is needed and
>>> call the above unconditionally.  Thus sth like
>>>
>>> Index: gcc/loop-init.c
>>> ===
>>> --- gcc/loop-init.c (revision 230649)
>>> +++ gcc/loop-init.c (working copy)
>>> @@ -103,7 +103,11 @@ loop_optimizer_init (unsigned flags)
>>>  calculate_dominance_info (CDI_DOMINATORS);
>>>
>>>  if (!needs_fixup)
>>> -   checking_verify_loop_structure ();
>>> +   {
>>> + checking_verify_loop_structure ();
>>> + if (loops_state_satisfies_p (flags))
>>> +   goto out;
>>> +   }
>>>
>>>  /* Clear all flags.  */
>>>  if (recorded_exits)
>>> @@ -122,11 +126,12 @@ loop_optimizer_init (unsigned flags)
>>>  /* Apply flags to loops.  */
>>>  apply_loop_flags (flags);
>>>
>>> +  checking_verify_loop_structure ();
>>> +
>>> +out:
>>>  /* Dump loops.  */
>>>  flow_loops_dump (dump_file, NULL, 1);
>>>
>>> -  checking_verify_loop_structure ();
>>> -
>>>  timevar_pop (TV_LOOP_INIT);
>>>    }
>>
>> This patch implements that approach, but the patch is slightly more
>> complicated because of the need to handle LOOPS_MAY_HAVE_MULTIPLE_LATCHES
>> differently than the rest of the flags.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for stage3 trunk?
>
> Let's revisit this during stage1 if the scev_initialized () thing
> SLP vectorization uses works, ok?

OK, I'll give that a try.

FTR, attached two patches are an attempt at a cleaner solution for 
pass_slp_vectorize::execute (in combination with patch "Don't reapply 
loops flags if unnecessary in loop_optimizer_init").


The first patch introduces a property PROP_scev, set for the duration of 
the loop pipeline. It allows us to call scev_initialize and 
scev_finalize unconditionally. Outside the loop pipeline calling the 
functions has the usual effect. Inside the loop pipeline, calling the 
functions has no effect.


The second patch introduces a property PROP_loops_normal_re_lcssa, set 
for the duration of the loop pipeline. It allows us (in combination with 
"Don't reapply loops flags if unnecessary in loop_optimizer_init") to 
call loop_optimizer_init and loop_optimizer_finalize unconditionally.
Outside the loop pipeline, calling the functions has the usual effect. 
Inside the loop pipeline, calling loop_optimizer_finalize has no effect, 
and calling loop_optimizer_init has no effect unless a fixup or a 
new loop property is needed.
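
The idea behind both properties can be shown with a small stand-alone model. This is only a sketch with invented names (`MODEL_PROP_SCEV`, `model_scev_initialize`, `model_scev_finalize`), using a plain global where GCC uses cfun->curr_properties: while the property is set, the pipeline owns the state, and the init/finalize pair does nothing.

```c
#include <stdbool.h>

#define MODEL_PROP_SCEV (1u << 16)

/* Stand-in for cfun->curr_properties.  */
static unsigned curr_properties;

/* Stand-in for "scalar evolution info is allocated".  */
static bool scev_ready;

/* Outside the pipeline: set up the state.  Inside the pipeline (the
   property is set): the state already exists, so do nothing.  */
static void
model_scev_initialize (void)
{
  if (curr_properties & MODEL_PROP_SCEV)
    return;
  scev_ready = true;
}

/* Outside the pipeline: tear down the state.  Inside the pipeline:
   keep it for the passes that follow.  */
static void
model_scev_finalize (void)
{
  if (curr_properties & MODEL_PROP_SCEV)
    return;
  scev_ready = false;
}
```

A pass can then call the pair unconditionally, whether or not it runs inside the loop pipeline.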


Thanks,
- Tom

Add PROP_scev

---
 gcc/tree-pass.h |  1 +
 gcc/tree-scalar-evolution.c | 13 +
 gcc/tree-ssa-loop.c |  3 ++-
 gcc/tree-vectorizer.c   |  4 ++--
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 004db77..4e66b2c 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -227,6 +227,7 @@ protected:
 		   of math functions; the
 		   current choices have
 		   been optimized.  */
+#define PROP_scev		(1 << 16)	/* preserve scev info.  */
 
 #define PROP_trees \
   (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 9b33693..5d5e354 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -280,6 +280,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-ssa-propagate.h"
 #include "gimple-fold.h"
+#include "tree-pass.h"
 
 static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
 static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
@@ -3168,6 +3169,12 @@ scev_initialize (void)
 {
   struct loop *loop;
 
+  if (cfun->curr_properties & PROP_scev)
+{
+  gcc_assert (scev_initialized_p ());
+  return;
+}
+
   scalar_evolution_info = hash_table::create_ggc (100);
 
   initialize_scalar_evolutions_analyzer ();
@@ -3367,6 +3374,12 @@ simple_iv (struct loop *wrto_loop, struct loop *use_loop, tree op,
 void
 scev_finalize (void)
 {
+  if (cfun->curr_properties & PROP_scev)
+{
+  gcc_assert (scev_initialized_p ());
+  return;
+}
+
   if (!scalar_evolution_info)
 return;
   scalar_evolution_info->empty ();
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index d30e3c8..739fda7 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -290,7 +290,7 @@ const pass_data pass_data_tree_loop_init =
   OPTGROUP_LOOP, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_cfg, /* properties_required */
-  0, /* properties_provided */
+  PROP_scev, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
   0, /* todo_flags_finish */
@@ -524,6 +524,7 @@ make_pass_iv_optimize (gcc::context *ctxt)
 static unsigned int
 tree_ssa_loop_done (void)
 {
+  cfun->curr_properties &=

Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-23 Thread Richard Biener

On Sat, 21 Nov 2015, Tom de Vries wrote:

> On 20/11/15 11:28, Richard Biener wrote:
> > On Thu, 19 Nov 2015, Tom de Vries wrote:
> > 
> > > >On 17/11/15 15:53, Tom de Vries wrote:
> > > > > > > >And the above LIM example
> > > > > > > >is none for why you need two LIM passes...
> > > > > >
> > > > > >Indeed. I'm planning a separate reply to explain in more detail the
> > > > need
> > > > > >for the two pass_lims.
> > > >
> > > >I.
> > > >
> > > >I managed to get rid of the two pass_lims for the motivating example that
> > > I
> > > >used until now (goacc/kernels-double-reduction.c). I found that by adding
> > > a
> > > >pass_dominator instance after pass_ch, I could get rid of the second
> > > pass_lim
> > > >(and pass_copyprop as well).
> > > >
> > > >But... then I wrote a counter example
> > > (goacc/kernels-double-reduction-n.c),
> > > >and I'm back at two pass_lims (and two pass_dominators).
> > > >Also I've split the pass group into a bit before and after pass_fre.
> > > >
> > > >So, the current pass group looks like:
> > > >...
> > > >NEXT_PASS (pass_build_ealias);
> > > >
> > > >/* Pass group that runs when the function is an offloaded function
> > > >containing oacc kernels loops.  Part 1.  */
> > > >NEXT_PASS (pass_oacc_kernels);
> > > >PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> > > > /* We need pass_ch here, because pass_lim has no effect on
> > > >exit-first loops (PR65442).  Ideally we want to remove both
> > > >this pass instantiation, and the reverse transformation
> > > >transform_to_exit_first_loop_alt, which is done in
> > > >pass_parallelize_loops_oacc_kernels. */
> > > > NEXT_PASS (pass_ch);
> > > >POP_INSERT_PASSES ()
> > > >
> > > >NEXT_PASS (pass_fre);
> > > >
> > > >/* Pass group that runs when the function is an offloaded function
> > > >containing oacc kernels loops.  Part 2.  */
> > > >NEXT_PASS (pass_oacc_kernels2);
> > > >PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
> > > > /* We use pass_lim to rewrite in-memory iteration and reduction
> > > >variable accesses in loops into local variables accesses.  */
> > > > NEXT_PASS (pass_lim);
> > > > NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
> > > > NEXT_PASS (pass_lim);
> > > > NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
> > > > NEXT_PASS (pass_dce);
> > > > NEXT_PASS (pass_parallelize_loops_oacc_kernels);
> > > > NEXT_PASS (pass_expand_omp_ssa);
> > > >POP_INSERT_PASSES ()
> > > >NEXT_PASS (pass_merge_phi);
> > > >...
> > > >
> > > >
> > > >II.
> > > >
> > > >The motivating test-case kernels-double-reduction-n.c:
> > > >...
> > > >#include 
> > > >
> > > >#define N 500
> > > >
> > > >unsigned int a[N][N];
> > > >
> > > >void  __attribute__((noinline,noclone))
> > > >foo (unsigned int n)
> > > >{
> > > >   int i, j;
> > > >   unsigned int sum = 1;
> > > >
> > > >#pragma acc kernels copyin (a[0:n]) copy (sum)
> > > >   {
> > > > for (i = 0; i < n; ++i)
> > > >   for (j = 0; j < n; ++j)
> > > > sum += a[i][j];
> > > >   }
> > > >
> > > >   if (sum != 5001)
> > > > abort ();
> > > >}
> > > >...
> > > >
> > > >
> > > >III.
> > > >
> > > >Before first pass_lim. Note no phis on inner or outer loop header for
> > > >iteration variables or reduction variable:
> > > >...
> > > >   :
> > > >   _5 = *.omp_data_i_4(D).i;
> > > >   *_5 = 0;
> > > >   _44 = *.omp_data_i_4(D).n;
> > > >   _45 = *_44;
> > > >   if (_45 != 0)
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   : outer loop header
> > > >   _12 = *.omp_data_i_4(D).j;
> > > >   *_12 = 0;
> > > >   if (_45 != 0)
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   : inner loop header, latch
> > > >   _19 = *.omp_data_i_4(D).a;
> > > >   _21 = *_5;
> > > >   _23 = *_12;
> > > >   _24 = *_19[_21][_23];
> > > >   _25 = *.omp_data_i_4(D).sum;
> > > >   sum.0_26 = *_25;
> > > >   sum.1_27 = _24 + sum.0_26;
> > > >   *_25 = sum.1_27;
> > > >   _33 = _23 + 1;
> > > >   *_12 = _33;
> > > >   j.2_16 = (unsigned int) _33;
> > > >   if (j.2_16 < _45)
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   : outer loop latch
> > > >   _36 = *_5;
> > > >   _38 = _36 + 1;
> > > >   *_5 = _38;
> > > >   i.3_9 = (unsigned int) _38;
> > > >   if (i.3_9 < _45)
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   :
> > > >   return;
> > > >...
> > > >
> > > >
> > > >IV.
> > > >
> > > >After first pass_lim/pass_dom pair. Note there are phis on the inner loop
> > > >header for the reduction and the iteration variable, but not on the outer
> > > loop
> > > >header:
> > > >...
> > > >   :
> > > >   _5 = *.omp_data_i_4(D).i;
> > > >   *_5 = 0;
> > > >   _44 = *.omp_data_i_4(D).n;
> > > >   _45 = *_44;
> > > >   if (_45 != 0)
> > > > goto ;
> > > >   else
> > > > goto ;
> > > >
> > > >   :
> > > >   _12 = *.omp_data_i_4(D).j;
> > > >   _19 = *.omp_data_i_4(D).a;
> > > >

Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-11-23 Thread Thomas Schwinge

Hi Jakub!

On Fri, 17 Jul 2015 15:05:59 +0200, Jakub Jelinek  wrote:
> [...] "omp declare target link" [...]

> This patch only marks them with the new attribute, [...]

> --- gcc/c/c-parser.c.jj   2015-07-16 18:09:25.0 +0200
> +++ gcc/c/c-parser.c  2015-07-17 14:11:08.553694975 +0200

>  static void
>  c_parser_omp_declare_target (c_parser *parser)
>  {
> [...]
> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +{
> +  tree t = OMP_CLAUSE_DECL (c), id;
> +  tree at1 = lookup_attribute ("omp declare target", DECL_ATTRIBUTES 
> (t));
> +  tree at2 = lookup_attribute ("omp declare target link",
> +DECL_ATTRIBUTES (t));
> +  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINK)
> + {
> +   id = get_identifier ("omp declare target link");
> +   std::swap (at1, at2);
> + }
> +  else
> + id = get_identifier ("omp declare target");

Is it intentional that you didn't add "omp declare target link" to
gcc/c-family/c-common.c:c_common_attribute_table, next to the existing
"omp declare target"?


Grüße
 Thomas



Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Ilya Enkovich

On 23 Nov 11:44, Richard Biener wrote:
> On Mon, Nov 23, 2015 at 11:10 AM, Ilya Enkovich  
> wrote:
> > On 23 Nov 10:39, Richard Biener wrote:
> >> On Fri, Nov 20, 2015 at 3:30 PM, Ilya Enkovich  
> >> wrote:
> >> > On 20 Nov 14:54, Richard Biener wrote:
> >> >>
> >> >> I don't think you can in any way rely on the pointer type of the src 
> >> >> argument
> >> >> as all pointer conversions are useless and memcpy and friends take void 
> >> >> *
> >> >> anyway.
> >> >
> >> > This check is looking for cases when we have type information indicating
> >> > no pointers are copied.  In case of 'void *' we have to assume pointers
> >> > are copied and inlining is undesired.  Test pr68337-2.c checks pointer
> >> > type allows to enable inlining.  Looks like this check misses
> >> > || !COMPLETE_TYPE_P(TREE_TYPE (TREE_TYPE (src)))?
> >>
> >> As said there is no information in the pointer / pointed-to type in GIMPLE.
> >
> > What does it mean?  We do have TREE_TYPE for used pointer and nested 
> > TREE_TYPE
> > holding pointed-to type.  Is it some random invalid type?
> 
> Yes, it can be a "random" type.  Like for
> 
> void foo (float *f)
> {
>   memcpy ((void *)f, ...);
> }
> int main()
> {
>   int **a[10];
>   foo (a);
> }
> 
> which tries to copy to an array of int * but the GIMPLE IL for foo
> will call memcpy with a float * typed argument.
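
A minimal sketch of the point made in the quoted example, assuming nothing beyond standard C: the static pointer type at a memcpy call site says nothing about what the copied bytes actually hold. The helper names copy_block and pointers_survive_float_typed_copy are hypothetical and only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the parameters are typed float *, yet the
   caller is free to pass memory that really holds pointers.  */
void
copy_block (float *dst, const float *src, size_t n)
{
  /* Only bytes are moved; no access is made through the float type.  */
  memcpy ((void *) dst, (const void *) src, n);
}

/* Returns 1 when two int * values survive a copy through copy_block.  */
int
pointers_survive_float_typed_copy (void)
{
  static int x = 1, y = 2;
  int *src[2] = { &x, &y };
  int *dst[2] = { NULL, NULL };

  copy_block ((float *) dst, (const float *) src, sizeof src);
  return dst[0] == &x && dst[1] == &y;
}
```

Inside copy_block the only type visible for the source is float, although the caller copies pointers, which is why a check on the pointed-to type alone cannot rule out pointer copies.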

I see.  But it should still be OK to check the type in case of strict
aliasing, right?

Thanks,
Ilya

> 
> >>
> >> >>
> >> >> Note that you also disable memmove to memcpy simplification with this
> >> >> early check.
> >> >
> >> > Doesn't matter for MPX which uses the same implementation for both cases.
> >> >
> >> >>
> >> >> Where is pointer transfer handled for MPX?  I suppose it's not done
> >> >> transparently for all memory move instructions but explicitly by
> >> >> instrumented block copy routines in libmpx?  In which case how does
> >> >> that identify pointers vs. non-pointers?
> >> >
> >> > It is handled by instrumentation pass.  Compiler checks type of stored
> >> > data to find pointer stores.  Each pointer store is instrumented with
> >> > bndstx call.
> >>
> >> How does it identify "pointer store"?  With -fno-strict-aliasing you can
> >> store pointers using an integer type.  You can also always store pointers
> >> using a character type like
> >>
> >> void foo (int *p, int **dest)
> >> {
> >>   ((char *)*dest)[0] = (((char *)&p))[0];
> >>   ((char *)*dest)[1] = (((char *)&p))[1];
> >>   ((char *)*dest)[2] = (((char *)&p))[2];
> >>   ((char *)*dest)[3] = (((char *)&p))[3];
> >> }
> >
> > Pointer store is identified using type information.  When a pointer is cast
> > to a non-pointer type its bounds are lost.
> >
> > Ilya
> >
> >>
> >> > MPX versions of memcpy, memmove etc. don't make any assumptions about
> >> > type of copied data and just copy whole chunk of bounds metadata
> >> > corresponding to copied block.
> >>
> >> So it handles copying a pointer in two pieces with two memcpy calls
> >> correctly.  Good.
> >>
> >> Richard.
> >>
> >> > Thanks,
> >> > Ilya
> >> >
> >> >>
> >> >> Richard.
> >> >>
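
Richard's remark about copying a pointer in two pieces can be illustrated with a small sketch in plain C. It says nothing about how libmpx treats the bounds metadata; it only shows that the value bits of a pointer arrive intact when they are moved by two separate memcpy calls. The helper name copy_pointer_in_halves is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the pointer stored at SRC into DEST using
   two memcpy calls, each moving half of the pointer's bytes.  */
void
copy_pointer_in_halves (int **dest, int *const *src)
{
  size_t half = sizeof (int *) / 2;

  /* First half of the object representation.  */
  memcpy (dest, src, half);
  /* Remaining bytes, starting where the first copy stopped.  */
  memcpy ((char *) dest + half, (const char *) src + half,
          sizeof (int *) - half);
}
```

Neither call on its own moves a complete pointer, so a scheme that recognises pointers by the type of each store would miss both; copying the metadata for the whole block, as described for the MPX versions of memcpy, does not depend on that.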

Re: [PATCH][AArch64][v2] Improve comparison with complex immediates followed by branch/cset

2015-11-23 Thread Kyrill Tkachov



On 12/11/15 12:05, James Greenhalgh wrote:

On Tue, Nov 03, 2015 at 03:43:24PM +, Kyrill Tkachov wrote:

Hi all,

Bootstrapped and tested on aarch64.

Ok for trunk?

Comments in-line.



Here's an updated patch according to your comments.
Sorry it took so long to respin it, had other things to deal with with
stage1 closing...

I've indented the sample code sequences and used valid mnemonics.
These patterns can only match during combine, so I'd expect them to always
split during combine or immediately after, but I don't think that's a documented
guarantee so I've gated them on !reload_completed.

I've used IN_RANGE in the predicate.md hunk and added scan-assembler checks
in the tests.

Is this ok?

Thanks,
Kyrill

2015-11-20  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*condjump): Rename to...
(condjump): ... This.
(*compare_condjump): New define_insn_and_split.
(*compare_cstore_insn): Likewise.
(*cstore_insn): Rename to...
(cstore_insn): ... This.
* config/aarch64/iterators.md (CMP): Handle ne code.
* config/aarch64/predicates.md (aarch64_imm24): New predicate.

2015-11-20  Kyrylo Tkachov  

* gcc.target/aarch64/cmpimm_branch_1.c: New test.
* gcc.target/aarch64/cmpimm_cset_1.c: Likewise.


Thanks,
Kyrill


2015-11-03  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (*condjump): Rename to...
 (condjump): ... This.
 (*compare_condjump): New define_insn_and_split.
 (*compare_cstore_insn): Likewise.
 (*cstore_insn): Rename to...
 (aarch64_cstore): ... This.
 * config/aarch64/iterators.md (CMP): Handle ne code.
 * config/aarch64/predicates.md (aarch64_imm24): New predicate.

2015-11-03  Kyrylo Tkachov  

 * gcc.target/aarch64/cmpimm_branch_1.c: New test.
 * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.
commit 7df013a391532f39932b80c902e3b4bbd841710f
Author: Kyrylo Tkachov 
Date:   Mon Sep 21 10:56:47 2015 +0100

 [AArch64] Improve comparison with complex immediates

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 126c9c2..1bfc870 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -369,7 +369,7 @@ (define_expand "mod3"
}
  )
  
-(define_insn "*condjump"

+(define_insn "condjump"
[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register" "") (const_int 0)])
   (label_ref (match_operand 2 "" ""))
@@ -394,6 +394,40 @@ (define_insn "*condjump"
  (const_int 1)))]
  )
  
+;; For a 24-bit immediate CST we can optimize the compare for equality

+;; and branch sequence from:
+;; mov x0, #imm1
+;; movk x0, #imm2, lsl 16 /* x0 contains CST.  */
+;; cmp x1, x0
+;; b .Label

This would be easier on the eyes if you were to indent the code sequence.

+;; and branch sequence from:
+;; mov x0, #imm1
+;; movk x0, #imm2, lsl 16 /* x0 contains CST.  */
+;; cmp x1, x0
+;; b .Label
+;; into the shorter:
+;; sub x0, #(CST & 0xfff000)


+;; into the shorter:
+;; sub x0, #(CST & 0xfff000)
+;; subs x0, #(CST & 0x000fff)

These instructions are not valid (2 operand sub/subs?) can you write them
out fully for this comment so I can see the data flow?
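
Written out with explicit operands, the data flow of the shorter sequence can be sketched in C as follows. This is an illustrative sketch, not code from the patch: the function name equals_imm24 is made up, CST is assumed to fit in 24 bits, and the three-operand forms in the comments (sub tmp, x1, #hi followed by subs tmp, tmp, #lo) are an assumption about what the patch comment intends.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the split: compare X against a 24-bit constant
   CST by subtracting its upper and lower 12-bit parts in turn.  */
bool
equals_imm24 (uint64_t x, uint64_t cst)
{
  uint64_t hi = cst & 0xfff000u;   /* bits 12..23, a shifted 12-bit immediate */
  uint64_t lo = cst & 0x000fffu;   /* bits 0..11, a plain 12-bit immediate */

  uint64_t tmp = x - hi;           /* sub  tmp, x1, #hi  */
  tmp = tmp - lo;                  /* subs tmp, tmp, #lo (sets the Z flag) */

  return tmp == 0;                 /* branch or cset on eq/ne */
}
```

Since hi + lo == CST for a 24-bit constant, the result is zero exactly when x == CST, so only equality and inequality can be decided this way.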


+;; b .Label
+(define_insn_and_split "*compare_condjump"
+  [(set (pc) (if_then_else (EQL
+ (match_operand:GPI 0 "register_operand" "r")
+ (match_operand:GPI 1 "aarch64_imm24" "n"))
+  (label_ref:P (match_operand 2 "" ""))
+  (pc)))]
+  "!aarch64_move_imm (INTVAL (operands[1]), mode)
+   && !aarch64_plus_operand (operands[1], mode)"
+  "#"
+  "&& true"
+  [(const_int 0)]
+  {
+HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
+HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
+rtx tmp = gen_reg_rtx (mode);

Can you guarantee we can always create this pseudo? What if we're a
post-register-allocation split?


+emit_insn (gen_add3 (tmp, operands[0], GEN_INT (-hi_imm)));
+emit_insn (gen_add3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
+rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
+rtx cmp_rtx = gen_rtx_fmt_ee (, mode, cc_reg, const0_rtx);
+emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
+DONE;
+  }
+)
+
  (define_expand "casesi"
[(match_operand:SI 0 "register_operand" "") ; Index
 (match_operand:SI 1 "const_int_operand" ""); Lower bound
@@ -2898,7 +2932,7 @@ (define_expand "cstore4"
"
  )
  
-(define_insn "*cstore_insn"

+(define_insn "aarch64_cstore"
[(set (match_operand:ALLI 0 "register_operand" "=r")
(match_operator:ALLI 1 "aarch64_comparison_operator"
 [(match_operand 2 "cc_register" "") (const_int 0)]))]
@@ -2907,6 +2941,39

[Bug tree-optimization/68455] [6 Regression] ICE: tree check: expected integer_cst, have plus_expr in decompose, at tree.h:5123

2015-11-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68455

Marek Polacek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot 
gnu.org

--- Comment #3 from Marek Polacek  ---
I'll have a look then.

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Richard Biener

On Mon, 23 Nov 2015, Jan Hubicka wrote:

> Hi,
> here is updated patch which I finally committed today.  It addresses all
> the comments and also fixes one nasty bug that really cost me a lot of time
> to understand.
> 
> +   /* LTO type merging does not make any difference between 
> +  component pointer types.  We may have
> +
> +  struct foo {int *a;};
> +
> +  as TYPE_CANONICAL of 
> +
> +  struct bar {float *a;};
> +
> +  Because accesses to int * and float * do not alias, we would get
> +  false negative when accessing the same memory location by
> +  float ** and bar *. We thus record the canonical type as:
> +
> +  struct {void *a;};
> +
> +  void * is special cased and works as a universal pointer type.
> +  Accesses to it conflicts with accesses to any other pointer
> +  type.  */
> 
> This problem manifested itself only as a lto-bootstrap miscompare on 32bit
> build and I spent a lot of time localizing the wrong code since it reproduces
> only in quite large programs where we get conflicts in canonical type merging
> like this.
> 
> The patch thus updates record_component_aliases to substitute 
> void_ptr_type for all pointer types. I re-did the stats.  Now the 
> improvement on dealII is 14% that is quite a bit lower than earlier, but 
> still substantial.  Since we have voidptr globbing counter, I know that 
> the number of disambiguations would go at least 19% up if we did not do 
> it.

Please in future leave patches for review again if you do such
big changes before committing...

I don't understand why we need this (testcase?) because get_alias_set ()
is supposed to do the alias-set of pointer globbing for us.

Thanks,
Richard.

> There is a lot of low hanging fruit in that area now, but the real
> solution is to track types that need to be merged by Fortran rules and
> not do all this fancy globbing for C/C++ types.  I will open branch for
> IPA work and try to prepare this for next stage 1.
> 
> bootstrapped/regtested x86_64-linux and ppc64-linux, earlier version tested
> on i386-linux and also on some bigger apps, committed
> 
> Note that we still have bootstrap miscompare with LTO build and
> --disable-checking, I am looking for that now.  Additionally after fixing
> the ICEs preventing us to build the gnat1 binary, gnat1 aborts. Both these
> are independent of the patch.
> 
> Honza
>   * lto.c (iterative_hash_canonical_type): Always recurse for pointers.
>   (gimple_register_canonical_type_1): Check that pointers do not get
>   canonical types.
>   (gimple_register_canonical_type): Do not register pointers.
> 
>   * tree.c (build_pointer_type_for_mode,build_reference_type_for_mode):
>   In LTO we do not compute TYPE_CANONICAL of pointers.
>   (gimple_canonical_types_compatible_p): Improve comments; sanity check
>   that pointers do not have canonical type that would make us believe
>   they are different.
>   * alias.c (get_alias_set): Do structural type equality on pointers;
>   enable pointer path for LTO; also glob pointer to vector with pointer
>   to vector element; glob pointers and references for LTO; do more strict
>   sanity checking about build_pointer_type returning the canonical type
>   which is also the main variant.
>   (record_component_aliases): When component type is pointer and we
>   do LTO; record void_type_node alias set.
> Index: tree.c
> ===
> --- tree.c(revision 230714)
> +++ tree.c(working copy)
> @@ -7919,7 +7919,8 @@ build_pointer_type_for_mode (tree to_typ
>TYPE_NEXT_PTR_TO (t) = TYPE_POINTER_TO (to_type);
>TYPE_POINTER_TO (to_type) = t;
>  
> -  if (TYPE_STRUCTURAL_EQUALITY_P (to_type))
> +  /* During LTO we do not set TYPE_CANONICAL of pointers and references.  */
> +  if (TYPE_STRUCTURAL_EQUALITY_P (to_type) || in_lto_p)
>  SET_TYPE_STRUCTURAL_EQUALITY (t);
>else if (TYPE_CANONICAL (to_type) != to_type || could_alias)
>  TYPE_CANONICAL (t)
> @@ -7987,7 +7988,8 @@ build_reference_type_for_mode (tree to_t
>TYPE_NEXT_REF_TO (t) = TYPE_REFERENCE_TO (to_type);
>TYPE_REFERENCE_TO (to_type) = t;
>  
> -  if (TYPE_STRUCTURAL_EQUALITY_P (to_type))
> +  /* During LTO we do not set TYPE_CANONICAL of pointers and references.  */
> +  if (TYPE_STRUCTURAL_EQUALITY_P (to_type) || in_lto_p)
>  SET_TYPE_STRUCTURAL_EQUALITY (t);
>else if (TYPE_CANONICAL (to_type) != to_type || could_alias)
>  TYPE_CANONICAL (t)
> @@ -13224,7 +13226,9 @@ type_with_interoperable_signedness (cons
> TBAA is concerned.  
> This function is used both by lto.c canonical type merging and by the
> verifier.  If TRUST_TYPE_CANONICAL we do not look into structure of types
> -   that have TYPE_CANONICAL defined and assume them equivalent.  */
> +   that have TYPE_CANONICAL defined and assume them equivalent.  This

Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Richard Biener

On Mon, Nov 23, 2015 at 11:10 AM, Ilya Enkovich  wrote:
> On 23 Nov 10:39, Richard Biener wrote:
>> On Fri, Nov 20, 2015 at 3:30 PM, Ilya Enkovich wrote:
>> > On 20 Nov 14:54, Richard Biener wrote:
>> >> On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich wrote:
>> >> > On 19 Nov 18:19, Richard Biener wrote:
>> >> >> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt wrote:
>> >> >> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
>> >> >> >> Currently we fold all memcpy/memmove calls with a known data size.
>> >> >> >> It causes two problems when used with Pointer Bounds Checker.
>> >> >> >> The first problem is that we may copy pointers as integer data
>> >> >> >> and thus lose bounds.  The second problem is that if we inline
>> >> >> >> memcpy, we also have to inline bounds copy and this may result
>> >> >> >> in a huge amount of code and significant compilation time growth.
>> >> >> >> This patch disables folding for functions we want to instrument.
>> >> >> >>
>> >> >> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
>> >> >> >> and regtested on x86_64-unknown-linux-gnu.
>> >> >> >
>> >> >> >Can't see anything wrong with it. Ok.
>> >> >>
>> >> >> But for small sizes this can have a huge impact on optimization.
>> >> >> Which is why we have the code in the first place.  I'd make the check
>> >> >> less broad, for example inlining copies of size less than a pointer
>> >> >> shouldn't be affected.
>> >> >
>> >> > Right.  We also may inline in case we know no pointers are copied.
>> >> > Below is a version with extended condition and a couple more tests.
>> >> > Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it OK for
>> >> > trunk and gcc-5-branch?
>> >> >
>> >> >>
>> >> >> Richard.
>> >> >>
>> >> >> >
>> >> >> >Bernd
>> >> >>
>> >> >>
>> >> >
>> >> > Thanks,
>> >> > Ilya
>> >> > --
>> >> > gcc/
>> >> >
>> >> > 2015-11-20  Ilya Enkovich  
>> >> >
>> >> > * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
>> >> > fold call if we are going to instrument it and it may
>> >> > copy pointers.
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> > 2015-11-20  Ilya Enkovich  
>> >> >
>> >> > * gcc.target/i386/mpx/pr68337-1.c: New test.
>> >> > * gcc.target/i386/mpx/pr68337-2.c: New test.
>> >> > * gcc.target/i386/mpx/pr68337-3.c: New test.
>> >> >
>> >> >
>> >> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
>> >> > index 1ab20d1..dd9f80b 100644
>> >> > --- a/gcc/gimple-fold.c
>> >> > +++ b/gcc/gimple-fold.c
>> >> > @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
>> >> >  #include "gomp-constants.h"
>> >> >  #include "optabs-query.h"
>> >> >  #include "omp-low.h"
>> >> > +#include "tree-chkp.h"
>> >> > +#include "ipa-chkp.h"
>> >> >
>> >> >
>> >> >  /* Return true when DECL can be referenced from current unit.
>> >> > @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
>> >> >unsigned int src_align, dest_align;
>> >> >tree off0;
>> >> >
>> >> > +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
>> >> > +pointers as wide integer) and also may result in huge function
>> >> > +size because of inlined bounds copy.  Thus don't inline for
>> >> > +functions we want to instrument in case pointers are copied.  */
>> >> > +  if (flag_check_pointer_bounds
>> >> > + && chkp_instrumentable_p (cfun->decl)
>> >> > + /* Even if data may contain pointers we can inline if copy
>> >> > +less than a pointer size.  */
>> >> > + && (!tree_fits_uhwi_p (len)
>> >> > + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
>> >>
>> >> || tree_to_uhwi (len) >= POINTER_SIZE_UNITS
>> >>
>> >> > + /* Check data type for pointers.  */
>> >> > + && (!TREE_TYPE (src)
>> >> > + || !TREE_TYPE (TREE_TYPE (src))
>> >> > + || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
>> >> > + || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)
>> >>
>> >> I don't think you can in any way rely on the pointer type of the src
>> >> argument as all pointer conversions are useless and memcpy and friends
>> >> take void * anyway.
>> >
>> > This check is looking for cases when we have type information indicating
>> > no pointers are copied.  In case of 'void *' we have to assume pointers
>> > are copied and inlining is undesired.  Test pr68337-2.c checks pointer
>> > type allows to enable inlining.  Looks like this check misses
>> > || !COMPLETE_TYPE_P(TREE_TYPE (TREE_TYPE (src)))?
>>
>> As said there is no information in the pointer / pointed-to type in GIMPLE.
>
> What does it mean?  We do have TREE_TYPE for used pointer and nested TREE_TYPE
> holding pointed-to type.  Is it some

[Bug target/68494] [ARM] Use vector multiply by lane

2015-11-23 Thread ramana at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68494

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2015-11-23
 CC||ramana at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Ramana Radhakrishnan  ---
NTAPS is undefined.

What's the current output and what output do you expect ?

[Patch] PR68137, drop constant overflow flag in adjust_range_with_scev when possible

2015-11-23 Thread Jiong Wang


As reported by pr68137 and pr68326, r230150 caused new issues.

Those ICEs are caused by adjust_range_with_scev getting range with
overflowed constants min or max. So given there are too many places to
generate OVF, we do a check in adjust_range_with_scev, to drop OVF flag
when it's unnecessary. This should fix the OVF side-effect caused by
r230150.

A simple regression testcase is included in this patch.

bootstrap OK on x86-64 and aarch64, regression ok on both.

For more background, please see discussion at

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68317

OK for trunk?

2015-11-23  Richard Biener  
Jiong Wang  

gcc/
  PR tree-optimization/68137
  PR tree-optimization/68326
  * tree-vrp.c (adjust_range_with_scev): Call drop_tree_overflow if the
  final min and max are not infinity.

gcc/testsuite/
  * gcc.dg/pr68137.c: New testcase.

--
Regards,
Jiong

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e2393e4..8efeb76 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -4331,6 +4331,17 @@ adjust_range_with_scev (value_range *vr, struct loop *loop,
 	  && is_positive_overflow_infinity (max)))
 return;
 
+  /* Even for valid range info, sometimes overflow flag will leak in.
+ As GIMPLE IL should have no constants with TREE_OVERFLOW set, we
+ drop them except for +-overflow_infinity which still need special
+ handling in vrp pass.  */
+  if (TREE_OVERFLOW_P (min)
+  && ! is_negative_overflow_infinity (min))
+min = drop_tree_overflow (min);
+  if (TREE_OVERFLOW_P (max)
+  && ! is_positive_overflow_infinity (max))
+max = drop_tree_overflow (max);
+
   set_value_range (vr, VR_RANGE, min, max, vr->equiv);
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr68137.c b/gcc/testsuite/gcc.dg/pr68137.c
new file mode 100644
index 000..a30e1ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68137.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void bar (int);
+
+void
+foo ()
+{
+ int index = 0;
+ for (index; index <= 10; index--)
+   /* Result of the following multiply will overflow
+  when converted to signed int.  */
+   bar((0xcafe + index) * 0xdead);
+}
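
To see why the multiply in the testcase overflows, here is a small sketch assuming a 32-bit int; the helper name mul_overflows_int is hypothetical. For index == 0 the product is 0xcafe * 0xdead = 51966 * 57005 = 2962321830, which is larger than INT_MAX (2147483647) on such a target.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: report whether A * B falls outside the range of
   int.  Assumes int is narrower than 64 bits, so the product can be
   formed exactly in int64_t.  */
bool
mul_overflows_int (int a, int b)
{
  int64_t wide = (int64_t) a * (int64_t) b;
  return wide > INT_MAX || wide < INT_MIN;
}
```

With these values mul_overflows_int (0xcafe, 0xdead) reports an overflow, matching the comment in the testcase that the result overflows when converted to signed int.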

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Eric Botcazou

> So there is indeed no point in trying to fix one or two cases, and we should
> instead instruct LTO somehow to treat System.Address as compatible with
> void* otherwise we'll run into endless troubles on that since using
> System.Address as void* is very common practice in Ada code.

Maybe we could apply this special treatment only to the void_ptr subtype of 
Interfaces.C.Extensions and require its use when interfacing with C.

-- 
Eric Botcazou

Re: [PATCH] Fix up reduction-1{1,2} testcases (PR middle-end/68221)

2015-11-23 Thread Richard Biener

On Fri, 20 Nov 2015, Jakub Jelinek wrote:

> Hi!
> 
> If C/C++ array section reductions have non-zero (positive) bias, it is
> implemented by declaring a smaller private array and subtracting the bias
> from the start of the private array (because valid code may only dereference
> elements from bias onwards).  But, this isn't something that is kosher in
> C/C++ pointer arithmetics and the alias oracle seems to get upset on that.
> So, the following patch fixes that by performing the subtraction on integral
> type instead of p+ -bias.

So this still does use the biased pointer because you do not
re-write accesses (where you could have applied the biasing to
the indexes / offsets), right?  Thus the patch is merely obfuscation
for GCC rather than making it kosher for C/C++ (you still have a
pointer pointing outside of the private array object)?
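
The integer-typed subtraction described in the quoted patch can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: the helper name biased_base is hypothetical, the bias is counted in elements here rather than in whatever unit omp-low.c uses, and, as noted above, the resulting pointer still points outside the private array, so its use relies on the implementation's integer-to-pointer mapping rather than on anything the C standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return PRIV moved back by BIAS elements, doing the
   subtraction on a pointer-sized integer instead of on the pointer.  */
int *
biased_base (int *priv, size_t bias)
{
  uintptr_t addr = (uintptr_t) priv;       /* pointer -> integer */
  addr -= (uintptr_t) (bias * sizeof (int));
  return (int *) addr;                     /* integer -> pointer */
}
```

With a private array int priv[4] standing for elements 6..9 of the original section, p = biased_base (priv, 6) lets the body keep writing p[6] through p[9], which land in priv[0] through priv[3].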

I still hope to have a look where the alias oracle gets things
wrong (well, if so by accident at least).

Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
> 
> 2015-11-20  Jakub Jelinek  
> 
>   PR middle-end/68221
>   * omp-low.c (lower_rec_input_clauses): If C/C++ array reduction
>   has non-zero bias, subtract it in integer type instead of
>   pointer plus of negated bias.
> 
>   * testsuite/libgomp.c/reduction-11.c: Remove xfail.
>   * testsuite/libgomp.c/reduction-12.c: Likewise.
>   * testsuite/libgomp.c++/reduction-11.C: Likewise.
>   * testsuite/libgomp.c++/reduction-12.C: Likewise.
> 
> --- gcc/omp-low.c.jj  2015-11-20 12:56:17.0 +0100
> +++ gcc/omp-low.c 2015-11-20 13:44:29.080374051 +0100
> @@ -,11 +,13 @@ lower_rec_input_clauses (tree clauses, g
>  
> if (!integer_zerop (bias))
>   {
> -   bias = fold_convert_loc (clause_loc, sizetype, bias);
> -   bias = fold_build1_loc (clause_loc, NEGATE_EXPR,
> -   sizetype, bias);
> -   x = fold_build2_loc (clause_loc, POINTER_PLUS_EXPR,
> -TREE_TYPE (x), x, bias);
> +   bias = fold_convert_loc (clause_loc, pointer_sized_int_node,
> +bias);
> +   yb = fold_convert_loc (clause_loc, pointer_sized_int_node,
> +  x);
> +   yb = fold_build2_loc (clause_loc, MINUS_EXPR,
> + pointer_sized_int_node, yb, bias);
> +   x = fold_convert_loc (clause_loc, TREE_TYPE (x), yb);
> yb = create_tmp_var (ptype, name);
> gimplify_assign (yb, x, ilist);
> x = yb;
> --- libgomp/testsuite/libgomp.c/reduction-11.c.jj 2015-11-05 16:03:53.0 +0100
> +++ libgomp/testsuite/libgomp.c/reduction-11.c 2015-11-20 13:38:24.448520879 +0100
> @@ -1,4 +1,4 @@
> -/* { dg-do run { xfail *-*-* } } */
> +/* { dg-do run } */
>  
>  char z[10] = { 0 };
>  
> --- libgomp/testsuite/libgomp.c/reduction-12.c.jj 2015-11-05 16:03:53.0 +0100
> +++ libgomp/testsuite/libgomp.c/reduction-12.c 2015-11-20 13:38:34.565378078 +0100
> @@ -1,4 +1,4 @@
> -/* { dg-do run { xfail *-*-* } } */
> +/* { dg-do run } */
>  
>  struct A { int t; };
>  struct B { char t; };
> --- libgomp/testsuite/libgomp.c++/reduction-11.C.jj 2015-11-05 16:03:53.0 +0100
> +++ libgomp/testsuite/libgomp.c++/reduction-11.C 2015-11-20 13:37:53.921951766 +0100
> @@ -1,4 +1,4 @@
> -// { dg-do run { xfail *-*-* } }
> +// { dg-do run }
>  
>  char z[10] = { 0 };
>  
> --- libgomp/testsuite/libgomp.c++/reduction-12.C.jj 2015-11-05 16:03:53.0 +0100
> +++ libgomp/testsuite/libgomp.c++/reduction-12.C 2015-11-20 13:38:03.983809741 +0100
> @@ -1,4 +1,4 @@
> -// { dg-do run { xfail *-*-* } }
> +// { dg-do run }
>  
>  template 
>  struct A
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread Joseph Myers

On Sun, 22 Nov 2015, David Malcolm wrote:

> Is there (or could there be) a precanned dg- directive to ask if ObjC is
> available?  

I don't think so.  Normal practice is that each language's tests are in 
appropriate directories for that language, with runtest never called with 
a --tool option for that language if it wasn't built.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [Ada] Introduce a Frontend_Exceptions flag in system.ads

2015-11-23 Thread Olivier Hainque


> On Nov 23, 2015, at 12:02 , Olivier Hainque  wrote:
> Then all the system.ads files will be updated with a correct value of the
> Frontend_Exceptions flags.

Here's the patch.



eh-flags-rts.diff
Description: Binary data

[Bug c++/67550] [5/6 regression] Initialization of local struct array with elements of global array yields zeros instead of initializer values

2015-11-23 Thread jwyatt at feralinteractive dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67550

--- Comment #4 from Jason Wyatt  ---
It appears that while parsing the initialiser for the array,
maybe_constant_init switches the var for a constructor. This constructor only
sets the m2 member variable. You can see the result in the gimple it produces:

testValue = 1;
var = {};
var.m2 = 2;
var.m1 = testValue;
array = {};
array[0].m2 = 2;

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Eric Botcazou

> You are right, TYPE_NONALIASED_COMPONENT is the wrong way.  I will fix it
> and try to come up with a testcase (TYPE_NONALIASED_COMPONENT is quite
> rarely used beast)

It's only used in Ada as far as I know, but is quite sensitive and quickly 
leads to wrong code if not handled properly in my experience, so this could 
well be responsible for the gnat1 miscompilation.

-- 
Eric Botcazou

[Bug tree-optimization/68493] [6 Regression] [graphite] ICE in copy_loop_phi_args

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68493

Richard Biener  changed:

   What|Removed |Added

 CC||spop at gcc dot gnu.org
   Target Milestone|--- |6.0

[Bug target/68483] [5/6 Regression] gcc 5.2: suboptimal code compared to 4.9

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68483

Richard Biener  changed:

   What|Removed |Added

 Target||i?86-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-11-23
  Component|other   |target
 Blocks||53947
   Target Milestone|--- |5.3
Summary|gcc 5.2: suboptimal code|[5/6 Regression] gcc 5.2:
   |compared to 4.9 |suboptimal code compared to
   ||4.9
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
Hum, on x86_64 I don't see either GCC 4.9 or GCC 5.2 vectorize the function at
all because they fail to analyze the evolution of the dataref for input[j] as
the initial j of the inner loop is not propagated as zero.

With i?86 I can confirm your observation but I don't see it fixed on trunk.

Note that this boils down to vector shift detection of permutes where (IIRC)
some patterns were not properly guarded on SSE3 support previously and a
wrong-code bug was fixed conservatively on the GCC 5 branch while missing
support was only implemented on trunk.

The failure to vectorize on x86_64 isn't a regression.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug driver/68463] Offloading fails when some objects are compiled with LTO and some without

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463

Richard Biener  changed:

   What|Removed |Added

   Keywords||lto
  Component|other   |driver

--- Comment #1 from Richard Biener  ---
> Or maybe just print an error during linking that offloading doesn't support
> mixing LTO and non-LTO objects (even if some of them doesn't have offload)?

That's the worst solution - having non-LTO objects is the whole point of
linker-plugin support.

I presume the same issue exists for GCC 5.

Re: [PATCH,RFC] Introduce RUN_UNDER_VALGRIND in test-suite

2015-11-23 Thread Martin Liška

On 11/21/2015 05:26 AM, Hans-Peter Nilsson wrote:
> On Thu, 19 Nov 2015, Martin Liška wrote:
>> Hello.
>>
>> In last two weeks I've removed couple of memory leaks, mainly tied to
>> middle-end.  Currently, a user of the GCC compiler can pass
>> '--enable-checking=valgrind' configure option that will run all commands
>> within valgrind environment, but as the valgrind runs just with '-q'
>> option, the result is not very helpful.
>>
>> I would like to start with another approach, where we can run all tests in
>> test-suite within the valgrind sandbox and return an exit code if there's
>> an error seen by the tool.  That unfortunately leads to many latent (maybe
>> false positives, FE issues, ...) that can be efficiently ignored by
>> valgrind suppressions file (the file is part of suggested patch).
>>
>> The first version of the valgrind.supp can survive running compilation of
>> tramp3d with -O2 and majority of tests in test-suite can successfully
>> finish. Most of memory leaks mentioned in the file can be eventually fixed.
> 
> I didn't quite understand the need for the suppression files.
> Is it like Markus said, only because valgrind annotations are
> not on by default?  Then let's change it so that's the default
> during DEV-PHASE = experimental (the development phase) or
> prerelease.  I really thought that was the case by now.
> (The suppression files are IMHO a useful addition to contrib/
> either way.)

Hi.

Well, the original motivation was basically to fill up the file with all
common errors (known issues) and to fix all newly introduced issues. That can
minimize the number of errors reported by the tool.

However, as I run complete test-suite for all default languages, I've seen:

== Statistics ==
Total number of errors: 249615
Number of different errors: 5848

Where two errors are different if they produce either a different message or
backtrace.
For complete list of errors (sorted by # of occurrences), download:

https://docs.google.com/uc?authuser=0=0B0pisUJ80pO1MENrWXBzak5naFk=download

> 
>> As I noticed in results log files, most of remaining issues are connected to
>> gcc.c and lto-wrapper.c files. gcc.c heavily manipulates with strings and it
>> would probably require usage of a string pool, that can easily be removed
>> eventually (just in case of --enable-valgrind-annotations).  The second
>> source file tends to produce memory leaks because of fork/exec constructs.
>> However both can be improved during next stage1.
>>
>> Apart from aforementioned issues, the compiler does not contain so many
>> issues and I think it's doable to prune them and rely on reported valgrind
>> errors.
>>
>> Patch touches many .exp files, but basically does just couple of
>> modifications:
>>
>> 1) gcc-defs.exp introduces new global variable run_under_valgrind
>> 2) new procedure dg-run-valgrind distinguishes between just passing options
>>    to 'dg-test', or runs 'dg-test' with additional flags that enable
>>    valgrind (using -wrapper)
>> 3) new procedure dg-runtest-valgrind does the similar
>> 4) many changes in corresponding *.exp files that utilize these procedures
>>
>> The patch should be definitely part of next stage1, but I would appreciate
>> any thoughts about the described approach?
> 
> IIRC you can replace the actual dg-runtest proc with your own
> (implementing a wrapper).  Grep around, I think we do that
> already.  That's certainly preferable instead of touching all
> callers.

You are right, the suggested patch was overkill, a wrapper should be fine for
that.
Currently I've been playing with a bit different approach (suggested by Markus),
where I would like to enable valgrind in gcc.c using an environment variable.

Question is if it should replace existing ENABLE_VALGRIND_CHECKING and how to
integrate it with a valgrind suppressions file?

Ideas are highly welcomed.

Thanks,
Martin

> 
>>
>> Thank you,
>> Martin
> 
> brgds, H-P
> 

From f0b211e4194e11e5ad52fa3b295a62f67b4060b8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 20 Nov 2015 09:46:09 +0100
Subject: [PATCH] Initial version of valgrind wrapper

---
 contrib/gcc.supp | 108 +++
 gcc/gcc.c|  51 +++---
 2 files changed, 139 insertions(+), 20 deletions(-)
 create mode 100644 contrib/gcc.supp

diff --git a/contrib/gcc.supp b/contrib/gcc.supp
new file mode 100644
index 000..deefb28
--- /dev/null
+++ b/contrib/gcc.supp
@@ -0,0 +1,108 @@
+{
+   cpp_get_buff
+   Memcheck:Leak
+   match-leak-kinds: possible
+   fun:malloc
+   fun:xmalloc
+   fun:new_buff
+   fun:_cpp_get_buff
+   ...
+}
+{
+   gnu-as
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:malloc
+   ...
+   obj:/usr/bin/as
+   ...
+}
+{
+   gnu-as
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:calloc
+   fun:xcalloc
+   ...
+   obj:/usr/bin/as
+   ...
+}
+{
+   todo-fix-mpfr
+

Re: [PATCH, PR68460] Always call free_stmt_vec_info_vec in gather_scalar_reductions

2015-11-23 Thread Richard Biener

On Fri, Nov 20, 2015 at 4:57 PM, Tom de Vries  wrote:
> [ was: Re: [PATCH] Fix parloops gimple_uid usage ]
>
> On 09/10/15 23:09, Tom de Vries wrote:
>>
>> @@ -2392,6 +2397,9 @@ gather_scalar_reductions (loop_p loop,
>> reduction_info_table_type *reduction_list
>> loop_vec_info simple_inner_loop_info = NULL;
>> bool allow_double_reduc = true;
>>
>> +  if (!stmt_vec_info_vec.exists ())
>> +init_stmt_vec_info_vec ();
>> +
>> simple_loop_info = vect_analyze_loop_form (loop);
>> if (simple_loop_info == NULL)
>>   return;
>> @@ -2453,9 +2461,16 @@ gather_scalar_reductions (loop_p loop,
>> reduction_info_table_type *reduction_list
>> destroy_loop_vec_info (simple_loop_info, true);
>> destroy_loop_vec_info (simple_inner_loop_info, true);
>>
>> +  /* Release the claim on gimple_uid.  */
>> +  free_stmt_vec_info_vec ();
>> +
>
>
> With the src/libgomp/testsuite/libgomp.c/pr46886.c testcase, compiled in
> addition with -ftree-vectorize, I ran into an ICE:
> ...
> src/libgomp/testsuite/libgomp.c/pr46886.c:8:5: internal compiler error: in
> init_stmt_vec_info_vec, at tree-vect-stmts.c:8250
>  int foo (void)
>  ^~~
>
> 0x1196082 init_stmt_vec_info_vec()
> src/gcc/tree-vect-stmts.c:8250
> 0x11c3ed4 vectorize_loops()
> src/gcc/tree-vectorizer.c:510
> 0x10a7ea5 execute
> src/gcc/tree-ssa-loop.c:276
> ...
>
> The ICE is caused by the fact that init_stmt_vec_info_vec is called at the
> start of vectorize_loops, while stmt_vec_info_vec is not empty. I traced
> this back to gather_scalar_reduction, where we call init_stmt_vec_info_vec,
> but we skip free_stmt_vec_info_vec if we take the early-out for
> simple_loop_info == NULL.
>
> This patch fixes the ICE by making sure we always call
> free_stmt_vec_info_vec in gather_scalar_reduction.
>
> Passes cc1/f951 rebuild and autopar testing.
>
> OK for stage3 trunk if bootstrap and regtest succeeds?

Ok.

Richard.

> Thanks,
> - Tom
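
The failure mode described in this thread can be modelled in a few lines of C. This is a sketch only: vec_exists, toy_init_vec, toy_free_vec and toy_gather are hypothetical stand-ins for stmt_vec_info_vec, init_stmt_vec_info_vec, free_stmt_vec_info_vec and gather_scalar_reductions, and the real init routine asserts rather than returning a status.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global stmt_vec_info_vec.  */
static bool vec_exists = false;

/* Returns false if the vector already exists; the real
   init_stmt_vec_info_vec asserts instead, which is the reported ICE.  */
static bool
toy_init_vec (void)
{
  if (vec_exists)
    return false;
  vec_exists = true;
  return true;
}

static void
toy_free_vec (void)
{
  vec_exists = false;
}

/* Model of gather_scalar_reductions after the fix: both the early-out
   and the normal path release the vector.  */
static void
toy_gather (bool loop_form_ok)
{
  if (!vec_exists)
    toy_init_vec ();

  if (!loop_form_ok)
    {
      toy_free_vec ();   /* the call that was skipped before the fix */
      return;
    }

  /* ... reduction analysis would happen here ... */
  toy_free_vec ();
}
```

With the release on the early-out path, a later toy_init_vec () call (standing in for the one at the start of vectorize_loops) succeeds regardless of which path toy_gather took.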

[Bug tree-optimization/68460] ICE in init_stmt_vec_info_vec with -ftree-vectorize and -ftree-parallelize-loops

2015-11-23 Thread vries at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68460

--- Comment #2 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Nov 23 09:45:38 2015
New Revision: 230742

URL: https://gcc.gnu.org/viewcvs?rev=230742&root=gcc&view=rev
Log:
Always call free_stmt_vec_info_vec in gather_scalar_reductions

2015-11-23  Tom de Vries  

PR tree-optimization/68460
* tree-parloops.c (gather_scalar_reductions): Also call
free_stmt_vec_info_vec if simple_loop_info == NULL.

* gcc.dg/autopar/pr68460.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/autopar/pr68460.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-parloops.c

Re: [Patch, vrp] Allow VRP type conversion folding only for widenings upto word mode

2015-11-23 Thread Richard Biener

On Fri, 20 Nov 2015, Jeff Law wrote:

> On 11/20/2015 10:04 AM, Senthil Kumar Selvaraj wrote:
> > On Thu, Nov 19, 2015 at 10:31:41AM -0700, Jeff Law wrote:
> > > On 11/18/2015 11:20 PM, Senthil Kumar Selvaraj wrote:
> > > > On Wed, Nov 18, 2015 at 09:36:21AM +0100, Richard Biener wrote:
> > > > > 
> > > > > Otherwise ok.
> > > > 
> > > > See modified patch below. If you think vrp98.c is unnecessary, feel free
> > > > to dump it :).
> > > > 
> > > > If ok, could you commit it for me please? I don't have commit access.
> > > > 
> > > > Regards
> > > > Senthil
> > > > 
> > > > gcc/ChangeLog
> > > > 2015-11-19  Senthil Kumar Selvaraj  
> > > > 
> > > > * tree.h (desired_pro_or_demotion_p): New function.
> > > > * tree-vrp.c (simplify_cond_using_ranges): Call it.
> > > > 
> > > > gcc/testsuite/ChangeLog
> > > > 2015-11-19  Senthil Kumar Selvaraj  
> > > > 
> > > > * gcc.dg/tree-ssa/vrp98.c: New testcase.
> > > > * gcc.target/avr/uint8-single-reg.c: New testcase.
> > > I went ahead and committed this as-is.
> > > 
> > > I do think the vrp98 testcase is useful as it verifies that VRP is doing
> > > what we want in a target independent way.  It's a good complement to the
> > > AVR
> > > specific testcase.
> > 
> > I see the same problem on gcc-5-branch as well. Would it be ok to
> > backport the fix to that branch as well?
> That's a call for the release managers.  I typically don't backport anything
> except ICE or incorrect code generation fixes as I tend to be very
> conservative on what goes onto a release branch.
> 
> Jakub, Richi or Joseph would need to ack into a release branch.

As this fixes a regression it qualifies in principle.  But as
it is an optimization regression only I'd prefer to wait a bit to look
for fallout.

Richard.

> jeff
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[Bug debug/66432] [4.9 Regression] libgomp.c/appendix-a/a.29.1.c -O2 -g: type mismatch between an SSA_NAME and its symbol

2015-11-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66432

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|openmp                      |
            Summary|[4.9/5/6 Regression]        |[4.9 Regression]
                   |libgomp.c/appendix-a/a.29.1 |libgomp.c/appendix-a/a.29.1
                   |.c -O2 -g: type mismatch    |.c -O2 -g: type mismatch
                   |between an SSA_NAME and its |between an SSA_NAME and its
                   |symbol                      |symbol

--- Comment #11 from Jakub Jelinek  ---
Fixed for 5.3+ so far.

Re: [PATCH] Fix GC ICE during simd clone creation (PR middle-end/68339)

2015-11-23 Thread Richard Biener

On Fri, Nov 20, 2015 at 9:03 PM, Jakub Jelinek  wrote:
> Hi!
>
> node->get_body () can run various IPA passes and ggc_collect in them

Aww.  Looks like we never implemented that ggc_defer_collecting idea
(don't remember the context this popped up, maybe it was when we
introduced TODO_do_not_ggc_collect).  At least late IPA passes
might be affected by this issue as well.

Richard.

>, so
> it is undesirable to hold pointers to GC memory in automatic vars over it.
> While I could store those vars (clone_info, clone and id) into special GTY
> vars just to avoid collecting them, it seems easier to call node->get_body
> () earlier.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
> and 5 branch.
>
> 2015-11-20  Jakub Jelinek  
>
> PR middle-end/68339
> * omp-low.c (expand_simd_clones): Call node->get_body () before
> allocating stuff in GC.
>
> * gcc.dg/vect/pr68339.c: New test.
>
> --- gcc/omp-low.c.jj2015-11-18 11:19:19.0 +0100
> +++ gcc/omp-low.c   2015-11-20 12:56:17.075193601 +0100
> @@ -18319,6 +18319,10 @@ expand_simd_clones (struct cgraph_node *
>&& TYPE_ARG_TYPES (TREE_TYPE (node->decl)) == NULL_TREE)
>  return;
>
> +  /* Call this before creating clone_info, as it might ggc_collect.  */
> +  if (node->definition && node->has_gimple_body_p ())
> +node->get_body ();
> +
>do
>  {
>/* Start with parsing the "omp declare simd" attribute(s).  */
> --- gcc/testsuite/gcc.dg/vect/pr68339.c.jj  2015-11-20 13:10:47.756905395 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr68339.c 2015-11-20 13:08:13.0 +0100
> @@ -0,0 +1,17 @@
> +/* PR middle-end/68339 */
> +/* { dg-do compile } */
> +/* { dg-options "--param ggc-min-heapsize=0 --param ggc-min-expand=0 -fopenmp-simd" } */
> +
> +#pragma omp declare simd notinbranch
> +int
> +f1 (int x)
> +{
> +  return x;
> +}
> +
> +#pragma omp declare simd notinbranch
> +int
> +f2 (int x)
> +{
> +  return x;
> +}
>
> Jakub
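
The ordering problem behind this fix can be illustrated with a toy collector. This is a sketch under stated assumptions, not GCC's ggc: toy_alloc, toy_collect and toy_get_body are hypothetical, and the collector has no roots at all, so anything referenced only from a local variable is reclaimed.

```c
#include <stddef.h>

struct toy_obj { int live; int value; };
static struct toy_obj toy_heap[4];

#define TOY_HEAP_SIZE (sizeof toy_heap / sizeof toy_heap[0])

/* Hand out the first free slot, or NULL if the toy heap is full.  */
static struct toy_obj *
toy_alloc (int value)
{
  for (size_t i = 0; i < TOY_HEAP_SIZE; i++)
    if (!toy_heap[i].live)
      {
        toy_heap[i].live = 1;
        toy_heap[i].value = value;
        return &toy_heap[i];
      }
  return NULL;
}

/* Toy collector with no roots: every object is reclaimed.  */
static void
toy_collect (void)
{
  for (size_t i = 0; i < TOY_HEAP_SIZE; i++)
    toy_heap[i].live = 0;
}

/* Stand-in for node->get_body (), which may trigger a collection.  */
static void
toy_get_body (void)
{
  toy_collect ();
}

/* Old order: allocate, then call something that may collect.
   Returns 1 if the object survived.  */
static int
toy_alloc_then_get_body (void)
{
  struct toy_obj *clone_info = toy_alloc (42);
  toy_get_body ();
  return clone_info != NULL && clone_info->live;
}

/* Order used by the patch: call get_body first, allocate afterwards.  */
static int
toy_get_body_then_alloc (void)
{
  toy_get_body ();
  struct toy_obj *clone_info = toy_alloc (42);
  return clone_info != NULL && clone_info->live;
}
```

In this model the first function loses its object and the second keeps it, which is the reason for moving the get_body call ahead of the allocations rather than adding GTY roots for the locals.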

[Bug tree-optimization/68327] [6 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vect_is_simple_use, at tree-vect-stmts.c:8562

2015-11-23 Thread ienkovich at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68327

--- Comment #4 from Ilya Enkovich  ---
Author: ienkovich
Date: Mon Nov 23 10:01:51 2015
New Revision: 230743

URL: https://gcc.gnu.org/viewcvs?rev=230743&root=gcc&view=rev
Log:
gcc/

PR tree-optimization/68327
* tree-vect-loop.c (vect_determine_vectorization_factor): Don't
compute vectype for non-relevant mask producers.
* gcc/tree-vect-stmts.c (vectorizable_comparison): Check stmt
relevance earlier.

gcc/testsuite/

PR tree-optimization/68327
* gcc.dg/pr68327.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/pr68327.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vect-stmts.c

Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-23 Thread Richard Biener

On Fri, 20 Nov 2015, Tom de Vries wrote:

> On 20/11/15 14:29, Richard Biener wrote:
> > I agree it's somewhat of an odd behavior but all passes should
> > either be placed in a sub-pipeline with an outer
> > loop_optimizer_init()/finalize () call or call both themselves.
> 
> Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the loop
> pipeline.
> 
> We could use the style used in pass_slp_vectorize::execute:
> ...
> pass_slp_vectorize::execute (function *fun)
> {
>   basic_block bb;
> 
>   bool in_loop_pipeline = scev_initialized_p ();
>   if (!in_loop_pipeline)
> {
>   loop_optimizer_init (LOOPS_NORMAL);
>   scev_initialize ();
> }
> 
>   ...
> 
>   if (!in_loop_pipeline)
> {
>   scev_finalize ();
>   loop_optimizer_finalize ();
> }
> ...
> 
> Although that doesn't strike me as particularly clean.

At least it would be a consistent "unclean" style.  So yes, the
above would work for me.

Thanks,
Richard.
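
The in_loop_pipeline style agreed on here boils down to "only tear down what you set up yourself". A minimal C model, assuming hypothetical toy_* names in place of loop_optimizer_init/finalize and scev_initialize/finalize:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the loop optimizer / SCEV state.  */
static bool toy_loops_initialized = false;
static int toy_init_calls = 0;
static int toy_finalize_calls = 0;

static void
toy_loop_init (void)
{
  toy_loops_initialized = true;
  toy_init_calls++;
}

static void
toy_loop_finalize (void)
{
  toy_loops_initialized = false;
  toy_finalize_calls++;
}

/* A pass that works both inside and outside the loop pipeline: it
   initializes the loop state only when nobody else has, and finalizes
   only what it initialized itself.  */
static void
toy_pass_execute (void)
{
  bool in_loop_pipeline = toy_loops_initialized;

  if (!in_loop_pipeline)
    toy_loop_init ();

  /* ... pass body would run here ... */

  if (!in_loop_pipeline)
    toy_loop_finalize ();
}
```

Run standalone, the pass performs one init and one finalize; run after an outer toy_loop_init (), it leaves the outer state alone, so the surrounding pipeline is not broken.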

Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Ilya Enkovich

On 23 Nov 10:39, Richard Biener wrote:
> On Fri, Nov 20, 2015 at 3:30 PM, Ilya Enkovich  wrote:
> > On 20 Nov 14:54, Richard Biener wrote:
> >> On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich  
> >> wrote:
> >> > On 19 Nov 18:19, Richard Biener wrote:
> >> >> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
> >> >>  wrote:
> >> >> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> >> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> >> >> It causes two problems when used with Pointer Bounds Checker.
> >> >> >> The first problem is that we may copy pointers as integer data
> >> >> >> and thus lose bounds.  The second problem is that if we inline
> >> >> >> memcpy, we also have to inline bounds copy and this may result
> >> >> >> in a huge amount of code and significant compilation time growth.
> >> >> >> This patch disables folding for functions we want to instrument.
> >> >> >>
> >> >> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> >> >> and regtested on x86_64-unknown-linux-gnu.
> >> >> >
> >> >> >Can't see anything wrong with it. Ok.
> >> >>
> >> >> But for small sizes this can have a huge impact on optimization.  Which 
> >> >> is why we have the code in the first place.  I'd make the check less 
> >> >> broad, for example inlining copies of size less than a pointer 
> >> >> shouldn't be affected.
> >> >
> >> > Right.  We also may inline in case we know no pointers are copied.  
> >> > Below is a version with extended condition and a couple more tests.  
> >> > Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it OK for 
> >> > trunk and gcc-5-branch?
> >> >
> >> >>
> >> >> Richard.
> >> >>
> >> >> >
> >> >> >Bernd
> >> >>
> >> >>
> >> >
> >> > Thanks,
> >> > Ilya
> >> > --
> >> > gcc/
> >> >
> >> > 2015-11-20  Ilya Enkovich  
> >> >
> >> > * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
> >> > fold call if we are going to instrument it and it may
> >> > copy pointers.
> >> >
> >> > gcc/testsuite/
> >> >
> >> > 2015-11-20  Ilya Enkovich  
> >> >
> >> > * gcc.target/i386/mpx/pr68337-1.c: New test.
> >> > * gcc.target/i386/mpx/pr68337-2.c: New test.
> >> > * gcc.target/i386/mpx/pr68337-3.c: New test.
> >> >
> >> >
> >> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> >> > index 1ab20d1..dd9f80b 100644
> >> > --- a/gcc/gimple-fold.c
> >> > +++ b/gcc/gimple-fold.c
> >> > @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "gomp-constants.h"
> >> >  #include "optabs-query.h"
> >> >  #include "omp-low.h"
> >> > +#include "tree-chkp.h"
> >> > +#include "ipa-chkp.h"
> >> >
> >> >
> >> >  /* Return true when DECL can be referenced from current unit.
> >> > @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
> >> >unsigned int src_align, dest_align;
> >> >tree off0;
> >> >
> >> > +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> >> > +pointers as wide integer) and also may result in huge function
> >> > +size because of inlined bounds copy.  Thus don't inline for
> >> > +functions we want to instrument in case pointers are copied.  */
> >> > +  if (flag_check_pointer_bounds
> >> > + && chkp_instrumentable_p (cfun->decl)
> >> > + /* Even if data may contain pointers we can inline if copy
> >> > +less than a pointer size.  */
> >> > + && (!tree_fits_uhwi_p (len)
> >> > + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
> >>
> >> || tree_to_uhwi (len) >= POINTER_SIZE_UNITS
> >>
> >> > + /* Check data type for pointers.  */
> >> > + && (!TREE_TYPE (src)
> >> > + || !TREE_TYPE (TREE_TYPE (src))
> >> > + || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
> >> > + || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)
> >>
> >> I don't think you can in any way rely on the pointer type of the src 
> >> argument
> >> as all pointer conversions are useless and memcpy and friends take void *
> >> anyway.
> >
> > This check is looking for cases when we have type information indicating
> > no pointers are copied.  In case of 'void *' we have to assume pointers
> > are copied and inlining is undesired.  Test pr68337-2.c checks that pointer
> > type information allows inlining to be enabled.  It looks like this check
> > is missing || !COMPLETE_TYPE_P(TREE_TYPE (TREE_TYPE (src)))?
> 
> As said there is no information in the pointer / pointed-to type in GIMPLE.

What does that mean?  We do have a TREE_TYPE for the pointer used and a nested
TREE_TYPE holding the pointed-to type.  Is it some random invalid type?

> 
> >>
> >> Note that you also disable memmove to memcpy simplification with this
> >> early check.
> >
> > Doesn't matter for MPX which uses the same implementation for both cases.
>

Re: RFA: PATCH to match.pd for c++/68385

2015-11-23 Thread Richard Biener

On Sat, Nov 21, 2015 at 7:57 PM, Marc Glisse  wrote:
> On Sat, 21 Nov 2015, Richard Biener wrote:
>
>> On November 20, 2015 8:58:15 PM GMT+01:00, Jason Merrill
>>  wrote:
>>>
>>> In this bug, we hit the (A & sign-bit) != 0 -> A < 0 transformation.
>>> Because of delayed folding, the operands aren't fully folded yet, so we
>>>
>>> have NOP_EXPRs around INTEGER_CSTs, and so calling wi::only_sign_bit_p
>>> ICEs.  We've been seeing several similar bugs, where code calls
>>> integer_zerop and therefore assumes that they have an INTEGER_CST, but
>>> in fact integer_zerop does STRIP_NOPS.
>>>
>>> This patch changes the pattern to only match if the operand is actually
>>>
>>> an INTEGER_CST.  Alternatively we could call tree_strip_nop_conversions
>>>
>>> on the operand, but I would expect that to have issues when the
>>> conversion changes the signedness of the type.
>>>
>>> OK if testing passes?
>>
>>
>> What happens if we remove the nops stripping from integer_zerop?
>
>
> I had the same reaction.
>
>> Do other integer predicates strip nops?
>
>
> Yes, they do.
>
> I believe I added one or two of those, and the reason I added STRIP_NOPS is
> because they started as a copy-paste of integer_zerop...

Ok...

Jason, from looking at the PR's backtrace I see the C++ FE does things like

  if (complain & tf_warning)
warn_logical_operator (loc, code, boolean_type_node,
   code_orig_arg1, fold (arg1),
   code_orig_arg2, fold (arg2));

but that's in principle a no-no if arg1's operands are not folded.
Delayed folding needs to happen recursively, bottom-up.  Folders
generally do not expect unfolded operands like (int) 1.

There is c-common.c:c_fully_fold () which does this properly but with

  /* This function is not relevant to C++ because C++ folds while
 parsing, and may need changes to be correct for C++ when C++
 stops folding while parsing.  */
  if (c_dialect_cxx ())
gcc_unreachable ();

not sure if the C++ FE can re-use this for the diagnostic cases.

Richard.



> --
> Marc Glisse

Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-11-23 Thread Jakub Jelinek

On Mon, Nov 23, 2015 at 12:31:24PM +0100, Thomas Schwinge wrote:
> Hi Jakub!
> 
> On Fri, 17 Jul 2015 15:05:59 +0200, Jakub Jelinek  wrote:
> > [...] "omp declare target link" [...]
> 
> > This patch only marks them with the new attribute, [...]
> 
> > --- gcc/c/c-parser.c.jj 2015-07-16 18:09:25.0 +0200
> > +++ gcc/c/c-parser.c2015-07-17 14:11:08.553694975 +0200
> 
> >  static void
> >  c_parser_omp_declare_target (c_parser *parser)
> >  {
> > [...]
> > +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> > +{
> > +  tree t = OMP_CLAUSE_DECL (c), id;
> > +  tree at1 = lookup_attribute ("omp declare target", DECL_ATTRIBUTES (t));
> > +  tree at2 = lookup_attribute ("omp declare target link",
> > +  DECL_ATTRIBUTES (t));
> > +  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINK)
> > +   {
> > + id = get_identifier ("omp declare target link");
> > + std::swap (at1, at2);
> > +   }
> > +  else
> > +   id = get_identifier ("omp declare target");
> 
> Is it intentional that you didn't add "omp declare target link" to
> gcc/c-family/c-common.c:c_common_attribute_table, next to the existing
> "omp declare target"?

No.  But the link attribute support is still unfinished, Ilya is working on
the support.

Jakub

Re: [PATCH, 4/16] Implement -foffload-alias

2015-11-23 Thread Richard Biener

On Sat, 21 Nov 2015, Tom de Vries wrote:

> On 13/11/15 12:39, Jakub Jelinek wrote:
> > On Fri, Nov 13, 2015 at 12:29:51PM +0100, Richard Biener wrote:
> > > > thanks for the explanation. Filed as PR68331 - '[meta-bug] fipa-pta
> > > > issues'.
> > > > 
> > > > Any feedback on the '#pragma GCC offload-alias=' bit
> > > > above?
> > > > Is that sort of what you had in mind?
> > > 
> > > Yes.  Whether that makes sense is another question of course.  You can
> > > annotate memory references with MR_DEPENDENCE_BASE/CLIQUE yourself
> > > as well if you know dependences without the users intervention.
> > 
> > I really don't like even the GCC offload-alias, I just don't see anything
> > special on the offload code.  Not to mention that the same issue is already
> > with other outlined functions, like OpenMP tasks or parallel regions, those
> > aren't offloaded, yet they can suffer from worse alias/points-to analysis
> > too.
> 
> AFAIU there is one aspect that is different for offloaded code: the setup of
> the data on the device.
> 
> Consider this example:
> ...
> unsigned int a[N];
> unsigned int b[N];
> unsigned int c[N];
> 
> int
> main (void)
> {
>   ...
> 
> #pragma acc kernels copyin (a) copyin (b) copyout (c)
>   {
> for (COUNTERTYPE ii = 0; ii < N; ii++)
>   c[ii] = a[ii] + b[ii];
>   }
> 
>   ...
> ...
> 
> At gimple level, we have:
> ...
> #pragma omp target oacc_kernels \
>   map(force_from:c [len: 2097152]) \
>   map(force_to:b [len: 2097152]) \
>   map(force_to:a [len: 2097152])
> ...
> 
> [ The meaning of the force_from/force_to mappings is given in
> include/gomp-constants.h:
> ...
> /* Allocate.  */
> GOMP_MAP_FORCE_ALLOC = (GOMP_MAP_FLAG_FORCE | GOMP_MAP_ALLOC),
> /* ..., and copy to device.  */
> GOMP_MAP_FORCE_TO = (GOMP_MAP_FLAG_FORCE | GOMP_MAP_TO),
> /* ..., and copy from device.  */
> GOMP_MAP_FORCE_FROM = (GOMP_MAP_FLAG_FORCE | GOMP_MAP_FROM),
> /* ..., and copy to and from device.  */
> GOMP_MAP_FORCE_TOFROM = (GOMP_MAP_FLAG_FORCE | GOMP_MAP_TOFROM),
> ...  ]
> 
> So before calling the offloaded function, a separate alloc is done for a, b
> and c, and the base pointers of the newly allocated objects are passed to the
> offloaded function.
> 
> This means we can mark those base pointers as restrict in the offloaded
> function.
> 
> Attached proof-of-concept patch implements that.
> 
> > We simply have some compiler internal interface between the caller and
> > callee of the outlined regions, each interface in between those has
> > its own structure type used to communicate the info;
> > we can attach attributes on the fields, or some flags to indicate some
> > properties interesting from aliasing POV.
> > We don't really need to perform
> > full IPA-PTA, perhaps it would be enough to a) record somewhere in cgraph
> > the relationship in between such callers and callees (for offloading regions
> > we already have "omp target entrypoint" attribute on the callee and a
> > single caller), tell LTO if possible not to split those into different
> > partitions if easily possible, and then just for these pairs perform
> > aliasing/points-to analysis in the caller and the result record using
> > cliques/special attributes/whatever to the callee side, so that the callee
> > (outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias analysis.
> 
> As a start, is the approach of this patch OK?

Works for me but leaving to Jakub to review for correctness.

Richard.

> It will allow us to commit the oacc kernels patch series with the ability to
> parallelize non-trivial testcases, and work on improving the alias bit after
> that.
> 
> Thanks,
> - Tom
> 
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
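
What marking the base pointers as restrict means for the outlined function can be sketched in plain C. This is an illustration, not the code the compiler generates: toy_kernel is a hypothetical stand-in for the offloaded function, and its three parameters model the separately allocated device copies of a, b and c.

```c
#include <stddef.h>

/* Because each array gets its own allocation on the device, the three
   base pointers cannot alias, so they may be declared restrict.  */
static void
toy_kernel (const unsigned int *restrict a,
            const unsigned int *restrict b,
            unsigned int *restrict c,
            size_t n)
{
  for (size_t i = 0; i < n; i++)
    c[i] = a[i] + b[i];
}
```

With the restrict qualifiers the loop carries no possible dependence between the store to c and the loads from a and b, which is the information the parallelizer needs.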

[Bug tree-optimization/68465] pass_lim doesn't detect identical loop entry conditions

2015-11-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68465

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #1 from Richard Biener  ---
I don't think that's LIM's job then.  Iterating LIM won't help here w/o
intermediate optimization.

[Bug tree-optimization/65178] incorrect -Wmaybe-uninitialized when using nested loops

2015-11-23 Thread winter-...@bfw-online.de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65178

Leon Winter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|5.0                         |5.2.1

--- Comment #4 from Leon Winter  ---
Bug still persists.

Re: [PATCH v2] Add uaddv_optab, usubv4_optab

2015-11-23 Thread Richard Henderson


On 11/22/2015 05:57 PM, Segher Boessenkool wrote:
> Hi Richard,
>
> On Sun, Nov 22, 2015 at 11:38:31AM +0100, Richard Henderson wrote:
>> One of which I believe I've worked around in the i386 backend, but I
>> believe to be a latent problem within combine.
>>
>> With the following patch, disable the add3_*_overflow_2 patterns.
>> Then compile c-c++-common/torture/builtin-arith-overflow-4.c with -O2 and
>> you'll see
>>
>>   t151_2add:
>>         testb   %dil, %dil
>>         leal    -1(%rdi), %eax
>>         jne     .L644
>
> 0xff + x < 0xff  (everything as unsigned char) is the same as  x != 0 .

You'd think yes.  But certainly something right there triggered the abort that
fails the test case.  Perhaps I simply mis-identified the error, but the "fix"
for this fixed the other as well.



r~
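
The identity quoted above can be checked exhaustively. The sketch below assumes an 8-bit unsigned char; the function name is illustrative. It counts the inputs for which the truncated addition test and the simple non-zero test disagree.

```c
/* Count the values of x in [0, 255] for which
   (unsigned char) (0xff + x) < 0xff  differs from  x != 0.  */
static int
toy_count_mismatches (void)
{
  int mismatches = 0;

  for (unsigned int x = 0; x <= 0xff; x++)
    {
      /* The addition wraps modulo 256 when truncated to unsigned char.  */
      unsigned char sum = (unsigned char) (0xffu + x);
      int overflow_test = sum < 0xff;
      int nonzero_test = x != 0;

      if (overflow_test != nonzero_test)
        mismatches++;
    }

  return mismatches;
}
```

For x == 0 the sum stays at 0xff, and for any other x it wraps to x - 1, which is below 0xff, so the two tests agree on every input.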

[Bug tree-optimization/68326] ICE at -O3 on x86_64-linux-gnu in set_value_range, at tree-vrp.c:380

2015-11-23 Thread jiwang at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68326

--- Comment #2 from Jiong Wang  ---
Author: jiwang
Date: Mon Nov 23 12:14:05 2015
New Revision: 230754

URL: https://gcc.gnu.org/viewcvs?rev=230754&root=gcc&view=rev
Log:
[Patch] Drop constant overflow flag in adjust_range_with_scev when possible

2015-11-23  Richard Biener  
Jiong Wang  

gcc/
  PR tree-optimization/68317
  PR tree-optimization/68326
  * tree-vrp.c (adjust_range_with_scev): Call drop_tree_overflow if the
  final min and max are not infinity.

gcc/testsuite/
  * gcc.dg/pr68317.c: New testcase.


Added:
trunk/gcc/testsuite/gcc.dg/pr68317.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c

[Bug tree-optimization/68317] [6 regression] ice in set_value_range, at tree-vrp.c:380

2015-11-23 Thread jiwang at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68317

Jiong Wang  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Jiong Wang  ---
Mark as fixed.

[Bug tree-optimization/68317] [6 regression] ice in set_value_range, at tree-vrp.c:380

2015-11-23 Thread jiwang at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68317

--- Comment #11 from Jiong Wang  ---
Author: jiwang
Date: Mon Nov 23 12:14:05 2015
New Revision: 230754

URL: https://gcc.gnu.org/viewcvs?rev=230754&root=gcc&view=rev
Log:
[Patch] Drop constant overflow flag in adjust_range_with_scev when possible

2015-11-23  Richard Biener  
Jiong Wang  

gcc/
  PR tree-optimization/68317
  PR tree-optimization/68326
  * tree-vrp.c (adjust_range_with_scev): Call drop_tree_overflow if the
  final min and max are not infinity.

gcc/testsuite/
  * gcc.dg/pr68317.c: New testcase.


Added:
trunk/gcc/testsuite/gcc.dg/pr68317.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vrp.c

[Bug target/68497] ICE: in output_387_binary_op, at config/i386/i386.c:17689 with -fno-checking

2015-11-23 Thread miyuki at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68497

Mikhail Maltsev  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2015-11-23
                 CC|                            |miyuki at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |miyuki at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Mikhail Maltsev  ---
Confirmed. I have a patch and I'll post it shortly (after regtest).

[Bug c/68499] New: Unclear STDC FP_CONTRACT behavior in non-standard modes

2015-11-23 Thread vincent-gcc at vinc17 dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68499

Bug ID: 68499
   Summary: Unclear STDC FP_CONTRACT behavior in non-standard
modes
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincent-gcc at vinc17 dot net
  Target Milestone: ---

Created attachment 36807
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36807&action=edit
example of C program based on the STDC FP_CONTRACT pragma

The following applies to:
gcc (Debian 20151030-1) 6.0.0 20151031 (experimental) [trunk revision 229615]

When I compile a C program like the attached one with

#pragma STDC FP_CONTRACT OFF

in a standard mode, the pragma is now taken into account as expected (see
PR37845 / r204460). However, in a non-standard mode such as the default gnu99
(?), it is not taken into account (a FMA is generated for an operation like
x*y+z), but one gets no warnings either. One should have one of the following
behaviors in non-standard modes:

1. The pragma is taken into account.

2. The pragma is not taken into account, but one gets a warning (making it
unknown in non-standard modes is OK).

If possible, (1) is probably the best choice, at least in gnu99 and gnu11
modes, as the user may want to disable contraction for some floating-point
algorithms while still being able to use specific GNU extensions.
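
A minimal example of the kind of code the report is about (this is not the attached test case, and toy_mul_add is an illustrative name). With contraction disabled, x*y+z should be evaluated as a rounded multiply followed by a rounded add and not as a single fused multiply-add; whether a given GCC version and language mode actually honours the pragma is the question the report raises.

```c
/* Ask the implementation not to contract floating-point expressions.  */
#pragma STDC FP_CONTRACT OFF

static double
toy_mul_add (double x, double y, double z)
{
  /* A candidate for FMA contraction when contraction is enabled.  */
  return x * y + z;
}
```

Inspecting the generated assembly for an FMA instruction (on a target that has one) is the way to see which behaviour a particular compiler and mode give; the numeric result differs only for inputs where the intermediate product is not exactly representable.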

Re: [AArch64][dejagnu][PATCH 5/7] Dejagnu support for ARMv8.1 Adv.SIMD.

2015-11-23 Thread James Greenhalgh

On Tue, Oct 27, 2015 at 03:32:04PM +, Matthew Wahab wrote:
> On 24/10/15 08:16, Bernhard Reutner-Fischer wrote:
> >On October 23, 2015 2:24:26 PM GMT+02:00, Matthew Wahab wrote:
> >>The ARMv8.1 architecture extension adds two Adv.SIMD instructions.
> >>This patch adds support in Dejagnu for ARMv8.1 Adv.SIMD specifiers and
> >>checks.
> >>
> >>The new test options are
> >>- { dg-add-options arm_v8_1a_neon }: Add compiler options needed to
> >>   enable ARMv8.1 Adv.SIMD.
> >>- { dg-require-effective-target arm_v8_1a_neon_hw }: Require a target
> >>   capable of executing ARMv8.1 Adv.SIMD instructions.
> >>
> >
> >Please error with something more meaningful than FOO, !__ARM_FEATURE_QRDMX 
> >comes to mind.
> >
> >TIA,
> >
> 
> I've reworked the patch so that the error is "__ARM_FEATURE_QRDMX not
> defined" and also strengthened the check_effective_target tests.
> 
> Retested for aarch64-none-elf with cross-compiled check-gcc on an
> ARMv8.1 emulator. Also tested with a version of the compiler that
> doesn't define the ACLE feature macro.

Hi Matthew,

I have a couple of comments below. Neither need to block the patch, but
I'd appreciate a reply before I say OK.

> From b12969882298cb79737e882c48398c58a45161b9 Mon Sep 17 00:00:00 2001
> From: Matthew Wahab 
> Date: Mon, 26 Oct 2015 14:58:36 +
> Subject: [PATCH 5/7] [Testsuite] Add dejagnu options for armv8.1 neon
> 
> Change-Id: Ib58b8c4930ad3971af3ea682eda043e14cd2e8b3
> ---
>  gcc/testsuite/lib/target-supports.exp | 56 ++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 4d5b0a3d..0fb679d 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -2700,6 +2700,16 @@ proc add_options_for_arm_v8_neon { flags } {
>  return "$flags $et_arm_v8_neon_flags -march=armv8-a"
>  }
>  
> +# Add the options needed for ARMv8.1 Adv.SIMD.
> +
> +proc add_options_for_arm_v8_1a_neon { flags } {
> +if { [istarget aarch64*-*-*] } {
> + return "$flags -march=armv8.1-a"

Should this be -march=armv8.1-a+simd or some other feature flag?

> +} else {
> + return "$flags"
> +}
> +}
> +
>  proc add_options_for_arm_crc { flags } {
>  if { ! [check_effective_target_arm_crc_ok] } {
>  return "$flags"
> @@ -2984,7 +2994,8 @@ foreach { armfunc armflag armdef } { v4 "-march=armv4 
> -marm" __ARM_ARCH_4__
>v7r "-march=armv7-r" __ARM_ARCH_7R__
>v7m "-march=armv7-m -mthumb" 
> __ARM_ARCH_7M__
>v7em "-march=armv7e-m -mthumb" 
> __ARM_ARCH_7EM__
> -  v8a "-march=armv8-a" __ARM_ARCH_8A__ } {
> +  v8a "-march=armv8-a" __ARM_ARCH_8A__
> +  v8_1a "-march=armv8.1a" __ARM_ARCH_8A__ } {
>  eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
>   proc check_effective_target_arm_arch_FUNC_ok { } {
>   if { [ string match "*-marm*" "FLAG" ] &&
> @@ -3141,6 +3152,25 @@ proc check_effective_target_arm_neonv2_hw { } {
>  } [add_options_for_arm_neonv2 ""]]
>  }
>  
> +# Return 1 if the target supports the ARMv8.1 Adv.SIMD extension, 0
> +# otherwise.  The test is valid for AArch64.
> +
> +proc check_effective_target_arm_v8_1a_neon_ok_nocache { } {
> +if { ![istarget aarch64*-*-*] } {
> + return 0
> +}
> +return [check_no_compiler_messages_nocache arm_v8_1a_neon_ok assembly {
> + #if !defined (__ARM_FEATURE_QRDMX)
> + #error "__ARM_FEATURE_QRDMX not defined"
> + #endif
> +} [add_options_for_arm_v8_1a_neon ""]]
> +}
> +
> +proc check_effective_target_arm_v8_1a_neon_ok { } {
> +return [check_cached_effective_target arm_v8_1a_neon_ok \
> + check_effective_target_arm_v8_1a_neon_ok_nocache]
> +}
> +
>  # Return 1 if the target supports executing ARMv8 NEON instructions, 0
>  # otherwise.
>  
> @@ -3159,6 +3189,30 @@ proc check_effective_target_arm_v8_neon_hw { } {
>  } [add_options_for_arm_v8_neon ""]]
>  }
>  
> +# Return 1 if the target supports executing the ARMv8.1 Adv.SIMD extension, 0
> +# otherwise.  The test is valid for AArch64.
> +
> +proc check_effective_target_arm_v8_1a_neon_hw { } {
> +if { ![check_effective_target_arm_v8_1a_neon_ok] } {
> + return 0;
> +}
> +return [check_runtime_nocache arm_v8_1a_neon_hw_available {
> + int
> + main (void)
> + {
> +   long long a = 0, b = 1;
> +   long long result = 0;
> +
> +   asm ("sqrdmlah %s0,%s1,%s2"
> +: "=w"(result)
> +: "w"(a), "w"(b)
> +: /* No clobbers.  */);

Hm, those types look wrong. I guess this works, but it is an unusual way
to write it. I presume this is to avoid including arm_neon.h each time, but
you

[Bug c/68499] Unclear STDC FP_CONTRACT behavior in non-standard modes

2015-11-23 Thread vincent-gcc at vinc17 dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68499

--- Comment #1 from Vincent Lefèvre  ---
Well, actually the pragma is ignored in all cases. The fix was to set the
default to OFF in the standard modes. So, currently, one should get a warning
in non-standard modes.

[Bug other/68500] New: Remove in_loop_pipeline usage

2015-11-23 Thread vries at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68500

            Bug ID: 68500
           Summary: Remove in_loop_pipeline usage
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Consider pass_slp_vectorize::execute
...
pass_slp_vectorize::execute (function *fun)
{
  basic_block bb;

  bool in_loop_pipeline = scev_initialized_p ();
  if (!in_loop_pipeline)
    {
      loop_optimizer_init (LOOPS_NORMAL);
      scev_initialize ();
    }

  ...

  if (!in_loop_pipeline)
    {
      scev_finalize ();
      loop_optimizer_finalize ();
    }
...

It uses an in_loop_pipeline variable, initialized using scev_initialized_p to
detect whether the pass is run in the loop pipeline, to allow different
behaviour inside and outside the loop pipeline.

We want a cleaner way to allow passes to run correctly inside and outside the
loop pipeline.
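
To make the pattern concrete, here is a minimal, self-contained sketch of one
way the paired setup/teardown could be centralized: an RAII guard that does the
setup only when it is not already in place. This is an illustration, not a
proposed patch. The class name loop_pipeline_guard, the helper run_pass_body,
and everything in namespace stub are hypothetical stand-ins so that the example
compiles outside GCC; only the four function names and LOOPS_NORMAL come from
the snippet above. Note that the guard still relies on scev_initialized_p for
detection, so it only moves the check into one place and may not be the
"cleaner way" this report is asking for.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GCC internals named in the report.
// They only track state so the sketch can be compiled and run on its own.
namespace stub
{
  bool loops_ready = false;
  bool scev_ready = false;
  int init_calls = 0;
  int fini_calls = 0;

  const unsigned LOOPS_NORMAL = 1;

  bool scev_initialized_p () { return scev_ready; }
  void loop_optimizer_init (unsigned) { loops_ready = true; ++init_calls; }
  void scev_initialize () { scev_ready = true; }
  void scev_finalize () { assert (scev_ready); scev_ready = false; }
  void loop_optimizer_finalize ()
  {
    assert (loops_ready);
    loops_ready = false;
    ++fini_calls;
  }
}

// Hypothetical guard: sets up loop structures and scev only when they are
// not already available, and tears down only what it set up itself.
class loop_pipeline_guard
{
public:
  loop_pipeline_guard ()
    : m_owner (!stub::scev_initialized_p ())
  {
    if (m_owner)
      {
        stub::loop_optimizer_init (stub::LOOPS_NORMAL);
        stub::scev_initialize ();
      }
  }

  ~loop_pipeline_guard ()
  {
    if (m_owner)
      {
        stub::scev_finalize ();
        stub::loop_optimizer_finalize ();
      }
  }

  loop_pipeline_guard (const loop_pipeline_guard &) = delete;
  loop_pipeline_guard &operator= (const loop_pipeline_guard &) = delete;

  // True when this guard performed the setup (pass ran outside the pipeline).
  bool owns_setup () const { return m_owner; }

private:
  bool m_owner;
};

// Hypothetical pass body: the explicit in_loop_pipeline flag disappears from
// the pass itself. Returns whether the guard had to do the setup.
bool
run_pass_body ()
{
  loop_pipeline_guard guard;
  return guard.owns_setup ();
}
```

With such a guard, the pass body no longer carries two matching
"if (!in_loop_pipeline)" blocks, and the teardown cannot be forgotten on an
early return.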

[gomp4] Merge trunk r230274 (2015-11-12) into gomp-4_0-branch

2015-11-23 Thread Thomas Schwinge

Hi!

Committed to gomp-4_0-branch in r230749:

commit 4002b8b54d3e1e9ac049446339fc02e3fd192f43
Merge: 018ba48 5902f28
Author: tschwinge 
Date:   Mon Nov 23 10:41:31 2015 +

    svn merge -r 230255:230274 svn+ssh://gcc.gnu.org/svn/gcc/trunk

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@230749 138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas

