date:20140109

Re: RFC Asan instrumentation control

2014-01-09 Thread Jakub Jelinek

On Fri, Jan 10, 2014 at 11:49:43AM +0400, Maxim Ostapenko wrote:
> 2014-01-10  Max Ostapenko  
> 
>   * c-c++-common/asan/no-asan-stack.c: New test.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/asan/no-asan-stack.c
> @@ -0,0 +1,17 @@
> +/* { dg-do assemble { target { x86_64-unknown-linux-gnu } } } */
> +/* { dg-options "-save-temps --param asan-stack=0" } */

If you want to limit to x86_64-linux only, please do:
target { { i?86-*-linux* x86_64-*-linux* } && lp64 }
instead.  Also, what advantages do you see for trying to assemble
the result?  If you instead just do dg-do compile, you can drop -save-temps
from dg-options and /* { dg-final { cleanup-saved-temps } } */.
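
Applied to the two directive lines quoted above, that suggestion would look
roughly like this (a sketch only; the scan-assembler-not line is kept as
posted, and the cleanup-saved-temps line goes away because -save-temps is no
longer used):

```c
/* { dg-do compile { target { { i?86-*-linux* x86_64-*-linux* } && lp64 } } } */
/* { dg-options "--param asan-stack=0" } */

/* ... test body unchanged ... */

/* { dg-final { scan-assembler-not "0x41b58ab3|0x41B58AB3|1102416563" } } */
```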

> +#include <string.h>
> +
> +volatile int one = 1;
> +
> +int
> +main ()
> +{
> +  volatile char a1[] = {one, 2, 3, 4};
> +  volatile char a2[] = {1, 2*one, 3, 4};
> +  volatile int res = memcmp ((void *)a1,(void *)a2, 5 + one);
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-not "0x41b58ab3|0x41B58AB3|1102416563" } } */
> +/* { dg-final { cleanup-saved-temps } } */


Jakub

Re: RFC Asan instrumentation control

2014-01-09 Thread Maxim Ostapenko


Hi!

>>> * c-c++-common/asan/no-asan-stack.c (this triggers read overflow
>>> because we haven't found a cross-platform way to grep for stack
>>> redzones instrumentation)
>>
>> I'd prefer no test in that case, or just some semi-platform specific test
>> (scan that the 0x41b58ab3 constant doesn't appear in say some late RTL dump,
>> or perhaps just assembly (just scan it with lower and upper case and decimal
>> too)).
>
> Thanks, committed in 206458 without c-c++-common/asan/no-asan-stack.c testfile.
> I'll fix this test according to your recommendations a bit later.

I've fixed the c-c++-common/asan/no-asan-stack.c testfile. Tested on
x86_64-unknown-linux-gnu.

Ok to commit?

-Maxim.
2014-01-10  Max Ostapenko  

	* c-c++-common/asan/no-asan-stack.c: New test.

diff --git a/gcc/testsuite/c-c++-common/asan/no-asan-stack.c b/gcc/testsuite/c-c++-common/asan/no-asan-stack.c
new file mode 100644
index 000..d81b834
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/no-asan-stack.c
@@ -0,0 +1,17 @@
+/* { dg-do assemble { target { x86_64-unknown-linux-gnu } } } */
+/* { dg-options "-save-temps --param asan-stack=0" } */
+#include <string.h>
+
+volatile int one = 1;
+
+int
+main ()
+{
+  volatile char a1[] = {one, 2, 3, 4};
+  volatile char a2[] = {1, 2*one, 3, 4};
+  volatile int res = memcmp ((void *)a1,(void *)a2, 5 + one);
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "0x41b58ab3|0x41B58AB3|1102416563" } } */
+/* { dg-final { cleanup-saved-temps } } */
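
The three alternatives in the scan-assembler-not pattern are one and the same
constant spelled as lowercase hex, uppercase hex and decimal. A small
standalone check of that, as a sketch that is not part of the patch (the
helper name same_constant is made up for illustration):

```c
#include <stdlib.h>

/* Parse every spelling as an integer literal (base 0 accepts both the
   0x... forms and plain decimal) and report whether they all denote the
   same value.  The common value is stored in *value.  Requires n >= 1.  */
static int
same_constant (const char *const *spellings, int n, unsigned long *value)
{
  *value = strtoul (spellings[0], NULL, 0);
  for (int i = 1; i < n; i++)
    if (strtoul (spellings[i], NULL, 0) != *value)
      return 0;
  return 1;
}
```

Called with the three strings from the pattern, this reports that they agree
on the value 0x41b58ab3.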

Go patch committed: Use backend interface for slice info

2014-01-09 Thread Ian Lance Taylor

This patch from Chris Manghane changes gccgo to use the backend
interface for slice info.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 44fc257ad7f2 go/expressions.cc
--- a/go/expressions.cc	Thu Jan 09 15:25:44 2014 -0800
+++ b/go/expressions.cc	Thu Jan 09 22:31:00 2014 -0800
@@ -3060,6 +3060,9 @@
   Expression*
   do_lower(Gogo*, Named_object*, Statement_inserter*, int);
 
+  Expression*
+  do_flatten(Gogo*, Named_object*, Statement_inserter*);
+
   bool
   do_is_constant() const;
 
@@ -3203,6 +3206,25 @@
   return this;
 }
 
+// Flatten a type conversion by using a temporary variable for the slice
+// in slice to string conversions.
+
+Expression*
+Type_conversion_expression::do_flatten(Gogo*, Named_object*,
+   Statement_inserter* inserter)
+{
+  if (this->type()->is_string_type()
+  && this->expr_->type()->is_slice_type()
+  && !this->expr_->is_variable())
+{
+  Temporary_statement* temp =
+  Statement::make_temporary(NULL, this->expr_, this->location());
+  inserter->insert(temp);
+  this->expr_ = Expression::make_temporary_reference(temp, this->location());
+}
+  return this;
+}
+
 // Return whether a type conversion is a constant.
 
 bool
@@ -3361,47 +3383,24 @@
 }
   else if (type->is_string_type() && expr_type->is_slice_type())
 {
-  if (!DECL_P(expr_tree))
-	expr_tree = save_expr(expr_tree);
-
-  Type* int_type = Type::lookup_integer_type("int");
-  tree int_type_tree = type_to_tree(int_type->get_backend(gogo));
-
+  Location location = this->location();
   Array_type* a = expr_type->array_type();
   Type* e = a->element_type()->forwarded();
   go_assert(e->integer_type() != NULL);
-  tree valptr = fold_convert(const_ptr_type_node,
- a->value_pointer_tree(gogo, expr_tree));
-  tree len = a->length_tree(gogo, expr_tree);
-  len = fold_convert_loc(this->location().gcc_location(), int_type_tree,
- len);
+  go_assert(this->expr_->is_variable());
+
+  Runtime::Function code;
   if (e->integer_type()->is_byte())
-	{
-	  static tree byte_array_to_string_fndecl;
-	  ret = Gogo::call_builtin(&byte_array_to_string_fndecl,
-   this->location(),
-   "__go_byte_array_to_string",
-   2,
-   type_tree,
-   const_ptr_type_node,
-   valptr,
-   int_type_tree,
-   len);
-	}
+code = Runtime::BYTE_ARRAY_TO_STRING;
   else
-	{
-	  go_assert(e->integer_type()->is_rune());
-	  static tree int_array_to_string_fndecl;
-	  ret = Gogo::call_builtin(&int_array_to_string_fndecl,
-   this->location(),
-   "__go_int_array_to_string",
-   2,
-   type_tree,
-   const_ptr_type_node,
-   valptr,
-   int_type_tree,
-   len);
-	}
+{
+  go_assert(e->integer_type()->is_rune());
+  code = Runtime::INT_ARRAY_TO_STRING;
+}
+  Expression* valptr = a->get_value_pointer(gogo, this->expr_);
+  Expression* len = a->get_length(gogo, this->expr_);
+  Expression* a2s_expr = Runtime::make_call(code, location, 2, valptr, len);
+  ret = a2s_expr->get_tree(context);
 }
   else if (type->is_slice_type() && expr_type->is_string_type())
 {
@@ -6595,6 +6594,7 @@
 {
   std::swap(left_type, right_type);
   std::swap(left_tree, right_tree);
+  std::swap(left_expr, right_expr);
 }
 
   if (right_type->is_nil_type())
@@ -6603,7 +6603,8 @@
 	  && left_type->array_type()->length() == NULL)
 	{
 	  Array_type* at = left_type->array_type();
-	  left_tree = at->value_pointer_tree(context->gogo(), left_tree);
+  left_expr = at->get_value_pointer(context->gogo(), left_expr);
+  left_tree = left_expr->get_tree(context);
 	  right_tree = fold_convert(TREE_TYPE(left_tree), null_pointer_node);
 	}
   else if (left_type->interface_type() != NULL)
@@ -7037,6 +7038,9 @@
   Expression*
   do_lower(Gogo*, Named_object*, Statement_inserter*, int);
 
+  Expression*
+  do_flatten(Gogo*, Named_object*, Statement_inserter*);
+
   bool
   do_is_constant() const;
 
@@ -7367,6 +7371,36 @@
   return this;
 }
 
+// Flatten a builtin call expression.  This turns the arguments of copy and
+// append into temporary expressions.
+
+Expression*
+Builtin_call_expression::do_flatten(Gogo*, Named_object*,
+Statement_inserter* inserter)
+{
+  if (this->code_ == BUILTIN_APPEND
+  || this->code_ == BUILTIN_COPY)
+{
+  Location loc = this->location();
+  Type* at = this->args()->front()->type();
+  for (Expression_list::iterator pa = this->args()->begin();
+   pa != this->args()->end();
+   ++pa)
+{
+  if ((*pa)->is_nil_expression())
+*pa = Expression::make_slice_composite_literal(at, NULL, loc);
+  if (!(*pa)->is_variable())
+{
+  Temporary_statement* temp =
+  Sta

[PATCH] MIPS: improve Loongson-2E/2F/3A detection for -march=native

2014-01-09 Thread Huacai Chen

Hi all,

For human readability, I submitted a patch to the Linux kernel to display
Loongson-2E/2F/3A in /proc/cpuinfo. But that breaks -march=native; this
patch fixes that.

Regards,

Huacai

2014-01-10  Huacai Chen  

* config/mips/driver-native.c (host_detect_local_cpu): Improve
Loongson-2E/2F/3A detection.

---
 gcc/config/mips/driver-native.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/mips/driver-native.c b/gcc/config/mips/driver-native.c
index 3f1a8d0..63848d2 100644
--- a/gcc/config/mips/driver-native.c
+++ b/gcc/config/mips/driver-native.c
@@ -58,11 +58,17 @@ host_detect_local_cpu (int argc, const char **argv)
 if (strncmp (buf, "cpu model", sizeof ("cpu model") - 1) == 0)
   {
if (strstr (buf, "Godson2 V0.2") != NULL
-   || strstr (buf, "Loongson-2 V0.2") != NULL)
+|| strstr (buf, "Loongson-2 V0.2") != NULL
+|| strstr (buf, "Loongson-2E") != NULL)
  cpu = "loongson2e";
else if (strstr (buf, "Godson2 V0.3") != NULL
-|| strstr (buf, "Loongson-2 V0.3") != NULL)
+|| strstr (buf, "Loongson-2 V0.3") != NULL
+|| strstr (buf, "Loongson-2F") != NULL)
  cpu = "loongson2f";
+   else if (strstr (buf, "Godson3 V0.5") != NULL
+|| strstr (buf, "Loongson-3 V0.5") != NULL
+|| strstr (buf, "Loongson-3A") != NULL)
+ cpu = "loongson3a";
else if (strstr (buf, "SiByte SB1") != NULL)
  cpu = "sb1";
else if (strstr (buf, "R5000") != NULL)
-- 
1.8.5.2
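
For readers who want to try the matching outside the driver, the same
substring tests can be pulled into a standalone function. This is only a
sketch: the function name loongson_cpu_from_model is hypothetical, the
strings are the ones from the patch above, and the real code in
host_detect_local_cpu goes on to test other CPU models afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Map a "cpu model" line from /proc/cpuinfo to a -march= name, using the
   substrings from the patch.  Returns NULL when no Loongson model matches.  */
static const char *
loongson_cpu_from_model (const char *line)
{
  if (strstr (line, "Godson2 V0.2") != NULL
      || strstr (line, "Loongson-2 V0.2") != NULL
      || strstr (line, "Loongson-2E") != NULL)
    return "loongson2e";
  if (strstr (line, "Godson2 V0.3") != NULL
      || strstr (line, "Loongson-2 V0.3") != NULL
      || strstr (line, "Loongson-2F") != NULL)
    return "loongson2f";
  if (strstr (line, "Godson3 V0.5") != NULL
      || strstr (line, "Loongson-3 V0.5") != NULL
      || strstr (line, "Loongson-3A") != NULL)
    return "loongson3a";
  return NULL;
}
```

Both the old version-style strings and the new human-readable names map to
the same result, so kernels with and without the /proc/cpuinfo change are
handled.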

Re: [gofrontend-dev] libgo patch committed: Fix 32-bit memory allocation

2014-01-09 Thread Ian Lance Taylor

On Thu, Jan 9, 2014 at 6:34 PM, Michael Hudson-Doyle
 wrote:
>
> Ian Lance Taylor  writes:
>
>> This patch to libgo fixes memory allocation on 32-bit systems when a lot
>> of memory has been allocated.  The problem is described in this patch to
>> the master repository: https://codereview.appspot.com/49460043 .
>
> Here's a patch for the 4.8 branch if you are interested.  I haven't
> tested it yet -- well, it's in progress but I'm not going to hang around
> long enough for it to finish today.

Thanks.  Committed to 4.8 branch after testing.

Ian

Re: Improving mklog [was: Re: RFC Asan instrumentation control]

2014-01-09 Thread Yury Gribov


> I hacked a simple addition to mklog which skips unchanged functions
> in diff-log while adding function names to the final ChangeLog.
>
> New mklog results were verified by testsuite which compares reference
> ChangeLogs of patches from gcc trunk with logs generated by mklog.
>
> Patched mklog considerably reduced the number of unchanged functions
> in ChangeLog.

This patch indeed dramatically reduces the number of false reports for
real-world patches.


-Y

Re: [PATCH] libsanitizer demangling using cp-demangle.c

2014-01-09 Thread Konstantin Serebryany

On Thu, Jan 9, 2014 at 5:57 PM, Jakub Jelinek  wrote:
> On Thu, Jan 09, 2014 at 05:51:05PM +0400, Konstantin Serebryany wrote:
>> On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek  wrote:
>> > On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote:
>> >> There was a recent buggy patch to the demangler that added calls to
>> >> malloc and realloc (2013-10-25 Gary Benson ).
>> >> That patch must be fixed or reverted before the 4.9 release.  The main
>> >> code in the demangler must not call malloc/realloc.
>> >>
>> >> When that patch is fixed, you can use the cplus_demangle_v3_callback
>> >> function to get a demangler that never calls malloc.
>> >
>> > AFAIK Gary is working on a fix, when that is fixed, with the following
>> > patch libsanitizer (when using libbacktrace for symbolization) will not
>> > use system malloc/realloc/free for the demangling at all.
>> >
>> > Tested on x86_64-linux (-m64/-m32).  Note that the changes for the 3 files
>> > unfortunately will need to be applied upstream to compiler-rt, is that
>> > possible?
>> >
>> > 2013-12-10  Jakub Jelinek  
>> >
>> > * sanitizer_common/sanitizer_symbolizer_libbacktrace.h
>> > (LibbacktraceSymbolizer::Demangle): New declaration.
>> > * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
>>
>> sanitizer_symbolizer_posix_libcdep.cc is the file from upstream.
>> If it gets any change in the GCC variant, I will not be able to do
>> merges from upstream until the same code is applied upstream.
>
> Sure, but we are nearing GCC 4.9 stage3 finish and really need to demangle
> the libbacktrace provided output.  Has the compiler-rt situation been
> cleared up?

I hope it just did (see the fresh Chandler's reply).

--kcc

> Haven't seen any follow-ups after Chandler's reversion.
> So, this change is meant to be temporary, with hope that in upstream this
> will be resolved, either with the same patch or something similar.
>
> Jakub

RE: [PATCH] Add zero-overhead looping for xtensa backend

2014-01-09 Thread Yangfei (Felix)

And here is the xtensa configuration tested (include/xtensa-config.h): 

#define XCHAL_HAVE_BE    0
#define XCHAL_HAVE_LOOPS 1


> 
> Hi Sterling,
> 
> Please note that version 2 of the patch is for gcc trunk, not for
> the gcc-4.8 branch.
> Since the doloop_end pattern format has changed, this patch needs
> a small adaptation in order to work on gcc-4.8.
> Although I tested it on gcc-4.8, I think the testing results still
> hold for trunk.
> Cheers,
> Felix
> 
> 
> On Thu, Jan 9, 2014 at 11:08 PM, Felix Yang  wrote:
> > Hi Sterling,
> >
> > Attached please find version 2 of the patch.
> >
> > I applied this updated patch (with small adaptations) to gcc-4.8.2
> > and carried out some tests.
> > I can execute the testcases in a simulator that supports
> > zero-overhead looping instructions.
> >
> > First of all, I can successfully build libgcc, libstdc++ and
> > newlib for xtensa with this patch.
> > The newly built xtensa gcc also passes the testsuite that comes with
> > newlib.
> > I also tested the cases under the gcc/testsuite/gcc.c-torture/execute/
> > directory; more than 800 cases were tested.
> > The results show no new failures with this patch, compared
> > with the original gcc version.
> > Is that OK?
> >
> > I also double-checked the loop relaxation issue with binutils-2.24
> > (the latest version).
> > The results show that the assembler can do loop relaxation when the
> > loop target is too far away (> 256 bytes).
> > And this is the reason why I don't check the size of the loop.
> >
> >
> > Index: gcc/ChangeLog
> > ===
> > --- gcc/ChangeLog(revision 206463)
> > +++ gcc/ChangeLog(working copy)
> > @@ -1,3 +1,18 @@
> > +2014-01-09  Felix Yang  
> > +
> > +* config/xtensa/xtensa.c (xtensa_reorg): New.
> > +(xtensa_reorg_loops): New.
> > +(xtensa_can_use_doloop_p): New.
> > +(xtensa_invalid_within_doloop): New.
> > +(hwloop_optimize): New.
> > +(hwloop_fail): New.
> > +(hwloop_pattern_reg): New.
> > +(xtensa_emit_loop_end): Modified to emit the zero-overhead loop end label.
> > +(xtensa_doloop_hooks): Define.
> > +* config/xtensa/xtensa.md (doloop_end): New.
> > +(zero_cost_loop_start): Rewritten.
> > +(zero_cost_loop_end): Rewritten.
> > +
> >  2014-01-09  Richard Biener  
> >
> >  PR tree-optimization/59715
> > Index: gcc/config/xtensa/xtensa.md
> > ===
> > --- gcc/config/xtensa/xtensa.md(revision 206463)
> > +++ gcc/config/xtensa/xtensa.md(working copy)
> > @@ -1,6 +1,7 @@
> >  ;; GCC machine description for Tensilica's Xtensa architecture.
> >  ;; Copyright (C) 2001-2014 Free Software Foundation, Inc.
> >  ;; Contributed by Bob Wilson (bwil...@tensilica.com) at Tensilica.
> > +;; Zero-overhead looping support by Felix Yang (fei.yang0...@gmail.com).
> >
> >  ;; This file is part of GCC.
> >
> > @@ -35,6 +36,8 @@
> >(UNSPEC_TLS_CALL9)
> >(UNSPEC_TP10)
> >(UNSPEC_MEMW11)
> > +  (UNSPEC_LSETUP_START  12)
> > +  (UNSPEC_LSETUP_END13)
> >
> >(UNSPECV_SET_FP1)
> >(UNSPECV_ENTRY2)
> > @@ -1289,41 +1292,67 @@
> > (set_attr "length""3")])
> >
> >
> > +;; Hardware loop support.
> > +
> >  ;; Define the loop insns used by bct optimization to represent the
> > -;; start and end of a zero-overhead loop (in loop.c).  This start
> > -;; template generates the loop insn; the end template doesn't generate
> > -;; any instructions since loop end is handled in hardware.
> > +;; start and end of a zero-overhead loop.  This start template generates
> > +;; the loop insn; the end template doesn't generate any instructions since
> > +;; loop end is handled in hardware.
> >
> >  (define_insn "zero_cost_loop_start"
> >[(set (pc)
> > -(if_then_else (eq (match_operand:SI 0 "register_operand" "a")
> > -  (const_int 0))
> > -  (label_ref (match_operand 1 "" ""))
> > -  (pc)))
> > -   (set (reg:SI 19)
> > -(plus:SI (match_dup 0) (const_int -1)))]
> > +(if_then_else (ne (match_operand:SI 0 "register_operand" "a")
> > +  (const_int 1))
> > +  (label_ref (match_operand 1 "" ""))
> > +  (pc)))
> > +   (set (match_operand:SI 2 "register_operand" "+a0")
> > +(plus (match_dup 2)
> > +  (const_int -1)))
> > +   (unspec [(const_int 0)] UNSPEC_LSETUP_START)]
> >""
> > -  "loopnez\t%0, %l1"
> > +  "loop\t%0, %l1_LEND"
> >[(set_attr "type""jump")
> > (set_attr "mode""none")
> > (set_attr "length""3")])
> >
> >  (define_insn "zero_cost_loop_end"
> >[(set (pc)
> > -(if_then_else (ne (reg:SI 19) (const_int 0))
> > -  (label_ref (match_operand 0 "" ""))
> > -  (pc)))
> > -   (set (reg:SI 19)
> > -(plus:SI (

Re: [gofrontend-dev] libgo patch committed: Fix 32-bit memory allocation

2014-01-09 Thread Michael Hudson-Doyle


Ian Lance Taylor  writes:

> This patch to libgo fixes memory allocation on 32-bit systems when a lot
> of memory has been allocated.  The problem is described in this patch to
> the master repository: https://codereview.appspot.com/49460043 .

Here's a patch for the 4.8 branch if you are interested.  I haven't
tested it yet -- well, it's in progress but I'm not going to hang around
long enough for it to finish today.

Cheers,
mwh
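
Every hunk below makes the same change: the page number of arena_start is
now subtracted unconditionally when indexing the span map, instead of only
when sizeof(void*) == 8. As a rough standalone sketch of that index
computation (the function name span_map_index and the PAGE_SHIFT value of 12
are assumptions for illustration, not taken from the runtime sources):

```c
#include <stdint.h>

/* Assumed page size of 4 KiB, purely for illustration.  */
enum { PAGE_SHIFT = 12 };

/* Index into the span map for address ADDR, taken relative to the start
   of the arena on every target, 32-bit and 64-bit alike.  */
static uintptr_t
span_map_index (uintptr_t addr, uintptr_t arena_start)
{
  return (addr >> PAGE_SHIFT) - (arena_start >> PAGE_SHIFT);
}
```

With the index always relative to arena_start, the first page of the arena
maps to slot 0 regardless of where the arena sits in the address space.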

diff --git a/libgo/runtime/malloc.goc b/libgo/runtime/malloc.goc
index 8ccaa6b..f0871dd 100644
--- a/libgo/runtime/malloc.goc
+++ b/libgo/runtime/malloc.goc
@@ -541,8 +541,7 @@ runtime_settype_flush(M *mp, bool sysalloc)
 
 		// (Manually inlined copy of runtime_MHeap_Lookup)
 		p = (uintptr)v>>PageShift;
-		if(sizeof(void*) == 8)
-			p -= (uintptr)runtime_mheap->arena_start >> PageShift;
+		p -= (uintptr)runtime_mheap->arena_start >> PageShift;
 		s = runtime_mheap->map[p];
 
 		if(s->sizeclass == 0) {
diff --git a/libgo/runtime/mgc0.c b/libgo/runtime/mgc0.c
index c3b3211..9f17bdc 100644
--- a/libgo/runtime/mgc0.c
+++ b/libgo/runtime/mgc0.c
@@ -239,8 +239,7 @@ markonly(void *obj)
 	// (Manually inlined copy of MHeap_LookupMaybe.)
 	k = (uintptr)obj>>PageShift;
 	x = k;
-	if(sizeof(void*) == 8)
-		x -= (uintptr)runtime_mheap->arena_start>>PageShift;
+	x -= (uintptr)runtime_mheap->arena_start>>PageShift;
 	s = runtime_mheap->map[x];
 	if(s == nil || k < s->start || k - s->start >= s->npages || s->state != MSpanInUse)
 		return false;
@@ -418,8 +417,7 @@ flushptrbuf(PtrTarget *ptrbuf, PtrTarget **ptrbufpos, Obj **_wp, Workbuf **_wbuf
 			// (Manually inlined copy of MHeap_LookupMaybe.)
 			k = (uintptr)obj>>PageShift;
 			x = k;
-			if(sizeof(void*) == 8)
-x -= (uintptr)arena_start>>PageShift;
+			x -= (uintptr)arena_start>>PageShift;
 			s = runtime_mheap->map[x];
 			if(s == nil || k < s->start || k - s->start >= s->npages || s->state != MSpanInUse)
 continue;
@@ -466,8 +464,7 @@ flushptrbuf(PtrTarget *ptrbuf, PtrTarget **ptrbufpos, Obj **_wp, Workbuf **_wbuf
 			// Ask span about size class.
 			// (Manually inlined copy of MHeap_Lookup.)
 			x = (uintptr)obj >> PageShift;
-			if(sizeof(void*) == 8)
-x -= (uintptr)arena_start>>PageShift;
+			x -= (uintptr)arena_start>>PageShift;
 			s = runtime_mheap->map[x];
 
 			PREFETCH(obj);
@@ -585,8 +582,7 @@ checkptr(void *obj, uintptr objti)
 	if(t == nil)
 		return;
 	x = (uintptr)obj >> PageShift;
-	if(sizeof(void*) == 8)
-		x -= (uintptr)(runtime_mheap->arena_start)>>PageShift;
+	x -= (uintptr)(runtime_mheap->arena_start)>>PageShift;
 	s = runtime_mheap->map[x];
 	objstart = (byte*)((uintptr)s->startstart + npage, s->npages - npage);
 		s->npages = npage;
 		p = t->start;
-		if(sizeof(void*) == 8)
-			p -= ((uintptr)h->arena_start>>PageShift);
+		p -= ((uintptr)h->arena_start>>PageShift);
 		if(p > 0)
 			h->map[p-1] = s;
 		h->map[p] = t;
@@ -169,8 +168,7 @@ HaveSpan:
 	s->elemsize = (sizeclass==0 ? s->npagesstart;
-	if(sizeof(void*) == 8)
-		p -= ((uintptr)h->arena_start>>PageShift);
+	p -= ((uintptr)h->arena_start>>PageShift);
 	for(n=0; nmap[p+n] = s;
 	return s;
@@ -241,8 +239,7 @@ MHeap_Grow(MHeap *h, uintptr npage)
 	mstats.mspan_sys = h->spanalloc.sys;
 	runtime_MSpan_Init(s, (uintptr)v>>PageShift, ask>>PageShift);
 	p = s->start;
-	if(sizeof(void*) == 8)
-		p -= ((uintptr)h->arena_start>>PageShift);
+	p -= ((uintptr)h->arena_start>>PageShift);
 	h->map[p] = s;
 	h->map[p + s->npages - 1] = s;
 	s->state = MSpanInUse;
@@ -259,8 +256,7 @@ runtime_MHeap_Lookup(MHeap *h, void *v)
 	uintptr p;
 	
 	p = (uintptr)v;
-	if(sizeof(void*) == 8)
-		p -= (uintptr)h->arena_start;
+	p -= (uintptr)h->arena_start;
 	return h->map[p >> PageShift];
 }
 
@@ -281,8 +277,7 @@ runtime_MHeap_LookupMaybe(MHeap *h, void *v)
 		return nil;
 	p = (uintptr)v>>PageShift;
 	q = p;
-	if(sizeof(void*) == 8)
-		q -= (uintptr)h->arena_start >> PageShift;
+	q -= (uintptr)h->arena_start >> PageShift;
 	s = h->map[q];
 	if(s == nil || p < s->start || p - s->start >= s->npages)
 		return nil;
@@ -332,8 +327,7 @@ MHeap_FreeLocked(MHeap *h, MSpan *s)
 
 	// Coalesce with earlier, later spans.
 	p = s->start;
-	if(sizeof(void*) == 8)
-		p -= (uintptr)h->arena_start >> PageShift;
+	p -= (uintptr)h->arena_start >> PageShift;
 	if(p > 0 && (t = h->map[p-1]) != nil && t->state != MSpanInUse) {
 		tp = (uintptr*)(t->start<

Re: [PATCH] Tiny predcom improvement (PR tree-optimization/59643)

2014-01-09 Thread H.J. Lu

On Tue, Dec 31, 2013 at 11:04 AM, Jakub Jelinek  wrote:
> Hi!
>
> As written in the PR, I've been looking into why llvm 3.[34] is so much faster
> on the Scimark2 SOR benchmark, and the reason is that its predictive commoning
> (or whatever it uses) doesn't give up on the inner loop, while our predcom
> unnecessarily gives up, because there are reads that could alias the write.
>
> This simple patch improves the benchmark by 42%.  We already ignore
> unsuitable dependencies for read/read, the patch extends that for unsuitable
> dependencies for read/write by just putting the read (and anything in its
> component) into the bad component which is ignored.  pcom doesn't optimize
> away the writes and will keep the potentially aliasing reads unmodified as
> well.  Without the patch we'd merge the two components, and as
> !determine_offset between the two DRs, it would mean the whole merged
> component would be always unsuitable and thus ignored.  With the patch
> we'll hopefully have some other reads with known offset to the write
> and can optimize that, so the patch should always either handle what
> it did before or handle perhaps some more cases.
>
> The inner loop from the (public domain) benchmark is added in the two tests,
> one runtime test and one test looking whether pcom actually optimized it.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-12-31  Jakub Jelinek  
>
> PR tree-optimization/59643
> * tree-predcom.c (split_data_refs_to_components): If one dr is
> read and one write, determine_offset fails and the write isn't
> in the bad component, just put the read into the bad component.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59745

-- 
H.J.

Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.

2014-01-09 Thread Tom de Vries


On 09-01-14 22:10, Andi Kleen wrote:
> Tom de Vries writes:
>
>> Is this patch OK for stage1 (after proper retesting)?
>
> Could you perhaps post the latest series first?
>
> I don't think it made it to the mailing list.

Andi,

the current status is:
- toplevel of patch series:
  http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01255.html
- the approved version (nitpick aside) of the test-cases patch is here:
  http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00585.html
- the mips implementation of the hook (not a part of the original series,
  but necessary) is discussed here:
  http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00570.html
- all of the above is accumulated here:
  http://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/vries/fuse-caller-save
- the hook as such is discussed here:
  http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00555.html.

Thanks,
- Tom

> -Andi

Re: [PATCH] Add zero-overhead looping for xtensa backend

2014-01-09 Thread Felix Yang

Hi Sterling,

Please note that version 2 of the patch is for gcc trunk, not for
the gcc-4.8 branch.
Since the doloop_end pattern format has changed, this patch needs
a small adaptation in order to work on gcc-4.8.
Although I tested it on gcc-4.8, I think the testing results still
hold for trunk.
Cheers,
Felix


On Thu, Jan 9, 2014 at 11:08 PM, Felix Yang  wrote:
> Hi Sterling,
>
> Attached please find version 2 of the patch.
>
> I applied this updated patch (with small adaptations) to gcc-4.8.2
> and carried out some tests.
> I can execute the testcases in a simulator that supports
> zero-overhead looping instructions.
>
> First of all, I can successfully build libgcc, libstdc++ and
> newlib for xtensa with this patch.
> The newly built xtensa gcc also passes the testsuite that comes with newlib.
> I also tested the cases under the gcc/testsuite/gcc.c-torture/execute/
> directory; more than 800 cases were tested.
> The results show no new failures with this patch, compared
> with the original gcc version.
> Is that OK?
>
> I also double-checked the loop relaxation issue with binutils-2.24
> (the latest version).
> The results show that the assembler can do loop relaxation when the
> loop target is too far away (> 256 bytes).
> And this is the reason why I don't check the size of the loop.
>
>
> Index: gcc/ChangeLog
> ===
> --- gcc/ChangeLog(revision 206463)
> +++ gcc/ChangeLog(working copy)
> @@ -1,3 +1,18 @@
> +2014-01-09  Felix Yang  
> +
> +* config/xtensa/xtensa.c (xtensa_reorg): New.
> +(xtensa_reorg_loops): New.
> +(xtensa_can_use_doloop_p): New.
> +(xtensa_invalid_within_doloop): New.
> +(hwloop_optimize): New.
> +(hwloop_fail): New.
> +(hwloop_pattern_reg): New.
> +(xtensa_emit_loop_end): Modified to emit the zero-overhead loop end label.
> +(xtensa_doloop_hooks): Define.
> +* config/xtensa/xtensa.md (doloop_end): New.
> +(zero_cost_loop_start): Rewritten.
> +(zero_cost_loop_end): Rewritten.
> +
>  2014-01-09  Richard Biener  
>
>  PR tree-optimization/59715
> Index: gcc/config/xtensa/xtensa.md
> ===
> --- gcc/config/xtensa/xtensa.md(revision 206463)
> +++ gcc/config/xtensa/xtensa.md(working copy)
> @@ -1,6 +1,7 @@
>  ;; GCC machine description for Tensilica's Xtensa architecture.
>  ;; Copyright (C) 2001-2014 Free Software Foundation, Inc.
>  ;; Contributed by Bob Wilson (bwil...@tensilica.com) at Tensilica.
> +;; Zero-overhead looping support by Felix Yang (fei.yang0...@gmail.com).
>
>  ;; This file is part of GCC.
>
> @@ -35,6 +36,8 @@
>(UNSPEC_TLS_CALL9)
>(UNSPEC_TP10)
>(UNSPEC_MEMW11)
> +  (UNSPEC_LSETUP_START  12)
> +  (UNSPEC_LSETUP_END13)
>
>(UNSPECV_SET_FP1)
>(UNSPECV_ENTRY2)
> @@ -1289,41 +1292,67 @@
> (set_attr "length""3")])
>
>
> +;; Hardware loop support.
> +
>  ;; Define the loop insns used by bct optimization to represent the
> -;; start and end of a zero-overhead loop (in loop.c).  This start
> -;; template generates the loop insn; the end template doesn't generate
> -;; any instructions since loop end is handled in hardware.
> +;; start and end of a zero-overhead loop.  This start template generates
> +;; the loop insn; the end template doesn't generate any instructions since
> +;; loop end is handled in hardware.
>
>  (define_insn "zero_cost_loop_start"
>[(set (pc)
> -(if_then_else (eq (match_operand:SI 0 "register_operand" "a")
> -  (const_int 0))
> -  (label_ref (match_operand 1 "" ""))
> -  (pc)))
> -   (set (reg:SI 19)
> -(plus:SI (match_dup 0) (const_int -1)))]
> +(if_then_else (ne (match_operand:SI 0 "register_operand" "a")
> +  (const_int 1))
> +  (label_ref (match_operand 1 "" ""))
> +  (pc)))
> +   (set (match_operand:SI 2 "register_operand" "+a0")
> +(plus (match_dup 2)
> +  (const_int -1)))
> +   (unspec [(const_int 0)] UNSPEC_LSETUP_START)]
>""
> -  "loopnez\t%0, %l1"
> +  "loop\t%0, %l1_LEND"
>[(set_attr "type""jump")
> (set_attr "mode""none")
> (set_attr "length""3")])
>
>  (define_insn "zero_cost_loop_end"
>[(set (pc)
> -(if_then_else (ne (reg:SI 19) (const_int 0))
> -  (label_ref (match_operand 0 "" ""))
> -  (pc)))
> -   (set (reg:SI 19)
> -(plus:SI (reg:SI 19) (const_int -1)))]
> +(if_then_else (ne (match_operand:SI 0 "register_operand" "a")
> +  (const_int 1))
> +  (label_ref (match_operand 1 "" ""))
> +  (pc)))
> +   (set (match_operand:SI 2 "register_operand" "+a0")
> +(plus (match_dup 2)
> +  (const_int -1)))
> +   (unspec [(const_int 0)] UNSPEC_LSETUP_END)]
>"

Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS

2014-01-09 Thread Tom de Vries


On 09-01-14 16:31, Richard Sandiford wrote:

Tom de Vries  writes:

On 25/12/13 14:02, Tom de Vries wrote:

On 07-12-13 16:07, Tom de Vries wrote:

Richard,

This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
address the issue that $6 is sometimes used in split calls.

Build and reg-tested on MIPS.

OK for stage1?





Richard,

Ping.

This patch is the only part of -fuse-caller-save that still needs approval.




Richard,

thanks for the review.


> Hmm, where were parts 4 and 6 approved?

In http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00508.html, Vladimir wrote:
...
The patch is ok for me for trunk at stage1. But I think you need a formal
approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files
(lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a maintainer
of these parts although these changes look ok for me.
...

In reaction to that, I split up the patch into a patch series, and replied in
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01255.html:

...
I'm assuming you've ok'ed patch 1, 2, 3, 4, 6, 8, 9 and the non-df-scan part of
7.

I'll ask other maintainers about the other parts (5, 10 and the df-scan part of
7).
...

> Was looking for the discussion
> in the hope that it would answer the question I don't really understand,
> which is: this hook is only used during final, is that right?

Yes.

> And the
> clobber that you're adding is exposed at the rtl level.

Yes, after the calls are split, but not before.

> So why do we
> need the hook at all?

In general we need the hook for registers that are clobbered during a call to a
function, while the registers are not present in the final rtl representation of
that function.

For MIPS, we don't need the hook for that purpose.

But for MIPS there's the following issue: the unsplit call clobbers r6, but the
clobber is not explicit in the rtl. Only after splitting does the clobber become
explicit in the rtl.

In general, that's not a problem because r6 is a member of the set of registers
clobbered by a call (CALL_REALLY_USED_REGISTERS), so it's implicitly clobbered.

But for -fuse-caller-save, when we find a call, we ignore
CALL_REALLY_USED_REGISTERS and use a potentially smaller set of implicit
clobbers: the union of:

- the register usage analysis of the final rtl representation of the called
  function
- the registers marked by the hook.

So before the call is split, there's nothing to tell us that r6 is
clobbered by that call.  As a result, register allocation uses r6 as if it were
not clobbered, which causes errors.
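
To make the set arithmetic concrete, here is a toy sketch with register sets
as bitmasks. It is not GCC's representation: the regset type and the function
call_clobbers are made up for illustration, with bit N standing for hard
register N.

```c
#include <stdint.h>

/* Toy register set: bit N set means hard register N is clobbered.  */
typedef uint32_t regset;

/* Implicit clobbers assumed for a call.  Without -fuse-caller-save this is
   the full call-clobbered set; with it, only the union of what the callee
   analysis found and what the target hook reports.  */
static regset
call_clobbers (int fuse_caller_save, regset call_used,
               regset callee_used, regset hook_regs)
{
  if (!fuse_caller_save)
    return call_used;
  return callee_used | hook_regs;
}
```

If the callee analysis does not mention r6 and the hook reports nothing, bit 6
is missing from the result, which is the situation described above; having the
hook report r6 puts it back.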



> Why not just collect the usage information at
> the end of final rather than at the beginning, so that all splits during
> final have been done?

If we have a call to a leaf function, the final rtl representation does not
contain calls. The problem does not lie in the final pass where the callee is
analyzed, but in the caller, where information is used, and where the unsplit
call is missing the clobber of r6.

> For other cases (where the usage isn't explicit
> at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
> instead?

Right, we could add the r6 clobber that way. But to keep things simple, I've
used the hook instead.

Thanks,
- Tom

> Thanks,
> Richard

[committed] Fix vect_analyze_data_refs (PR middle-end/59670)

2014-01-09 Thread Jakub Jelinek

Hi!

I forgot to check is_gimple_call before checking gimple_call_internal_p.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed as obvious.
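
For readers unfamiliar with the pattern, a minimal sketch of why the order of the checks matters.  The types and functions below are hypothetical stand-ins, not the real gimple API; the point is only that the kind test must come first so that && short-circuits before the call-only accessor runs.

```cpp
#include <cassert>

// Hypothetical stand-ins for statement kinds.
enum class StmtKind { Assign, Call };

struct Stmt {
  StmtKind kind;
  bool internal;   // only meaningful when kind == StmtKind::Call
};

bool is_call(const Stmt& s) { return s.kind == StmtKind::Call; }

// Accessor that is only valid for calls, like a checked accessor.
bool call_internal_p(const Stmt& s)
{
  assert(is_call(s));
  return s.internal;
}

// Safe ordering: && short-circuits, so the accessor never sees a non-call.
bool is_internal_call(const Stmt& s)
{
  return is_call(s) && call_internal_p(s);
}
```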

2014-01-09  Jakub Jelinek  

PR middle-end/59670
* tree-vect-data-refs.c (vect_analyze_data_refs): Check
is_gimple_call before calling gimple_call_internal_p.

* gcc.dg/pr59670.c: New test.

--- gcc/tree-vect-data-refs.c.jj2014-01-03 11:40:57.0 +0100
+++ gcc/tree-vect-data-refs.c   2014-01-09 18:32:11.051319627 +0100
@@ -3320,9 +3320,10 @@ again:
{
  gimple def = SSA_NAME_DEF_STMT (off);
  tree reft = TREE_TYPE (DR_REF (newdr));
- if (gimple_call_internal_p (def)
- && gimple_call_internal_fn (def)
- == IFN_GOMP_SIMD_LANE)
+ if (is_gimple_call (def)
+ && gimple_call_internal_p (def)
+ && (gimple_call_internal_fn (def)
+ == IFN_GOMP_SIMD_LANE))
{
  tree arg = gimple_call_arg (def, 0);
  gcc_assert (TREE_CODE (arg) == SSA_NAME);
--- gcc/testsuite/gcc.dg/pr59670.c.jj   2014-01-09 18:36:15.184067750 +0100
+++ gcc/testsuite/gcc.dg/pr59670.c  2014-01-09 18:36:02.0 +0100
@@ -0,0 +1,15 @@
+/* PR middle-end/59670 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fopenmp-simd" } */
+
+int d[1024];
+
+int
+foo (int j, int b)
+{
+  int l, c = 0;
+#pragma omp simd reduction(+: c)
+  for (l = 0; l < b; ++l)
+c += d[j + l];
+  return c;
+}

Jakub

[patch] fix libstdc++/59680

2014-01-09 Thread Jonathan Wakely

PR libstdc++/59680
* src/c++11/thread.cc (__sleep_for): Fix call to ::sleep.

Tested x86_64-linux, and tested again with a hacked c++config.h to use
::sleep(), committed to trunk.
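
A minimal sketch of the idea behind the fix, assuming __s and __ns are std::chrono durations: a duration object cannot be compared directly with a plain integer, so the tick count has to be taken with .count() first.  The helper name whole_seconds and the round_up_at parameter are hypothetical; the threshold used in thread.cc is not reproduced here.

```cpp
#include <cassert>
#include <chrono>

// Whole seconds to hand to a seconds-only sleep call.  round_up_at is a
// hypothetical threshold for this sketch, not the value used in thread.cc.
long long whole_seconds(std::chrono::seconds s, std::chrono::nanoseconds ns,
                        long long round_up_at)
{
  assert(round_up_at > 0);
  // ns is a duration; .count() yields its tick count as an integer,
  // which is what can be compared against a plain number.
  return s.count() + (ns.count() >= round_up_at);
}
```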
commit 3de5274663e0214b8c5335e99ef85f325f709840
Author: Jonathan Wakely 
Date:   Thu Jan 9 14:33:20 2014 +

PR libstdc++/59680
* src/c++11/thread.cc (__sleep_for): Fix call to ::sleep.

diff --git a/libstdc++-v3/src/c++11/thread.cc b/libstdc++-v3/src/c++11/thread.cc
index d7c3fb1..49aacb5 100644
--- a/libstdc++-v3/src/c++11/thread.cc
+++ b/libstdc++-v3/src/c++11/thread.cc
@@ -183,7 +183,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 ::usleep(__us);
   }
 # else
-::sleep(__s.count() + (__ns >= 100));
+::sleep(__s.count() + (__ns.count() >= 100));
 # endif
 #elif defined(_GLIBCXX_HAVE_WIN32_SLEEP)
 unsigned long ms = __ns.count() / 100;

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 08:42:03AM -0800, Richard Henderson wrote:
> On 01/09/2014 08:35 AM, Jakub Jelinek wrote:
> > That would be fine for 1), but would mean 2).  It is also fine to GC
> > allocate each structure individually, but some (like bb_reorder) are say
> > just 4 bytes long, so it might be overkill.
> 
> 
> Hmm..  Perhaps define the whole structure as you do, but somewhere global
> enough that ggc-page.c can see it, and add to the extra_order_size_table?
> I don't know how much memory wastage there would be there, but I can't imagine
> it's as much as 0.5MB.

Here is the solution I was talking about earlier, allocate smaller structs
through GC together and larger separately.  Bootstrapped/regtested on
x86_64-linux and i686-linux.
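
The layout idea can be sketched outside GCC as follows.  This is only an illustration under stated assumptions: std::calloc stands in for a zero-initializing GC allocator, and the struct and function names (FlagState, BbReorder, Globals, GlobalsExtra, save_globals, free_globals) are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical small per-target structs.
struct FlagState { int flags; };
struct BbReorder { int value; };

struct Globals {
  FlagState* flag_state;
  BbReorder* bb_reorder;
};

// Header plus the small structs in one block.
struct GlobalsExtra {
  Globals g;            // first member, so the block doubles as a Globals*
  FlagState flag_state;
  BbReorder bb_reorder;
};

// One zeroed allocation; the header's pointers refer into its own payload.
Globals* save_globals()
{
  auto* p = static_cast<GlobalsExtra*>(std::calloc(1, sizeof(GlobalsExtra)));
  if (p == nullptr)
    return nullptr;
  p->g.flag_state = &p->flag_state;
  p->g.bb_reorder = &p->bb_reorder;
  return &p->g;
}

// Frees the whole block; valid because g sits at the start of it.
void free_globals(Globals* g)
{
  std::free(g);
}
```

The small structs share one allocation and one lifetime with the header, while anything large would still get its own allocation.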

2014-01-09  Jakub Jelinek  

* target-globals.c (save_target_globals): Allocate < 4KB structs using
GC in payload of target_globals struct instead of allocating them on
the heap and the larger structs separately using GC.
* target-globals.h (struct target_globals): Make regs, hard_regs,
reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
of GTY((skip)) and change type to void *.
(reset_target_globals): Cast loads from those fields to corresponding
types.

--- gcc/target-globals.h.jj 2014-01-09 19:24:20.0 +0100
+++ gcc/target-globals.h2014-01-09 19:39:43.879348712 +0100
@@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
 
 struct GTY(()) target_globals {
   struct target_flag_state *GTY((skip)) flag_state;
-  struct target_regs *GTY((skip)) regs;
+  void *GTY((atomic)) regs;
   struct target_rtl *rtl;
-  struct target_hard_regs *GTY((skip)) hard_regs;
-  struct target_reload *GTY((skip)) reload;
-  struct target_expmed *GTY((skip)) expmed;
+  void *GTY((atomic)) hard_regs;
+  void *GTY((atomic)) reload;
+  void *GTY((atomic)) expmed;
   struct target_optabs *GTY((skip)) optabs;
   struct target_libfuncs *libfuncs;
   struct target_cfgloop *GTY((skip)) cfgloop;
-  struct target_ira *GTY((skip)) ira;
-  struct target_ira_int *GTY((skip)) ira_int;
-  struct target_lra_int *GTY((skip)) lra_int;
+  void *GTY((atomic)) ira;
+  void *GTY((atomic)) ira_int;
+  void *GTY((atomic)) lra_int;
   struct target_builtins *GTY((skip)) builtins;
   struct target_gcse *GTY((skip)) gcse;
   struct target_bb_reorder *GTY((skip)) bb_reorder;
@@ -68,17 +68,17 @@ static inline void
 restore_target_globals (struct target_globals *g)
 {
   this_target_flag_state = g->flag_state;
-  this_target_regs = g->regs;
+  this_target_regs = (struct target_regs *) g->regs;
   this_target_rtl = g->rtl;
-  this_target_hard_regs = g->hard_regs;
-  this_target_reload = g->reload;
-  this_target_expmed = g->expmed;
+  this_target_hard_regs = (struct target_hard_regs *) g->hard_regs;
+  this_target_reload = (struct target_reload *) g->reload;
+  this_target_expmed = (struct target_expmed *) g->expmed;
   this_target_optabs = g->optabs;
   this_target_libfuncs = g->libfuncs;
   this_target_cfgloop = g->cfgloop;
-  this_target_ira = g->ira;
-  this_target_ira_int = g->ira_int;
-  this_target_lra_int = g->lra_int;
+  this_target_ira = (struct target_ira *) g->ira;
+  this_target_ira_int = (struct target_ira_int *) g->ira_int;
+  this_target_lra_int = (struct target_lra_int *) g->lra_int;
   this_target_builtins = g->builtins;
   this_target_gcse = g->gcse;
   this_target_bb_reorder = g->bb_reorder;
--- gcc/target-globals.c.jj 2014-01-08 17:44:57.551583153 +0100
+++ gcc/target-globals.c2014-01-09 19:38:21.013760564 +0100
@@ -68,24 +68,44 @@ struct target_globals *
 save_target_globals (void)
 {
   struct target_globals *g;
-
-  g = ggc_alloc_target_globals ();
-  g->flag_state = XCNEW (struct target_flag_state);
-  g->regs = XCNEW (struct target_regs);
+  struct target_globals_extra {
+struct target_globals g;
+struct target_flag_state flag_state;
+struct target_optabs optabs;
+struct target_cfgloop cfgloop;
+struct target_builtins builtins;
+struct target_gcse gcse;
+struct target_bb_reorder bb_reorder;
+struct target_lower_subreg lower_subreg;
+  } *p;
+  p = (struct target_globals_extra *)
+  ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
+  PASS_MEM_STAT);
+  g = (struct target_globals *) p;
+  g->flag_state = &p->flag_state;
+  g->regs = ggc_internal_cleared_alloc_stat (sizeof (struct target_regs)
+PASS_MEM_STAT);
   g->rtl = ggc_alloc_cleared_target_rtl ();
-  g->hard_regs = XCNEW (struct target_hard_regs);
-  g->reload = XCNEW (struct target_reload);
-  g->expmed = XCNEW (struct target_expmed);
-  g->optabs = XCNEW (struct target_optabs);
+  g->hard_regs
+= ggc_internal_cleared_alloc_stat (sizeof (struct target_hard_regs)
+  PASS_MEM_STAT);
+  g->reload = ggc_internal_cleared_alloc_stat (sizeof (struct target_reloa

Go patch committed: Add flattening pass

2014-01-09 Thread Ian Lance Taylor

This patch from Chris Manghane adds a flattening pass to the Go
frontend.  This is a step toward moving more types of expressions into
the backend interface.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 447683c37ddf go/expressions.h
--- a/go/expressions.h	Thu Jan 09 15:16:28 2014 -0800
+++ b/go/expressions.h	Thu Jan 09 15:18:09 2014 -0800
@@ -575,6 +575,18 @@
 	int iota_value)
   { return this->do_lower(gogo, function, inserter, iota_value); }
 
+  // Flatten an expression. This is called after order_evaluation.
+  // FUNCTION is the function we are in; it will be NULL for an
+  // expression initializing a global variable.  INSERTER may be used
+  // to insert statements before the statement or initializer
+  // containing this expression; it is normally used to create
+  // temporary variables. This function must resolve expressions
+  // which could not be fully parsed into their final form.  It
+  // returns the same Expression or a new one.
+  Expression*
+  flatten(Gogo* gogo, Named_object* function, Statement_inserter* inserter)
+  { return this->do_flatten(gogo, function, inserter); }
+
   // Determine the real type of an expression with abstract integer,
   // floating point, or complex type.  TYPE_CONTEXT describes the
   // expected type.
@@ -698,6 +710,12 @@
   do_lower(Gogo*, Named_object*, Statement_inserter*, int)
   { return this; }
 
+  // Return a flattened expression.
+  virtual Expression*
+  do_flatten(Gogo*, Named_object*, Statement_inserter*)
+  { return this; }
+
+
   // Return whether this is a constant expression.
   virtual bool
   do_is_constant() const
diff -r 447683c37ddf go/go.cc
--- a/go/go.cc	Thu Jan 09 15:16:28 2014 -0800
+++ b/go/go.cc	Thu Jan 09 15:18:09 2014 -0800
@@ -119,12 +119,15 @@
   // Use temporary variables to force order of evaluation.
   ::gogo->order_evaluations();
 
+  // Flatten the parse tree.
+  ::gogo->flatten();
+
   // Build thunks for functions which call recover.
   ::gogo->build_recover_thunks();
 
   // Convert complicated go and defer statements into simpler ones.
   ::gogo->simplify_thunk_statements();
-  
+
   // Dump ast, use filename[0] as the base name
   ::gogo->dump_ast(filenames[0]);
 }
diff -r 447683c37ddf go/gogo.cc
--- a/go/gogo.cc	Thu Jan 09 15:16:28 2014 -0800
+++ b/go/gogo.cc	Thu Jan 09 15:18:09 2014 -0800
@@ -2703,6 +2703,169 @@
   this->traverse(&order_eval);
 }
 
+// Traversal to flatten parse tree after order of evaluation rules are applied.
+
+class Flatten : public Traverse
+{
+ public:
+  Flatten(Gogo* gogo, Named_object* function)
+: Traverse(traverse_variables
+	   | traverse_functions
+	   | traverse_statements
+	   | traverse_expressions),
+  gogo_(gogo), function_(function), inserter_()
+  { }
+
+  void
+  set_inserter(const Statement_inserter* inserter)
+  { this->inserter_ = *inserter; }
+
+  int
+  variable(Named_object*);
+
+  int
+  function(Named_object*);
+
+  int
+  statement(Block*, size_t* pindex, Statement*);
+
+  int
+  expression(Expression**);
+
+ private:
+  // General IR.
+  Gogo* gogo_;
+  // The function we are traversing.
+  Named_object* function_;
+  // Current statement inserter for use by expressions.
+  Statement_inserter inserter_;
+};
+
+// Flatten variables.
+
+int
+Flatten::variable(Named_object* no)
+{
+  if (!no->is_variable())
+return TRAVERSE_CONTINUE;
+
+  if (no->is_variable() && no->var_value()->is_global())
+{
+  // Global variables can have loops in their initialization
+  // expressions.  This is handled in flatten_init_expression.
+  no->var_value()->flatten_init_expression(this->gogo_, this->function_,
+   &this->inserter_);
+  return TRAVERSE_CONTINUE;
+}
+
+  go_assert(!no->var_value()->has_pre_init());
+
+  return TRAVERSE_SKIP_COMPONENTS;
+}
+
+// Flatten the body of a function.  Record the function while flattening it,
+// so that we can pass it down when flattening an expression.
+
+int
+Flatten::function(Named_object* no)
+{
+  go_assert(this->function_ == NULL);
+  this->function_ = no;
+  int t = no->func_value()->traverse(this);
+  this->function_ = NULL;
+
+  if (t == TRAVERSE_EXIT)
+return t;
+  return TRAVERSE_SKIP_COMPONENTS;
+}
+
+// Flatten statement parse trees.
+
+int
+Flatten::statement(Block* block, size_t* pindex, Statement* sorig)
+{
+  // Because we explicitly traverse the statement's contents
+  // ourselves, we want to skip block statements here.  There is
+  // nothing to flatten in a block statement.
+  if (sorig->is_block_statement())
+return TRAVERSE_CONTINUE;
+
+  Statement_inserter hold_inserter(this->inserter_);
+  this->inserter_ = Statement_inserter(block, pindex);
+
+  // Flatten the expressions first.
+  int t = sorig->traverse_contents(this);
+  if (t == TRAVERSE_EXIT)
+{
+  this->inserter_ = hold_inserter;
+  return t;
+}
+
+  // Keep flattening until nothing changes.
+  Stat

libgo patch committed: Fix 32-bit memory allocation

2014-01-09 Thread Ian Lance Taylor

This patch to libgo fixes memory allocation on 32-bit systems when a lot
of memory has been allocated.  The problem is described in this patch to
the master repository: https://codereview.appspot.com/49460043 .

runtime: fix 32-bit malloc for pointers >= 0x80000000

The spans array is allocated in runtime·mallocinit.  On a
32-bit system the number of entries in the spans array is
MaxArena32 / PageSize, which is (2U << 30) / (1 << 12) == (1 << 19).
So we are allocating an array that can hold 19 bits for an
index that can hold 20 bits.  According to the comment in the
function, this is intentional: we only allocate enough spans
(and bitmaps) for a 2G arena, because allocating more would
probably be wasteful.

But since the span index is simply the upper 20 bits of the
memory address, this scheme only works if memory addresses are
limited to the low 2G of memory.  That would be OK if we were
careful to enforce it, but we're not.  What we are careful to
enforce, in functions like runtime·MHeap_SysAlloc, is that we
always return addresses between the heap's arena_start and
arena_start + MaxArena32.

We generally get away with it because we start allocating just
after the program end, so we only run into trouble with
programs that allocate a lot of memory, enough to get past
address 0x80000000.

This changes the code that computes a span index to subtract
arena_start on 32-bit systems just as we currently do on
64-bit systems.
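
The arithmetic can be checked with a small sketch.  The constants follow the description above (2G arena, 4K pages); the function names and the sample addresses used to exercise them are assumptions for illustration and are not taken from the runtime.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageShift = 12;                          // PageSize == 1 << 12
constexpr std::uint64_t kMaxArena32 = 2ULL << 30;                 // 2G arena
constexpr std::uint64_t kSpanEntries = kMaxArena32 >> kPageShift; // 1 << 19

// Index taken from the raw address (the old 32-bit behaviour).
std::uint64_t span_index_absolute(std::uint32_t addr)
{
  return addr >> kPageShift;
}

// Index taken relative to the arena start (the patched behaviour).
std::uint64_t span_index_relative(std::uint32_t addr, std::uint32_t arena_start)
{
  assert(addr >= arena_start);
  return (addr >> kPageShift) - (arena_start >> kPageShift);
}
```

With an address just past 2G, such as 0x80001000, the absolute index is 0x80001, which is outside a table of 1 << 19 entries.  Relative to a hypothetical arena_start of 0x10000000 the index becomes 0x70001, which fits.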

This is the same patch applied to libgo.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu, both 64-bit and 32-bit.
Committed to mainline.

Ian

diff -r f3e5e6e92709 libgo/runtime/malloc.goc
--- a/libgo/runtime/malloc.goc	Wed Jan 08 13:58:47 2014 -0800
+++ b/libgo/runtime/malloc.goc	Thu Jan 09 15:12:36 2014 -0800
@@ -637,8 +637,7 @@
 
 		// (Manually inlined copy of runtime_MHeap_Lookup)
 		p = (uintptr)v>>PageShift;
-		if(sizeof(void*) == 8)
-			p -= (uintptr)runtime_mheap.arena_start >> PageShift;
+		p -= (uintptr)runtime_mheap.arena_start >> PageShift;
 		s = runtime_mheap.spans[p];
 
 		if(s->sizeclass == 0) {
diff -r f3e5e6e92709 libgo/runtime/mgc0.c
--- a/libgo/runtime/mgc0.c	Wed Jan 08 13:58:47 2014 -0800
+++ b/libgo/runtime/mgc0.c	Thu Jan 09 15:12:36 2014 -0800
@@ -269,8 +269,7 @@
 	// (Manually inlined copy of MHeap_LookupMaybe.)
 	k = (uintptr)obj>>PageShift;
 	x = k;
-	if(sizeof(void*) == 8)
-		x -= (uintptr)runtime_mheap.arena_start>>PageShift;
+	x -= (uintptr)runtime_mheap.arena_start>>PageShift;
 	s = runtime_mheap.spans[x];
 	if(s == nil || k < s->start || (byte*)obj >= s->limit || s->state != MSpanInUse)
 		return false;
@@ -453,8 +452,7 @@
 			// (Manually inlined copy of MHeap_LookupMaybe.)
 			k = (uintptr)obj>>PageShift;
 			x = k;
-			if(sizeof(void*) == 8)
-x -= (uintptr)arena_start>>PageShift;
+			x -= (uintptr)arena_start>>PageShift;
 			s = runtime_mheap.spans[x];
 			if(s == nil || k < s->start || obj >= s->limit || s->state != MSpanInUse)
 continue;
@@ -501,8 +499,7 @@
 			// Ask span about size class.
 			// (Manually inlined copy of MHeap_Lookup.)
 			x = (uintptr)obj >> PageShift;
-			if(sizeof(void*) == 8)
-x -= (uintptr)arena_start>>PageShift;
+			x -= (uintptr)arena_start>>PageShift;
 			s = runtime_mheap.spans[x];
 
 			PREFETCH(obj);
@@ -617,8 +614,7 @@
 	if(t == nil)
 		return;
 	x = (uintptr)obj >> PageShift;
-	if(sizeof(void*) == 8)
-		x -= (uintptr)(runtime_mheap.arena_start)>>PageShift;
+	x -= (uintptr)(runtime_mheap.arena_start)>>PageShift;
 	s = runtime_mheap.spans[x];
 	objstart = (byte*)((uintptr)s->startarena_used;
-	if(sizeof(void*) == 8)
-		n -= (uintptr)h->arena_start;
+	n -= (uintptr)h->arena_start;
 	n = n / PageSize * sizeof(h->spans[0]);
 	n = ROUND(n, PageSize);
 	pagesize = getpagesize();
@@ -170,8 +169,7 @@
 		runtime_MSpan_Init(t, s->start + npage, s->npages - npage);
 		s->npages = npage;
 		p = t->start;
-		if(sizeof(void*) == 8)
-			p -= ((uintptr)h->arena_start>>PageShift);
+		p -= ((uintptr)h->arena_start>>PageShift);
 		if(p > 0)
 			h->spans[p-1] = s;
 		h->spans[p] = t;
@@ -189,8 +187,7 @@
 	s->elemsize = (sizeclass==0 ? s->npagesstart;
-	if(sizeof(void*) == 8)
-		p -= ((uintptr)h->arena_start>>PageShift);
+	p -= ((uintptr)h->arena_start>>PageShift);
 	for(n=0; nspans[p+n] = s;
 	return s;
@@ -258,8 +255,7 @@
 	s = runtime_FixAlloc_Alloc(&h->spanalloc);
 	runtime_MSpan_Init(s, (uintptr)v>>PageShift, ask>>PageShift);
 	p = s->start;
-	if(sizeof(void*) == 8)
-		p -= ((uintptr)h->arena_start>>PageShift);
+	p -= ((uintptr)h->arena_start>>PageShift);
 	h->spans[p] = s;
 	h->spans[p + s->npages - 1] = s;
 	s->state = MSpa

Re: PR 59712 patch

2014-01-09 Thread Jonathan Wakely

On 9 January 2014 21:55, François Dumont wrote:
>
> All unordered_* tests run under Linux x86_64.

Please make sure you run the entire testsuite, not just the parts that
seem relevant.

Re: Rb tree node recycling patch

2014-01-09 Thread Paolo Carlini

Hi

> Could you point me to the bugzilla entry you are mentioning ?

libstdc++/29988

Thanks,
Paolo

Re: Question about gimplify.c:gimplify_adjust_omp_clauses_1, GOVD_MAP_TO_ONLY

2014-01-09 Thread Thomas Schwinge

Hi!

On Thu, 09 Jan 2014 23:21:26 +0100, I wrote:
> On Thu, 9 Jan 2014 21:43:35 +0100, Jakub Jelinek  wrote:
> > On Thu, Jan 09, 2014 at 09:38:25PM +0100, Thomas Schwinge wrote:
> > > In gimplify.c:gimplify_adjust_omp_clauses_1, does the case for
> > > GOVD_MAP_TO_ONLY have a real current use case (I couldn't spot any), or
> > > is it "just for completeness"?
> > 
> > It is typically for any artificial vars that are known not to need copying
> > back, such as various artifical vars used for VLAs (say if you do sizeof
> > on vla inside of target region, typesizes etc.).  The testsuite coverage is
> > insufficient here, sure.  GOVD_MAP_TO_ONLY is kind of GOVD_FIRSTPRIVATE
> > for the target regions, as opposed to GOVD_SHARED.
> 
> Thanks, that makes sense (and I put a TODO item up for adding such tests
> to the testsuite), but I still can't manage to find one that actually
> triggers the GOVD_MAP_TO_ONLY case in gimplify_adjust_omp_clauses_1.
> Maybe it's just too late today.

Haha, and here we go; need something nested to trigger this:

  int l = 10;
  float c[l];

#pragma omp target map(c[2:4])
  {
#pragma omp target
{
   int s = sizeof c;
}
  }


Grüße,
 Thomas



Re: Question about gimplify.c:gimplify_adjust_omp_clauses_1, GOVD_MAP_TO_ONLY

2014-01-09 Thread Thomas Schwinge

Hi!

On Thu, 9 Jan 2014 21:43:35 +0100, Jakub Jelinek  wrote:
> On Thu, Jan 09, 2014 at 09:38:25PM +0100, Thomas Schwinge wrote:
> > In gimplify.c:gimplify_adjust_omp_clauses_1, does the case for
> > GOVD_MAP_TO_ONLY have a real current use case (I couldn't spot any), or
> > is it "just for completeness"?
> 
> It is typically for any artificial vars that are known not to need copying
> back, such as various artifical vars used for VLAs (say if you do sizeof
> on vla inside of target region, typesizes etc.).  The testsuite coverage is
> insufficient here, sure.  GOVD_MAP_TO_ONLY is kind of GOVD_FIRSTPRIVATE
> for the target regions, as opposed to GOVD_SHARED.

Thanks, that makes sense (and I put a TODO item up for adding such tests
to the testsuite), but I still can't manage to find one that actually
triggers the GOVD_MAP_TO_ONLY case in gimplify_adjust_omp_clauses_1.
Maybe it's just too late today.


Grüße,
 Thomas



[Patch] Fix PR plugin/59335, plugins not compiling

2014-01-09 Thread Steve Ellcey

Some header files needed by plugins are not present in the installed plugin
include directory.  This patch adds those headers.  I added all the headers
I needed for my plugin plus the ones mentioned in the patch.  I did not add
the x86 specific *.def files but this should fix all the generic platform
issues that are mentioned in the defect.

Tested with my switch-shortcut plugin on MIPS.

OK to checkin?

Steve Ellcey
sell...@mips.com


2014-01-09  Steve Ellcey  

PR plugins/59335
* Makefile.in (PLUGIN_HEADERS): Add gimplify.h, gimple-iterator.h,
gimple-ssa.h, fold-const.h, tree-cfg.h, tree-into-ssa.h,
tree-ssanames.h, print-tree.h, varasm.h, and context.h.


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8eb4f68..76766ba 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3118,7 +3118,9 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
   $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
-  version.h stringpool.h
+  version.h stringpool.h gimplify.h gimple-iterator.h gimple-ssa.h \
+  fold-const.h tree-cfg.h tree-into-ssa.h tree-ssanames.h print-tree.h \
+  varasm.h context.h
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile

Re: Rb tree node recycling patch

2014-01-09 Thread François Dumont


On 01/08/2014 02:34 PM, Paolo Carlini wrote:

Hi,

On 12/27/2013 07:30 PM, François Dumont wrote:
Note that this patch contains also a cleanup of a useless template 
parameter _Is_pod_comparator on _Rb_tree_impl.
The useless parameter is a remnant of an attempt at exploiting the EBO 
for _Rb_tree_impl. At some point Benjamin got a patch from a 
contributor but then had to quickly revert it just in time for the ABI 
freeze because it didn't work. Everything is recorded in the mailing 
list. Anyway, whatever we do now (more exactly, post 4.9) let's make 
sure we don't break the ABI inadvertently, or, if we actually decide 
do that, we should reconsider the EBO.


About the node recycling idea itself, we got a closely related 
Bugzilla. Is it *exactly* the same issue, or not? Please double check.


Paolo.
.


Could you point me to the bugzilla entry you are mentioning ?

François

PR 59712 patch

2014-01-09 Thread François Dumont


Hi

Here is a patch for this small problem with clang. It is not a 
blocking issue for the 4.9 release but at the same time it is a rather 
safe fix so just tell me if I can commit it.


All unordered_* tests run under Linux x86_64. I don't have clang 
installed at the moment, so feedback from someone using clang would be appreciated.


2014-01-10  François Dumont  

* include/bits/hashtable_policy.h: Fix some long lines.
* include/bits/hashtable.h (__hash_code_base_access): Define and
use it to check its _M_bucket_index noexcept qualification. Use
also in place of...
(__access_protected_ctor): ...this.
* testsuite/23_containers/unordered_set/instantiation_neg.cc:
Adapt line number.
* testsuite/23_containers/unordered_set/
not_default_constructible_hash_neg.cc: Likewise.
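
The technique named in the ChangeLog entry for hashtable.h can be sketched in isolation.  This is not the libstdc++ code: HashBase, HashBaseAccess and bucket_index are hypothetical names, and the base class is a toy.  It only shows how a derived type with a using-declaration lets compile-time checks reach a base whose members are all protected.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical base with everything protected.
class HashBase {
protected:
  HashBase() = default;
  std::size_t bucket_index(std::size_t hash, std::size_t n) const noexcept
  { return hash % n; }
};

// Derived type that re-exports the protected member as public.
struct HashBaseAccess : HashBase {
  using HashBase::bucket_index;
};

// Both compile-time checks go through the derived type.
static_assert(noexcept(std::declval<const HashBaseAccess&>()
                         .bucket_index(std::size_t(0), std::size_t(1))),
              "bucket_index must not throw");
static_assert(std::is_default_constructible<HashBaseAccess>::value,
              "must be default constructible");
static_assert(!std::is_default_constructible<HashBase>::value,
              "the base's constructor is not reachable from outside");
```

One access type serves both the noexcept check and the default-constructibility check, so a separate helper for the constructor test is not needed.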

François
Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h	(revision 206443)
+++ include/bits/hashtable_policy.h	(working copy)
@@ -161,7 +161,8 @@
   __hashtable_alloc& _M_h;
 };
 
-  // Functor similar to the previous one but without any pool of node to recycle.
+  // Functor similar to the previous one but without any pool of node to
+  // recycle.
   template
 struct _AllocNode
 {
@@ -1088,7 +1089,8 @@
 
   std::size_t
   _M_bucket_index(const __node_type* __p, std::size_t __n) const
-	noexcept( noexcept(declval()(declval(), (std::size_t)0)) )
+	noexcept( noexcept(declval()(declval(),
+		   (std::size_t)0)) )
   { return _M_ranged_hash()(_M_extract()(__p->_M_v()), __n); }
 
   void
@@ -1175,7 +1177,8 @@
   std::size_t
   _M_bucket_index(const __node_type* __p, std::size_t __n) const
 	noexcept( noexcept(declval()(declval()))
-		  && noexcept(declval()((__hash_code)0, (std::size_t)0)) )
+		  && noexcept(declval()((__hash_code)0,
+		(std::size_t)0)) )
   { return _M_h2()(_M_h1()(_M_extract()(__p->_M_v())), __n); }
 
   void
Index: include/bits/hashtable.h
===
--- include/bits/hashtable.h	(revision 206443)
+++ include/bits/hashtable.h	(working copy)
@@ -260,9 +260,14 @@
 
   // Compile-time diagnostics.
 
+  // _Hash_code_base has everything protected, so use this derived type to
+  // access it.
+  struct __hash_code_base_access : __hash_code_base
+  { using __hash_code_base::_M_bucket_index; };
+
   // Getting a bucket index from a node shall not throw because it is used
   // in methods (erase, swap...) that shall not throw.
-  static_assert(noexcept(declval()
+  static_assert(noexcept(declval()
 			 ._M_bucket_index((const __node_type*)nullptr,
 	  (std::size_t)0)),
 		"Cache the hash code or qualify your functors involved"
@@ -277,15 +282,11 @@
 		"Functor used to map hash code to bucket index"
 		" must be default constructible");
 
-  // _Hash_code_base has a protected default constructor, so use this
-  // derived type to tell if it's usable.
-  struct __access_protected_ctor : __hash_code_base { };
-
   // When hash codes are not cached local iterator inherits from
   // __hash_code_base above to compute node bucket index so it has to be
   // default constructible.
   static_assert(__if_hash_not_cached<
-		is_default_constructible<__access_protected_ctor>>::value,
+		is_default_constructible<__hash_code_base_access>>::value,
 		"Cache the hash code or make functors involved in hash code"
 		" and bucket index computation default constructible");
 
Index: testsuite/23_containers/unordered_set/instantiation_neg.cc
===
--- testsuite/23_containers/unordered_set/instantiation_neg.cc	(revision 206443)
+++ testsuite/23_containers/unordered_set/instantiation_neg.cc	(working copy)
@@ -19,7 +19,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-error "with noexcept" "" { target *-*-* } 265 }
+// { dg-error "with noexcept" "" { target *-*-* } 270 }
 
 #include 
 
Index: testsuite/23_containers/unordered_set/not_default_constructible_hash_neg.cc
===
--- testsuite/23_containers/unordered_set/not_default_constructible_hash_neg.cc	(revision 206443)
+++ testsuite/23_containers/unordered_set/not_default_constructible_hash_neg.cc	(working copy)
@@ -19,7 +19,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-error "default constructible" "" { target *-*-* } 287 }
+// { dg-error "default constructible" "" { target *-*-* } 288 }
 
 #include

Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.

2014-01-09 Thread Andi Kleen

Tom de Vries  writes:
>
> Is this patch OK for stage1 (after proper retesting)?

Could you perhaps post the latest series first? 

I don't think it made it to the mailing list.

-Andi

Re: Question about gimplify.c:gimplify_adjust_omp_clauses_1, GOVD_MAP_TO_ONLY

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 09:38:25PM +0100, Thomas Schwinge wrote:
> In gimplify.c:gimplify_adjust_omp_clauses_1, does the case for
> GOVD_MAP_TO_ONLY have a real current use case (I couldn't spot any), or
> is it "just for completeness"?

It is typically for any artificial vars that are known not to need copying
back, such as various artificial vars used for VLAs (say if you do sizeof
on vla inside of target region, typesizes etc.).  The testsuite coverage is
insufficient here, sure.  GOVD_MAP_TO_ONLY is kind of GOVD_FIRSTPRIVATE
for the target regions, as opposed to GOVD_SHARED.

Jakub

Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.

2014-01-09 Thread Tom de Vries


On 09-01-14 15:41, Richard Earnshaw wrote:

On 30/03/13 16:10, Tom de Vries wrote:

On 29/03/13 13:54, Tom de Vries wrote:

I split the patch up into 10 patches, to facilitate further review:
...
0001-Add-command-line-option.patch
0002-Add-new-reg-note-REG_CALL_DECL.patch
0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
0006-Collect-register-usage-information.patch
0007-Use-collected-register-usage-information.patch
0008-Enable-by-default-at-O2-and-higher.patch
0009-Add-documentation.patch
0010-Add-test-case.patch
...
I'll post these in reply to this email.



Something went wrong with those emails, which were generated.

I tested the emails by sending them to my work email, where they looked fine.
I managed to reproduce the problem by sending them to my private email.
It seems the problem was inconsistent EOL format.

I've written a python script to handle composing the email, and posted it here
using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
Given that that email looks ok, I think I've addressed the problems now.

I'll repost the patches. Sorry about the noise.

Thanks,
- Tom




It's unfortunate that this feature doesn't fail safe when a port has not
explicitly defined what should happen.



Richard,

The attached tentative patch (an update of patch 4 in the series) changes the hook 
in the way you propose.
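
As a rough illustration of what "fail safe" means for such a hook, here is a sketch under stated assumptions.  The bool return value follows the documentation text in the patch; the fallback to the full call-clobbered set when the hook returns false is an assumption about how a caller would use it, and all names (RegSet, default_hook, example_port_hook, call_clobbers) are hypothetical.

```cpp
#include <bitset>
#include <cassert>

using RegSet = std::bitset<32>;  // hypothetical register set

// Returns true only if the hook could determine the extra registers.
using OtherRegUsageHook = bool (*)(RegSet& regs);

// Default: the port has said nothing, so report failure.
bool default_hook(RegSet&) { return false; }

// Hypothetical port hook that adds two registers and reports success.
bool example_port_hook(RegSet& regs)
{
  regs.set(16);
  regs.set(17);
  return true;
}

// Use the analysed set only if the hook succeeded; otherwise fall back
// to the conservative call-clobbered set.
RegSet call_clobbers(const RegSet& callee_usage,
                     const RegSet& all_call_clobbered,
                     OtherRegUsageHook hook)
{
  RegSet regs = callee_usage;
  if (!hook(regs))
    return all_call_clobbered;
  return regs;
}
```

A port that defines nothing then gets the conservative set, and only a port that explicitly answers gets the smaller one.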


Is this patch OK for stage1 (after proper retesting)?


Consequently, you'll need to add a patch for AArch64 which has two
registers clobbered by PLT-based calls.



Thanks for pointing that out. That's r16 and r17, right? I can propose the hook 
for AArch64, once we all agree on how the hook should look.


Thanks,
- Tom


R.



2013-04-29  Radovan Obradovic  
Tom de Vries  

	* hooks.c (hook_bool_hard_reg_set_containerp_false): New function.
	* hooks.h (hook_bool_hard_reg_set_containerp_false): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f204936..1bae6bb 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -5016,6 +5017,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function.  This hook returns true if it managed to determine which registers need to be added.  The default version of this hook returns false.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 50f412c..bf75446 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2720,6 +2720,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -3985,6 +3986,12 @@ the function prologue.  Normally, the profiling code comes after.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 1c67bdf..44f1d06 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -467,3 +467,12 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   false.  */
+
+bool
+hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+  return false;
+}
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 896b41d..f0afdbd 100644
--- a/gcc/hooks.h
+++ b/

Question about gimplify.c:gimplify_adjust_omp_clauses_1, GOVD_MAP_TO_ONLY

2014-01-09 Thread Thomas Schwinge

Hi!

In gimplify.c:gimplify_adjust_omp_clauses_1, does the case for
GOVD_MAP_TO_ONLY have a real current use case (I couldn't spot any), or
is it "just for completeness"?

diff --git gcc/gimplify.c gcc/gimplify.c
index 3738589..870550c 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6136,9 +6136,9 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void 
*data)
 OMP_CLAUSE_PRIVATE_OUTER_REF (clause) = 1;
   else if (code == OMP_CLAUSE_MAP)
 {
-  OMP_CLAUSE_MAP_KIND (clause) = flags & GOVD_MAP_TO_ONLY
-? OMP_CLAUSE_MAP_TO
-: OMP_CLAUSE_MAP_TOFROM;
+  gcc_assert (!(flags & GOVD_MAP_TO_ONLY));
+  OMP_CLAUSE_MAP_KIND (clause) = OMP_CLAUSE_MAP_TOFROM;
+
   if (DECL_SIZE (decl)
  && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
{


Grüße,
 Thomas
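
To make the question concrete, here is a minimal sketch of the two behaviours being compared. The enum, the flag bit, and the function names are stand-ins invented for illustration; they are not the real GCC definitions. The first function models the existing ternary, the second models the proposed assert-and-always-TOFROM variant.

```cpp
#include <cassert>

// Hypothetical stand-ins for OMP_CLAUSE_MAP_TO / OMP_CLAUSE_MAP_TOFROM.
enum MapKind { MAP_TO, MAP_TOFROM };

// Hypothetical bit standing in for GOVD_MAP_TO_ONLY; the real value differs.
const unsigned MAP_TO_ONLY_BIT = 1u << 0;

// Models the current code: the flag selects between "to" and "tofrom".
MapKind map_kind_current(unsigned flags) {
    return (flags & MAP_TO_ONLY_BIT) ? MAP_TO : MAP_TOFROM;
}

// Models the variant in the diff: the flag must never be set, so the
// result is always "tofrom".
MapKind map_kind_proposed(unsigned flags) {
    assert(!(flags & MAP_TO_ONLY_BIT));
    return MAP_TOFROM;
}
```

If no caller ever sets the flag, both functions agree on every input that actually occurs, which is what the assert in the diff is meant to check.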



Re: [Patch] Remove references to non-existent tree-flow.h file

2014-01-09 Thread Richard Biener

Steve Ellcey   wrote:
>While looking at PR 59335 (plugin doesn't build) I saw the comments
>about
>tree-flow.h and tree-flow-inline.h not existing anymore.  While these
>files have been removed there are still some references to them in
>Makefile.in, doc/tree-ssa.texi, and a couple of source files.  This
>patch
>removes the references to these now-nonexistent files.
>
>OK to checkin?

Ok.

Thanks,
Richard.

>Steve Ellcey
>sell...@mips.com
>
>
>2014-01-09  Steve Ellcey  
>
>   * Makefile.in (TREE_FLOW_H): Remove.
>   (TREE_SSA_H): Add files names from tree-flow.h.
>   * doc/tree-ssa.texi (Annotations): Remove reference to tree-flow.h
>   * tree.h: Remove tree-flow.h reference.
>   * hash-table.h: Remove tree-flow.h reference.
>   * tree-ssa-loop-niter.c (dump_affine_iv): Replace tree-flow.h
>   reference with tree-ssa-loop.h.
>
>
>diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>index 459b1ba..8eb4f68 100644
>--- a/gcc/Makefile.in
>+++ b/gcc/Makefile.in
>@@ -929,11 +929,10 @@ CPP_ID_DATA_H = $(CPPLIB_H)
>$(srcdir)/../libcpp/include/cpp-id-data.h
> CPP_INTERNAL_H = $(srcdir)/../libcpp/internal.h $(CPP_ID_DATA_H)
> TREE_DUMP_H = tree-dump.h $(SPLAY_TREE_H) $(DUMPFILE_H)
> TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H)
>-TREE_FLOW_H = tree-flow.h tree-flow-inline.h tree-ssa-operands.h \
>+TREE_SSA_H = tree-ssa.h tree-ssa-operands.h \
>   $(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \
>   $(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
>   tree-ssa-alias.h
>-TREE_SSA_H = tree-ssa.h $(TREE_FLOW_H)
> PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H)
> TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H)
> GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H)
>diff --git a/gcc/doc/tree-ssa.texi b/gcc/doc/tree-ssa.texi
>index 391dba8..e0238bd 100644
>--- a/gcc/doc/tree-ssa.texi
>+++ b/gcc/doc/tree-ssa.texi
>@@ -53,9 +53,6 @@ variable has aliases.  All these attributes are
>stored in data
> structures called annotations which are then linked to the field
> @code{ann} in @code{struct tree_common}.
> 
>-Presently, we define annotations for variables (@code{var_ann_t}).
>-Annotations are defined and documented in @file{tree-flow.h}.
>-
> 
> @node SSA Operands
> @section SSA Operands
>diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>index 2b04067..034385c 100644
>--- a/gcc/hash-table.h
>+++ b/gcc/hash-table.h
>@@ -1050,10 +1050,7 @@ hash_table ::end ()
> 
> /* Iterate through the elements of hash_table HTAB,
>using hash_table <>::iterator ITER,
>-   storing each element in RESULT, which is of type TYPE.
>-
>-   This macro has this form for compatibility with the
>-   FOR_EACH_HTAB_ELEMENT currently defined in tree-flow.h.  */
>+   storing each element in RESULT, which is of type TYPE.  */
> 
> #define FOR_EACH_HASH_TABLE_ELEMENT(HTAB, RESULT, TYPE, ITER) \
>   for ((ITER) = (HTAB).begin (); \
>diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>index 5a10297..7628363 100644
>--- a/gcc/tree-ssa-loop-niter.c
>+++ b/gcc/tree-ssa-loop-niter.c
>@@ -1311,7 +1311,7 @@ dump_affine_iv (FILE *file, affine_iv *iv)
>if EVERY_ITERATION is true, we know the test is executed on every
>iteration.
> 
>The results (number of iterations and assumptions as described in
>-   comments at struct tree_niter_desc in tree-flow.h) are stored to
>NITER.
>+   comments at struct tree_niter_desc in tree-ssa-loop.h) are stored
>to NITER.
>Returns false if it fails to determine number of iterations, true if it
>was determined (possibly with some assumptions).  */
> 
>diff --git a/gcc/tree.h b/gcc/tree.h
>index fa79b6f..67454b7 100644
>--- a/gcc/tree.h
>+++ b/gcc/tree.h
>@@ -1114,9 +1114,6 @@ extern void protected_set_expr_location (tree,
>location_t);
>the given label expression.  */
>#define LABEL_EXPR_LABEL(NODE)  TREE_OPERAND (LABEL_EXPR_CHECK (NODE),
>0)
> 
>-/* VDEF_EXPR accessors are specified in tree-flow.h, along with the
>other
>-   accessors for SSA operands.  */
>-
> /* CATCH_EXPR accessors.  */
> #define CATCH_TYPES(NODE) TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 0)
> #define CATCH_BODY(NODE)  TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 1)

Re: [PATCH] Ignore DECL_ALIGN of SSA_NAME underlying decls for dynamic stack realignment (PR middle-end/47735)

2014-01-09 Thread Richard Biener

Jakub Jelinek  wrote:
>Hi!
>
>As discussed in the PR, if a var isn't addressable and has gimple reg
>type,
>I don't see any point in honoring its DECL_ALIGN; we only refer to the
>var through SSA_NAME_VAR of SSA_NAMEs, nothing is allocated on the
>stack
>immediately and the SSA_NAMEs are turned into pseudos for which we only
>care
>about their modes and corresponding alignments if they need to be
>spilled to
>stack.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

>2014-01-09  Jakub Jelinek  
>
>   PR middle-end/47735
>   * cfgexpand.c (expand_one_var): For SSA_NAMEs, if the underlying
>   var satisfies use_register_for_decl, just take into account type
>   alignment, rather than decl alignment.
>
>   * gcc.target/i386/pr47735.c: New test.
>
>--- gcc/cfgexpand.c.jj 2014-01-08 19:37:33.630986939 +0100
>+++ gcc/cfgexpand.c2014-01-09 13:38:45.073324129 +0100
>@@ -1215,8 +1215,11 @@ expand_one_var (tree var, bool toplevel,
>we conservatively assume it will be on stack even if VAR is
>eventually put into register after RA pass.  For non-automatic
>variables, which won't be on stack, we collect alignment of
>-   type and ignore user specified alignment.  */
>-  if (TREE_STATIC (var) || DECL_EXTERNAL (var))
>+   type and ignore user specified alignment.  Similarly for
>+   SSA_NAMEs for which use_register_for_decl returns true.  */
>+  if (TREE_STATIC (var)
>+|| DECL_EXTERNAL (var)
>+|| (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl
>(var)))
>   align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
>  TYPE_MODE (TREE_TYPE (var)),
>  TYPE_ALIGN (TREE_TYPE (var)));
>--- gcc/testsuite/gcc.target/i386/pr47735.c.jj 2014-01-09
>13:30:14.410941107 +0100
>+++ gcc/testsuite/gcc.target/i386/pr47735.c2014-01-09
>13:28:45.0 +0100
>@@ -0,0 +1,16 @@
>+/* PR middle-end/47735 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fomit-frame-pointer" } */
>+
>+unsigned
>+mulh (unsigned a, unsigned b)
>+{
>+  unsigned long long l __attribute__ ((aligned (32)))
>+= ((unsigned long long) a * (unsigned long long) b) >> 32;
>+  return l;
>+}
>+
>+/* No need to dynamically realign the stack here.  */
>+/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
>+/* Nor use a frame pointer.  */
>+/* { dg-final { scan-assembler-not "%\[re\]bp" } } */
>
>   Jakub

[MIPS, committed] Backport bswap patches to 4.8

2014-01-09 Thread Richard Sandiford

I got an off-list request to backport the bswap patterns to 4.8.
They've been in trunk for a while without any problems being reported
and they should be relatively safe.

Here's what I applied after testing on mips64-linux-gnu (with
--with-arch=mips64r2).  The only difference is the use of:

  [(set_attr "length" "8")]

rather than trunk's:

  [(set_attr "insn_count" "2")]

Thanks,
Richard
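
For readers unfamiliar with the instructions, the following sketch models in plain C++ why the split patterns below amount to a full byte swap: WSBH followed by a rotate right by 16 for 32 bits, and DSBH followed by DSHD for 64 bits. The function names are illustrative only and the bodies are a software model of the instruction semantics as I understand them, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of WSBH: swap the two bytes inside each 16-bit halfword.
std::uint32_t model_wsbh(std::uint32_t x) {
    return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

// Rotate right by n bits, for 0 < n < 32.
std::uint32_t model_rotr32(std::uint32_t x, unsigned n) {
    return (x >> n) | (x << (32 - n));
}

// bswapsi2 as split by the pattern: WSBH, then rotate right by 16.
std::uint32_t model_bswap32(std::uint32_t x) {
    return model_rotr32(model_wsbh(x), 16);
}

// Model of DSBH: swap the two bytes inside each halfword of a doubleword.
std::uint64_t model_dsbh(std::uint64_t x) {
    const std::uint64_t m = 0x00ff00ff00ff00ffull;
    return ((x & m) << 8) | ((x >> 8) & m);
}

// Model of DSHD: reverse the order of the four halfwords in a doubleword.
std::uint64_t model_dshd(std::uint64_t x) {
    const std::uint64_t m = 0x0000ffff0000ffffull;
    x = ((x & m) << 16) | ((x >> 16) & m);  // swap halfwords within words
    return (x << 32) | (x >> 32);           // swap the two words
}

// bswapdi2 as split by the pattern: DSBH, then DSHD.
std::uint64_t model_bswap64(std::uint64_t x) {
    return model_dshd(model_dsbh(x));
}
```

Each split is two instructions, which is consistent with the "length" of 8 bytes used in this backport in place of trunk's "insn_count" of 2.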


gcc/
* config/mips/mips.h (ISA_HAS_WSBH): Define.
* config/mips/mips.md (UNSPEC_WSBH, UNSPEC_DSBH, UNSPEC_DSHD): New
constants.
(bswaphi2, bswapsi2, bswapdi2, wsbh, dsbh, dshd): New patterns.

gcc/testsuite/
* gcc.target/mips/bswap-1.c, gcc.target/mips/bswap-2.c,
gcc.target/mips/bswap-3.c, gcc.target/mips/bswap-4.c,
gcc.target/mips/bswap-5.c, gcc.target/mips/bswap-6.c: New tests.

Index: gcc/config/mips/mips.h
===
--- gcc/config/mips/mips.h  2014-01-09 14:59:58.086893612 +
+++ gcc/config/mips/mips.h  2014-01-09 15:01:20.065606374 +
@@ -949,6 +949,11 @@ #define ISA_HAS_ROR((ISA_MIPS32R2  
\
  || TARGET_SMARTMIPS)  \
 && !TARGET_MIPS16)
 
+/* ISA has the WSBH (word swap bytes within halfwords) instruction.
+   64-bit targets also provide DSBH and DSHD.  */
+#define ISA_HAS_WSBH   ((ISA_MIPS32R2 || ISA_MIPS64R2) \
+&& !TARGET_MIPS16)
+
 /* ISA has data prefetch instructions.  This controls use of 'pref'.  */
 #define ISA_HAS_PREFETCH   ((ISA_MIPS4 \
  || TARGET_LOONGSON_2EF\
Index: gcc/config/mips/mips.md
===
--- gcc/config/mips/mips.md 2014-01-09 15:01:08.387504683 +
+++ gcc/config/mips/mips.md 2014-01-09 15:02:06.590011975 +
@@ -73,6 +73,11 @@ (define_c_enum "unspec" [
   UNSPEC_STORE_LEFT
   UNSPEC_STORE_RIGHT
 
+  ;; Integer operations that are too cumbersome to describe directly.
+  UNSPEC_WSBH
+  UNSPEC_DSBH
+  UNSPEC_DSHD
+
   ;; Floating-point moves.
   UNSPEC_LOAD_LOW
   UNSPEC_LOAD_HIGH
@@ -5379,6 +5384,56 @@ (define_insn "rotr3"
 }
   [(set_attr "type" "shift")
(set_attr "mode" "")])
+
+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0 "register_operand" "=d")
+   (bswap:HI (match_operand:HI 1 "register_operand" "d")))]
+  "ISA_HAS_WSBH"
+  "wsbh\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn_and_split "bswapsi2"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+   (bswap:SI (match_operand:SI 1 "register_operand" "d")))]
+  "ISA_HAS_WSBH && ISA_HAS_ROR"
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_WSBH))
+   (set (match_dup 0) (rotatert:SI (match_dup 0) (const_int 16)))]
+  ""
+  [(set_attr "length" "8")])
+
+(define_insn_and_split "bswapdi2"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+   (bswap:DI (match_operand:DI 1 "register_operand" "d")))]
+  "TARGET_64BIT && ISA_HAS_WSBH"
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:DI [(match_dup 1)] UNSPEC_DSBH))
+   (set (match_dup 0) (unspec:DI [(match_dup 0)] UNSPEC_DSHD))]
+  ""
+  [(set_attr "length" "8")])
+
+(define_insn "wsbh"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+   (unspec:SI [(match_operand:SI 1 "register_operand" "d")] UNSPEC_WSBH))]
+  "ISA_HAS_WSBH"
+  "wsbh\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "dsbh"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+   (unspec:DI [(match_operand:DI 1 "register_operand" "d")] UNSPEC_DSBH))]
+  "TARGET_64BIT && ISA_HAS_WSBH"
+  "dsbh\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "dshd"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+   (unspec:DI [(match_operand:DI 1 "register_operand" "d")] UNSPEC_DSHD))]
+  "TARGET_64BIT && ISA_HAS_WSBH"
+  "dshd\t%0,%1"
+  [(set_attr "type" "shift")])
 
 ;;
 ;;  
Index: gcc/testsuite/gcc.target/mips/bswap-1.c
===
--- /dev/null   2013-12-26 20:29:50.272541227 +
+++ gcc/testsuite/gcc.target/mips/bswap-1.c 2014-01-09 15:01:20.067606391 
+
@@ -0,0 +1,10 @@
+/* { dg-options "isa_rev>=2" } */
+/* { dg-skip-if "bswap recognition needs expensive optimizations" { *-*-* } { 
"-O0" "-O1" } { "" } } */
+
+NOMIPS16 unsigned short
+foo (unsigned short x)
+{
+  return ((x << 8) & 0xff00) | ((x >> 8) & 0xff);
+}
+
+/* { dg-final { scan-assembler "\twsbh\t" } } */
Index: gcc/testsuite/gcc.target/mips/bswap-2.c
===
--- /dev/null   2013-12-26 20:29:50.272541227 +
+++ gcc/testsuite/gcc.target/mips/bswap-2.c 2014-01-09 15:01:20.067606391 
+
@@ -0,0 +1,9 @@
+/* { dg-options "isa_rev>=2" } */
+
+NOMIPS16 unsigned short
+foo (unsigned short x)
+{
+

Re: [MIPS, committed] Revert some Octeon BADDU patches

2014-01-09 Thread Richard Sandiford

Eric Botcazou  writes:
>> This patch just reverts some changes I'd made to the BADDU patterns
>> for the infamous (truncate:QI (plus:SI ...)) -> (plus:QI ...)
>> simplification. That simplification was limited to CISCy targets for PR
>> 58295.
>> 
>> Tested on mips64-linux-gnu and applied.  It fixes the octeon-baddu-1.c
>> failures.
>
> You presumably need to apply it to the 4.8 branch as well.

Thanks for the heads up.  Applied there too after testing on
mips64-linux-gnu.

Richard

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 5)

2014-01-09 Thread Richard Biener

Jakub Jelinek  wrote:
>On Thu, Jan 09, 2014 at 02:27:40PM +0100, Richard Biener wrote:
>> > Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the
>condition
>> > we can just drop the lhs always in that case, just doing what we do
>for
>> > __builtin_unreachable if lhs is SSA_NAME:
>> >   tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
>> >   tree def = get_or_create_ssa_default_def (cfun, var);
>> >   gsi_insert_after (gsi, gimple_build_assign (lhs, def),
>GSI_NEW_STMT);
>> 
>> That works for me.
>
>So like this?
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok?

Ok,
Thanks.

>2014-01-09  Jakub Jelinek  
>
>   PR tree-optimization/59622
>   * gimple-fold.c (gimple_fold_call): Fix a typo in message.  For
>   __builtin_unreachable replace the OBJ_TYPE_REF call with a call to
>   __builtin_unreachable and add if needed a setter of the lhs SSA_NAME.
>   Don't devirtualize for inplace at all.  For targets.length () == 1,
>   if the call is noreturn and cfun isn't in SSA form yet, clear lhs.
>
>   * g++.dg/opt/pr59622-2.C: New test.
>   * g++.dg/opt/pr59622-3.C: New test.
>   * g++.dg/opt/pr59622-4.C: New test.
>   * g++.dg/opt/pr59622-5.C: New test.
>
>--- gcc/gimple-fold.c.jj   2014-01-08 17:44:57.690582374 +0100
>+++ gcc/gimple-fold.c  2014-01-09 14:34:40.816149806 +0100
>@@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator *
>(OBJ_TYPE_REF_EXPR (callee)
>   {
> fprintf (dump_file,
>- "Type inheritnace inconsistent devirtualization of ");
>+ "Type inheritance inconsistent devirtualization of ");
> print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> fprintf (dump_file, " to ");
> print_generic_expr (dump_file, callee, TDF_SLIM);
>@@ -1177,24 +1177,45 @@ gimple_fold_call (gimple_stmt_iterator *
> gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
> changed = true;
>   }
>-  else if (flag_devirtualize && virtual_method_call_p (callee))
>+  else if (flag_devirtualize && !inplace && virtual_method_call_p
>(callee))
>   {
> bool final;
> vec targets
>   = possible_polymorphic_call_targets (callee, &final);
> if (final && targets.length () <= 1)
>   {
>+tree lhs = gimple_call_lhs (stmt);
> if (targets.length () == 1)
>   {
> gimple_call_set_fndecl (stmt, targets[0]->decl);
> changed = true;
>+/* If the call becomes noreturn, remove the lhs.  */
>+if (lhs && (gimple_call_flags (stmt) & ECF_NORETURN))
>+  {
>+if (TREE_CODE (lhs) == SSA_NAME)
>+  {
>+tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
>+tree def = get_or_create_ssa_default_def (cfun, var);
>+gimple new_stmt = gimple_build_assign (lhs, def);
>+gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>+  }
>+gimple_call_set_lhs (stmt, NULL_TREE);
>+  }
>   }
>-else if (!inplace)
>+else
>   {
> tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
> gimple new_stmt = gimple_build_call (fndecl, 0);
> gimple_set_location (new_stmt, gimple_location (stmt));
>-gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>+if (lhs && TREE_CODE (lhs) == SSA_NAME)
>+  {
>+tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
>+tree def = get_or_create_ssa_default_def (cfun, var);
>+gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>+update_call_from_tree (gsi, def);
>+  }
>+else
>+  gsi_replace (gsi, new_stmt, true);
> return true;
>   }
>   }
>--- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj2014-01-09
>10:57:46.246694025 +0100
>+++ gcc/testsuite/g++.dg/opt/pr59622-2.C   2014-01-09 10:57:46.246694025
>+0100
>@@ -0,0 +1,21 @@
>+// PR tree-optimization/59622
>+// { dg-do compile }
>+// { dg-options "-O2" }
>+
>+namespace
>+{
>+  struct A
>+  {
>+A () {}
>+virtual A *bar (int) = 0;
>+A *baz (int x) { return bar (x); }
>+  };
>+}
>+
>+A *a;
>+
>+void
>+foo ()
>+{
>+  a->baz (0);
>+}
>--- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj2014-01-09
>10:57:46.247694040 +0100
>+++ gcc/testsuite/g++.dg/opt/pr59622-3.C   2014-01-09 10:57:46.247694040
>+0100
>@@ -0,0 +1,21 @@
>+// PR tree-optimization/59622
>+// { dg-do compile }
>+// { dg-options "-O2" }
>+
>+struct C { int a; int b; };
>+
>+namespace
>+{
>+  struct A
>+  {
>+virtual C foo ();
>+C bar () { return foo (); }
>+

[Patch, Fortran, committed] Fix buglet in cpp.c

2014-01-09 Thread Tobias Burnus


Committed as obvious: Rev. 206487.

Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 206486)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2014-01-09  Tobias Burnus  
+
+	* cpp.c (gfc_cpp_handle_option): Add missing break.
+	* trans-io.c (transfer_expr): Silence unused value warning.
+
 2014-01-08  Janus Weil  
 
 	PR fortran/58182
Index: gcc/fortran/cpp.c
===
--- gcc/fortran/cpp.c	(Revision 206486)
+++ gcc/fortran/cpp.c	(Arbeitskopie)
@@ -363,6 +363,7 @@ gfc_cpp_handle_option (size_t scode, const char *a
 
 case OPT_Wdate_time:
   gfc_cpp_option.warn_date_time = value;
+  break;
 
 case OPT_A:
 case OPT_D:
Index: gcc/fortran/trans-io.c
===
--- gcc/fortran/trans-io.c	(Revision 206486)
+++ gcc/fortran/trans-io.c	(Arbeitskopie)
@@ -2152,7 +2152,7 @@ transfer_expr (gfc_se * se, gfc_typespec * ts, tre
 	 function, if only referenced in an io statement, requires this
 	 check (see PR58771).  */
   if (ts->u.derived->backend_decl == NULL_TREE)
-	tmp = gfc_typenode_for_spec (ts);
+	(void) gfc_typenode_for_spec (ts);
 
   for (c = ts->u.derived->components; c; c = c->next)
 	{
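
As an illustration of the cpp.c buglet, the sketch below shows how a case without a break falls into the following case and runs its code as well. The option codes and the State structure are hypothetical stand-ins; only the control-flow shape mirrors the hunk above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the option codes handled in the switch.
enum Option { OPT_WDATE_TIME_LIKE, OPT_D_LIKE };

struct State {
    int warn_date_time = 0;
    int deferred_options = 0;  // bumped by the case that follows
};

// Shape of the code before the fix.
void handle_without_break(State& s, Option code, int value) {
    switch (code) {
    case OPT_WDATE_TIME_LIKE:
        s.warn_date_time = value;
        // falls through - the missing break is the bug
    case OPT_D_LIKE:
        ++s.deferred_options;
        break;
    }
}

// Shape of the code after the fix.
void handle_with_break(State& s, Option code, int value) {
    switch (code) {
    case OPT_WDATE_TIME_LIKE:
        s.warn_date_time = value;
        break;
    case OPT_D_LIKE:
        ++s.deferred_options;
        break;
    }
}
```

With the break in place, handling the first option no longer has the side effect that belongs to the second.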

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-09 Thread Rong Xu

My bad.
Thanks for the fix!

-Rong

On Thu, Jan 9, 2014 at 11:47 AM, H.J. Lu  wrote:
> On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu  wrote:
>> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>>>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
>>>>  #endif
>>>>  /* crc32 for this program.  */
>>>>  static gcov_unsigned_t crc32;
>>>> +/* Use this summary checksum rather the computed one if the value is
>>>> + *non-zero.  */
>>>> +static gcov_unsigned_t saved_summary_checksum;
>>>
>>> Why do you need to save the checksum? Won't it reset summary back with 
>>> multiple streaming?
>>
>> This was for the gcov_tool. checksum will be recomputed in gcov_exit
>> and the value will depend on
>> the order of gcov_info list. (the order will be different after
>> reading from gcda files to memory). The purpose was
>> to have the same summary_checksum so that I can get identical gcov-dump 
>> output.
>>
>>>
>>> I would really like to avoid introducing those static vars that are used 
>>> exclusively
>>> by gcov_exit.  What about putting them into an gcov_context structure that
>>> is passed around the functions that was broken out?
>>
>> With my recent patch that localizes this_prg, we only use 64 more
>> bytes in bss. Do you still think we have to remove
>> all these statics?
>>
>>>
>
> libgcc ChangeLog entries should be in libgcc/ChangeLog,
> not gcc/ChangeLog.  I checked in a patch to move them
> to libgcc/ChangeLog.
>
> --
> H.J.

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-09 Thread H.J. Lu

On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu  wrote:
> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
>>>  #endif
>>>  /* crc32 for this program.  */
>>>  static gcov_unsigned_t crc32;
>>> +/* Use this summary checksum rather the computed one if the value is
>>> + *non-zero.  */
>>> +static gcov_unsigned_t saved_summary_checksum;
>>
>> Why do you need to save the checksum? Won't it reset summary back with 
>> multiple streaming?
>
> This was for the gcov_tool. checksum will be recomputed in gcov_exit
> and the value will depend on
> the order of gcov_info list. (the order will be different after
> reading from gcda files to memory). The purpose was
> to have the same summary_checksum so that I can get identical gcov-dump 
> output.
>
>>
>> I would really like to avoid introducing those static vars that are used 
>> exclusively
>> by gcov_exit.  What about putting them into an gcov_context structure that
>> is passed around the functions that was broken out?
>
> With my recent patch that localizes this_prg, we only use 64 more
> bytes in bss. Do you still think we have to remove
> all these statics?
>
>>

libgcc ChangeLog entries should be in libgcc/ChangeLog,
not gcc/ChangeLog.  I checked in a patch to move them
to libgcc/ChangeLog.

-- 
H.J.

Re: PATCH: Remove the unused btver1

2014-01-09 Thread Uros Bizjak

On Thu, Jan 9, 2014 at 8:32 PM, H.J. Lu  wrote:

> btver1 is never used.  This patch removes it.   It avoids:
>
> insn-attrtab.c:extern int internal_dfa_insn_code_btver1 (rtx);
> insn-attrtab.c:extern int insn_default_latency_btver1 (rtx);
>
> OK to install?

OK.

Thanks,
Uros.

PATCH: Remove the unused btver1

2014-01-09 Thread H.J. Lu

Hi Uros,

btver1 is never used.  This patch removes it.   It avoids:

insn-attrtab.c:extern int internal_dfa_insn_code_btver1 (rtx);
insn-attrtab.c:extern int insn_default_latency_btver1 (rtx);

OK to install?

Thanks.

-- 
H.J.
--
2014-01-09  H.J. Lu  

* config/i386/i386.md (cpu): Remove the unused btver1.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index de0b2dd..954bbed 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -366,7 +366,7 @@
 ;; Processor type.
 (define_attr "cpu" "none,pentium,pentiumpro,geode,k6,athlon,k8,core2,nehalem,
 atom,slm,generic,amdfam10,bdver1,bdver2,bdver3,bdver4,
-btver1,btver2"
+btver2"
   (const (symbol_ref "ix86_schedule")))

 ;; A basic instruction type.  Refinements due to arguments to be

Re: [Patch, testsuite, mips] Fix test gcc.dg/delay-slot-1.c for MIPS

2014-01-09 Thread Richard Sandiford

"Steve Ellcey "  writes:
> 2014-01-09  Steve Ellcey  
>
>   * gcc.dg/delay-slot-1.c: Add check for 64 bit support.

OK, thanks.  Pedantically it's "Restrict -mabi=64 to 64-bit processors.",
since we're not really checking whether the support is there, but whether
n64 is compatible with the currently-selected processor.

Last time I looked, user options take precedence over test options,
so someone testing specific -mabi options (like I do for mips64-linux-gnu)
won't be affected either way.  And for people testing -march= without -mabi=,
or people using toolchains configured using --with-arch=<32-bit-proc>,
I agree the change is a good thing.

Thanks,
Richard

Re: [PATCH,rs6000] Add -maltivec={le,be} options

2014-01-09 Thread David Edelsohn

On Thu, Jan 9, 2014 at 1:14 PM, Bill Schmidt
 wrote:
> Thanks for the comments!  Here is a second go-round at the patch with
> improved documentation.  I'm happy to change the wording if it can be
> further improved.
>
> Thanks,
> Bill
>
> 2014-01-09  Bill Schmidt  
>
> * doc/invoke.texi: Add -maltivec={be,le} options, and document
> default element-order behavior for -maltivec.
> * config/rs6000/rs6000.opt: Add -maltivec={be,le} options.
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure
> that -maltivec={le,be} implies -maltivec; disallow -maltivec=le
> when targeting big endian, at least for now.
> * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG.

The patch and text look good, with the markup fixes requested by Joseph.

Thanks, David

[Patch, testsuite, mips] Fix test gcc.dg/delay-slot-1.c for MIPS

2014-01-09 Thread Steve Ellcey


The gcc.dg/delay-slot-1.c test is failing for MIPS targets that
do not support the 64 bit ABI because it didn't check to see
if that support existed before using the -mabi=64 flag.

This patch fixes the problem by using the mips64 check.

OK to checkin?

Steve Ellcey
sell...@mips.com


2014-01-09  Steve Ellcey  

* gcc.dg/delay-slot-1.c: Add check for 64 bit support.



diff --git a/gcc/testsuite/gcc.dg/delay-slot-1.c 
b/gcc/testsuite/gcc.dg/delay-slot-1.c
index f3bcd8e..bfc0273 100644
--- a/gcc/testsuite/gcc.dg/delay-slot-1.c
+++ b/gcc/testsuite/gcc.dg/delay-slot-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* { dg-options "-O2 -mabi=64" { target mips-*-linux-* } } */
+/* { dg-options "-O2 -mabi=64" { target { mips*-*-linux* && mips64 } } } */
 
 struct offset_v1 {
 int k_uniqueness;

Re: [PATCH,rs6000] Add -maltivec={le,be} options

2014-01-09 Thread Joseph S. Myers

On Thu, 9 Jan 2014, Bill Schmidt wrote:

> +When -maltivec is used, rather than -maltivec=le or -maltivec=be, the
> +element order for Altivec intrinsics such as vec_splat, vec_extract,
> +and vec_insert will match array element order corresponding to the
> +endianness of the target.  That is, element zero identifies the
> +leftmost element in a vector register when targeting a big-endian
> +platform, and identifies the rightmost element in a vector register
> +when targeting a little-endian platform.

Use @option{} markup around option names and @code{} around intrinsic 
names, here and in the discussion of intrinsics under individual options.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: std::vector move assign patch

2014-01-09 Thread Jonathan Wakely

On 9 January 2014 12:22, H.J. Lu wrote:
> On Fri, Dec 27, 2013 at 10:27 AM, François Dumont  
> wrote:
>> Hi
>>
>> Here is a patch to fix an issue in normal mode during the move
>> assignment. The destination vector allocator instance is moved too during
>> the assignment which is wrong.
>>
>> As I discovered this problem while working on issues with management of
>> safe iterators during move operations, this patch also fixes those issues in
>> the debug mode for the vector container. Fixes for other containers in debug
>> mode will come later.
>>
>> 2013-12-27  François Dumont 
>>
>> * include/bits/stl_vector.h (std::vector<>::_M_move_assign): Pass
>> *this allocator instance when building temporary vector instance
>> so that *this allocator do not get moved.
>> * include/debug/safe_base.h
>> (_Safe_sequence_base(_Safe_sequence_base&&)): New.
>> * include/debug/vector (__gnu_debug::vector<>(vector&&)): Use
>> latter.
>> (__gnu_debug::vector<>(vector&&, const allocator_type&)): Swap
>> safe iterators if the instance is moved.
>> (__gnu_debug::vector<>::operator=(vector&&)): Likewise.
>> * testsuite/23_containers/vector/allocator/move.cc (test01): Add
>> check on a vector iterator.
>> * testsuite/23_containers/vector/allocator/move_assign.cc
>> (test02): Likewise.
>> (test03): New, test with a non-propagating allocator.
>> * testsuite/23_containers/vector/debug/move_assign_neg.cc: New.
>>
>> Tested under Linux x86_64 normal and debug modes.
>>
>> I will be on vacation for a week starting today, so if you want to apply it
>> quickly do not hesitate to do it yourself.
>>
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59738

Fixed by the attached patch, tested x86_64-linux and committed to
trunk.  I've also rotated the libstdc++ ChangeLog.
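
The idea behind the fix can be sketched outside libstdc++ as follows: rather than move-constructing a temporary from *this, create an empty temporary and swap the storage pointers into it, so nothing about the element type is ever required. ToyBuffer and Pinned are hypothetical names for this illustration, and the single-pointer storage is a deliberate simplification of the real vector internals.

```cpp
#include <cassert>
#include <utility>

// Element type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

// Toy owner of one heap element; stands in for the vector's storage
// pointers (hypothetical, not libstdc++'s actual implementation).
class ToyBuffer {
    Pinned* data_ = nullptr;

public:
    ToyBuffer() = default;
    explicit ToyBuffer(int v) : data_(new Pinned(v)) {}
    ~ToyBuffer() { delete data_; }
    ToyBuffer(const ToyBuffer&) = delete;
    ToyBuffer& operator=(const ToyBuffer&) = delete;

    // Exchange storage pointers only; elements are never touched.
    void swap_data(ToyBuffer& other) noexcept {
        std::swap(data_, other.data_);
    }

    // Move assignment in the style of the patch: empty temporary,
    // then two pointer swaps.
    ToyBuffer& operator=(ToyBuffer&& rhs) noexcept {
        ToyBuffer tmp;    // owns nothing yet
        swap_data(tmp);   // tmp now owns our old storage
        swap_data(rhs);   // we take rhs's storage; rhs is left empty
        return *this;     // tmp's destructor releases the old storage
    }

    bool empty() const { return data_ == nullptr; }
    int front() const { return data_->value; }
};
```

Because only pointers are exchanged, the assignment compiles and works even though Pinned has no usable copy or move operations.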


2014-01-09  Jonathan Wakely  

PR libstdc++/59738
* include/bits/stl_vector.h (vector<>::_M_move_assign): Restore
support for non-Movable types.
commit c12a0d112781150c2888de7c63960e22ef4ffcbb
Author: Jonathan Wakely 
Date:   Thu Jan 9 16:50:50 2014 +

PR libstdc++/59738
* include/bits/stl_vector.h (vector<>::_M_move_assign): Restore
support for non-Movable types.

diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 3638a8c..2cedd39 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1433,7 +1433,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   void
   _M_move_assign(vector&& __x, std::true_type) noexcept
   {
-   const vector __tmp(std::move(*this), get_allocator());
+   vector __tmp(get_allocator());
+   this->_M_impl._M_swap_data(__tmp._M_impl);
this->_M_impl._M_swap_data(__x._M_impl);
if (_Alloc_traits::_S_propagate_on_move_assign())
  std::__alloc_on_move(_M_get_Tp_allocator(),

Re: [PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types

2014-01-09 Thread Joseph S. Myers

On Thu, 9 Jan 2014, Richard Henderson wrote:

> > This isn't the right conditional.  _FP_W_TYPE_SIZE is ultimately an 
> > optimization choice and need not be related to whether any TImode 
> > functions are being defined using soft-fp, or whether TImode is supported 
> > at all.  I think the most you can do is have sfp-machine.h define a macro 
> > to say that TImode should be supported in soft-fp, rather than actually 
> > defining the types itself.
> 
> The documentation for longlong.h says we must have a double-word type defined.
> Given how easy it is to support a double-word type...

I suppose that's a reason to define TImode types under that condition 
unless and until soft-fp is used with _FP_W_TYPE_SIZE == 64 for an 
architecture not supporting them (there's also the possibility it might be 
used with _FP_W_TYPE_SIZE == 32 but with TImode support wanted, though 
defining the types in sfp-machine.h would of course be possible then).  
But of course the patches need proposing for glibc first (for longlong.h 
things are less clear, as long as a patch applied to one place is promptly 
then applied to the other).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: PATCH: Put a breakpoint on __sanitizer::Report

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 10:28:56AM -0800, H.J. Lu wrote:
> Hi,
> 
> This patch puts a breakpoint on __sanitizer::Report to help
> with debugging sanitizer issues.  OK to install?

Ok.

> 2014-01-09  H.J. Lu  
> 
> * gdbasan.in: Put a breakpoint on __sanitizer::Report.
> diff --git a/gcc/gdbasan.in b/gcc/gdbasan.in
> index cf05825..3a6fca0 100644
> --- a/gcc/gdbasan.in
> +++ b/gcc/gdbasan.in
> @@ -1,3 +1,7 @@
>  # Put a breakpoint on __asan_report_error to help with debugging buffer
>  # overflow.
>  b __asan_report_error
> +
> +# Put a breakpoint on __sanitizer::Report to help with debugging sanitizer
> +# issues.
> +b __sanitizer::Report

Jakub

Re: [patch] PR56572 flatten unnecessary nested transactions after inlining

2014-01-09 Thread Aldy Hernandez


On 01/06/14 13:40, Richard Henderson wrote:

On 12/19/2013 11:06 AM, Richard Biener wrote:

Aldy Hernandez  wrote:

I'd still like to catch the common cases, like I do with this patch.

Perhaps we move this code to the .tmmark pass and handle the
uninstrumented case.  rth?


tmmark is way way later than you'd want.  I believe that ipa_tm is the right
place.  That's where we generate clones.  The clones know a-priori that they're
called within a transaction and thus all internal transactions may be
eliminated.  And thus any inlining that would happen after ipa_tm would
properly inline the clone, and thus no more need be done.


I have a patch (attached for reference) removing the nested transactions 
while we are creating the clones (as suggested), but the uninstrumented 
code path complicates things.  I'm afraid I don't have any good news.


Consider this:

inline void f() {
  __transaction_atomic {
    a = 12345;
  }
}

void g() {
  __transaction_atomic {
    f();
  }
}

The problem is that when we add the uninstrumented code path later in 
tmipa, we end up with the following for g():


g ()
{
  :
  __transaction_atomic  // SUBCODE=[ GTMA_HAVE_LOAD GTMA_HAVE_STORE ]
  goto ;

  :
  f ();/* < uninstrumented path */
  __builtin__ITM_commitTransaction ();
  goto ;

  :
  f (); [tm-clone]/* < instrumented path */
  __builtin__ITM_commitTransaction ();

  :
  return;

}

Since we only removed the transaction in the clone of f(), plain regular 
f() will still have the additional transaction, so inlining will still 
yield a g() with a nested transaction in the uninstrumented path.
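
To make the redundancy concrete, here is a minimal C sketch, assuming flat nesting where an inner transaction is simply subsumed by the outermost one.  tx_begin, tx_commit and the counters are hypothetical stand-ins, not the libitm interface; the point is only that the inner begin/commit pair in plain f() does no real work once f() runs inside g()'s transaction, which is what a flattening pass would exploit.

```c
/* Toy model of flat nesting: only the outermost begin/commit pair
   does real work.  All names here are hypothetical.  */
static int tx_depth;      /* current nesting depth */
static int real_begins;   /* begins that actually started a transaction */
static int real_commits;  /* commits that actually committed */
static int a;

static void tx_begin (void)
{
  if (tx_depth++ == 0)
    real_begins++;
}

static void tx_commit (void)
{
  if (--tx_depth == 0)
    real_commits++;
}

/* f() as written: it carries its own transaction.  */
static void f (void)
{
  tx_begin ();
  a = 12345;
  tx_commit ();
}

/* f() as a clone known to run inside a transaction: the inner
   begin/commit pair has been dropped.  */
static void f_flattened (void)
{
  a = 12345;
}

/* g() calling plain f(): a nested begin/commit pair remains.  */
static void g_nested (void)
{
  tx_begin ();
  f ();
  tx_commit ();
}

/* g() calling the flattened body: a single transaction.  */
static void g_flat (void)
{
  tx_begin ();
  f_flattened ();
  tx_commit ();
}
```

Both g_nested() and g_flat() perform exactly one real begin and one real commit in this model; the nested variant just pays for an extra pair of calls.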


So we're back to square one, needing a separate pass to remove the 
nested transactions, and this pass will unfortunately have to deal with 
the uninstrumented/instrumented paths.


This has taken longer to fix than I expected, so I'm going to put this 
aside for now and concentrate on some P1-P2's.  For the record, since 
you don't like this pass in the .tmmark pass which is WAY late, can we 
have a tree pass right after the IPA passes (thus after inlining)?


I'll add some notes to the PR so we can pick this up later.

Aldy
diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index fe6dc28..59b589c 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -1,5 +1,7 @@
 /* Passes for transactional memory support.
Copyright (C) 2008-2014 Free Software Foundation, Inc.
+   Contributed by Richard Henderson  and
+ Aldy Hernandez .
 
This file is part of GCC.
 
@@ -4106,8 +4108,8 @@ maybe_push_queue (struct cgraph_node *node,
code path.  QUEUE are the basic blocks inside the transaction
represented in REGION.
 
-   Later in split_code_paths() we will add the conditional to choose
-   between the two alternatives.  */
+   Later in the tmmark pass (expand_transaction) we will add the
+   conditional to choose between the two alternatives.  */
 
 static void
 ipa_uninstrument_transaction (struct tm_region *region,
@@ -4192,29 +4194,11 @@ ipa_tm_scan_calls_transaction (struct tm_ipa_cg_data *d,
   bbs = get_tm_region_blocks (r->entry_block, r->exit_blocks, NULL,
  d->transaction_blocks_normal, false);
 
-  // Generate the uninstrumented code path for this transaction.
-  ipa_uninstrument_transaction (r, bbs);
-
   FOR_EACH_VEC_ELT (bbs, i, bb)
ipa_tm_scan_calls_block (callees_p, bb, false);
 
   bbs.release ();
 }
-
-  // ??? copy_bbs should maintain cgraph edges for the blocks as it is
-  // copying them, rather than forcing us to do this externally.
-  rebuild_cgraph_edges ();
-
-  // ??? In ipa_uninstrument_transaction we don't try to update dominators
-  // because copy_bbs doesn't return a VEC like iterate_fix_dominators expects.
-  // Instead, just release dominators here so update_ssa recomputes them.
-  free_dominance_info (CDI_DOMINATORS);
-
-  // When building the uninstrumented code path, copy_bbs will have invoked
-  // create_new_def_for starting an "ssa update context".  There is only one
-  // instance of this context, so resolve ssa updates before moving on to
-  // the next function.
-  update_ssa (TODO_update_ssa);
 }
 
 /* Scan all calls in NODE as if this is the transactional clone,
@@ -4890,10 +4874,11 @@ ipa_tm_create_version_alias (struct cgraph_node *node, void *data)
   return false;
 }
 
-/* Create a copy of the function (possibly declaration only) of OLD_NODE,
-   appropriate for the transactional clone.  */
+/* Create a copy of the function (possibly declaration only) of
+   OLD_NODE, appropriate for the transactional clone.  Returns the
+   cgraph node for the newly created clone.  */
 
-static void
+static struct cgraph_node *
 ipa_tm_create_version (struct cgraph_node *old_node)
 {
   tree new_decl, old_decl, tm_name;
@@ -4947,13 +4932,12 @@ ipa_tm_create_version (struct cgraph_node *old_node)
 ipa_tm_mark_forced_by_abi_node (new_node);
 
   /* Do the same thing, but for any aliases of the origin

PATCH: Put a breakpoint on __sanitizer::Report

2014-01-09 Thread H.J. Lu

Hi,

This patch puts a breakpoint on __sanitizer::Report to help
with debugging sanitizer issues.  OK to install?

Thanks.

-- 
H.J.
--

2014-01-09  H.J. Lu  

* gdbasan.in: Put a breakpoint on __sanitizer::Report.
diff --git a/gcc/gdbasan.in b/gcc/gdbasan.in
index cf05825..3a6fca0 100644
--- a/gcc/gdbasan.in
+++ b/gcc/gdbasan.in
@@ -1,3 +1,7 @@
 # Put a breakpoint on __asan_report_error to help with debugging buffer
 # overflow.
 b __asan_report_error
+
+# Put a breakpoint on __sanitizer::Report to help with debugging sanitizer
+# issues.
+b __sanitizer::Report

[PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 5)

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 02:27:40PM +0100, Richard Biener wrote:
> > Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition
> > we can just drop the lhs always in that case, just doing what we do for
> > __builtin_unreachable if lhs is SSA_NAME:
> >   tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
> >   tree def = get_or_create_ssa_default_def (cfun, var);
> >   gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT);
> 
> That works for me.

So like this?

Bootstrapped/regtested on x86_64-linux and i686-linux, ok?

2014-01-09  Jakub Jelinek  

PR tree-optimization/59622
* gimple-fold.c (gimple_fold_call): Fix a typo in message.  For
__builtin_unreachable replace the OBJ_TYPE_REF call with a call to
__builtin_unreachable and add if needed a setter of the lhs SSA_NAME.
Don't devirtualize for inplace at all.  For targets.length () == 1,
if the call is noreturn and cfun isn't in SSA form yet, clear lhs.

* g++.dg/opt/pr59622-2.C: New test.
* g++.dg/opt/pr59622-3.C: New test.
* g++.dg/opt/pr59622-4.C: New test.
* g++.dg/opt/pr59622-5.C: New test.

--- gcc/gimple-fold.c.jj2014-01-08 17:44:57.690582374 +0100
+++ gcc/gimple-fold.c   2014-01-09 14:34:40.816149806 +0100
@@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator *
  (OBJ_TYPE_REF_EXPR 
(callee)
{
  fprintf (dump_file,
-  "Type inheritnace inconsistent devirtualization of ");
+  "Type inheritance inconsistent devirtualization of ");
  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
  fprintf (dump_file, " to ");
  print_generic_expr (dump_file, callee, TDF_SLIM);
@@ -1177,24 +1177,45 @@ gimple_fold_call (gimple_stmt_iterator *
  gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
  changed = true;
}
-  else if (flag_devirtualize && virtual_method_call_p (callee))
+  else if (flag_devirtualize && !inplace && virtual_method_call_p (callee))
{
  bool final;
  vec targets
= possible_polymorphic_call_targets (callee, &final);
  if (final && targets.length () <= 1)
{
+ tree lhs = gimple_call_lhs (stmt);
  if (targets.length () == 1)
{
  gimple_call_set_fndecl (stmt, targets[0]->decl);
  changed = true;
+ /* If the call becomes noreturn, remove the lhs.  */
+ if (lhs && (gimple_call_flags (stmt) & ECF_NORETURN))
+   {
+ if (TREE_CODE (lhs) == SSA_NAME)
+   {
+ tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
+ tree def = get_or_create_ssa_default_def (cfun, var);
+ gimple new_stmt = gimple_build_assign (lhs, def);
+ gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+   }
+ gimple_call_set_lhs (stmt, NULL_TREE);
+   }
}
- else if (!inplace)
+ else
{
  tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
  gimple new_stmt = gimple_build_call (fndecl, 0);
  gimple_set_location (new_stmt, gimple_location (stmt));
- gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+ if (lhs && TREE_CODE (lhs) == SSA_NAME)
+   {
+ tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
+ tree def = get_or_create_ssa_default_def (cfun, var);
+ gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+ update_call_from_tree (gsi, def);
+   }
+ else
+   gsi_replace (gsi, new_stmt, true);
  return true;
}
}
--- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-09 10:57:46.246694025 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-2.C 2014-01-09 10:57:46.246694025 +0100
@@ -0,0 +1,21 @@
+// PR tree-optimization/59622
+// { dg-do compile }
+// { dg-options "-O2" }
+
+namespace
+{
+  struct A
+  {
+A () {}
+virtual A *bar (int) = 0;
+A *baz (int x) { return bar (x); }
+  };
+}
+
+A *a;
+
+void
+foo ()
+{
+  a->baz (0);
+}
--- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj 2014-01-09 10:57:46.247694040 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-3.C 2014-01-09 10:57:46.247694040 +0100
@@ -0,0 +1,21 @@
+// PR tree-optimization/59622
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct C { int a; int b; };
+
+namespace
+{
+  struct A
+  {
+virtual C foo ();
+C bar () { return foo (); }
+  };
+}
+
+C
+baz ()
+{
+  A a;
+  return a.bar ();
+}
--- gcc/testsuite/g++.dg/opt/pr59

[PATCH] Ignore DECL_ALIGN of SSA_NAME underlying decls for dynamic stack realignment (PR middle-end/47735)

2014-01-09 Thread Jakub Jelinek

Hi!

As discussed in the PR, if a var isn't addressable and has gimple reg type,
I don't see any point in honoring its DECL_ALIGN; we only refer to the
var through SSA_NAME_VAR of SSA_NAMEs, nothing is allocated on the stack
immediately and the SSA_NAMEs are turned into pseudos for which we only care
about their modes and corresponding alignments if they need to be spilled to
stack.
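
As a small illustration of the two situations, here is a sketch assuming a 32-bit unsigned and a compiler that accepts the GNU aligned attribute.  mulh mirrors the new test; addr_taken_is_aligned is a hypothetical contrast case added only for this example.

```c
#include <stdint.h>

/* `l' asks for 32-byte alignment, but its address is never taken and
   it has a register type, so it only ever exists as SSA_NAMEs and
   later as pseudos.  */
unsigned
mulh (unsigned a, unsigned b)
{
  unsigned long long l __attribute__ ((aligned (32)))
    = ((unsigned long long) a * (unsigned long long) b) >> 32;
  return l;
}

/* Contrast case: here the address is taken through a volatile
   pointer, so the variable needs a real stack slot and the requested
   alignment matters.  Returns nonzero if the slot is 32-byte aligned.  */
int
addr_taken_is_aligned (void)
{
  unsigned long long l __attribute__ ((aligned (32))) = 0;
  volatile unsigned long long *p = &l;
  return ((uintptr_t) p % 32) == 0;
}
```

Only the second function gives the compiler a reason to realign the stack; for the first, the type alignment is all that can matter if the pseudo is spilled.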

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-09  Jakub Jelinek  

PR middle-end/47735
* cfgexpand.c (expand_one_var): For SSA_NAMEs, if the underlying
var satisfies use_register_for_decl, just take into account type
alignment, rather than decl alignment.

* gcc.target/i386/pr47735.c: New test.

--- gcc/cfgexpand.c.jj  2014-01-08 19:37:33.630986939 +0100
+++ gcc/cfgexpand.c 2014-01-09 13:38:45.073324129 +0100
@@ -1215,8 +1215,11 @@ expand_one_var (tree var, bool toplevel,
 we conservatively assume it will be on stack even if VAR is
 eventually put into register after RA pass.  For non-automatic
 variables, which won't be on stack, we collect alignment of
-type and ignore user specified alignment.  */
-  if (TREE_STATIC (var) || DECL_EXTERNAL (var))
+type and ignore user specified alignment.  Similarly for
+SSA_NAMEs for which use_register_for_decl returns true.  */
+  if (TREE_STATIC (var)
+ || DECL_EXTERNAL (var)
+ || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))
align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
   TYPE_MODE (TREE_TYPE (var)),
   TYPE_ALIGN (TREE_TYPE (var)));
--- gcc/testsuite/gcc.target/i386/pr47735.c.jj  2014-01-09 13:30:14.410941107 +0100
+++ gcc/testsuite/gcc.target/i386/pr47735.c 2014-01-09 13:28:45.0 +0100
@@ -0,0 +1,16 @@
+/* PR middle-end/47735 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fomit-frame-pointer" } */
+
+unsigned
+mulh (unsigned a, unsigned b)
+{
+  unsigned long long l __attribute__ ((aligned (32)))
+= ((unsigned long long) a * (unsigned long long) b) >> 32;
+  return l;
+}
+
+/* No need to dynamically realign the stack here.  */
+/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
+/* Nor use a frame pointer.  */
+/* { dg-final { scan-assembler-not "%\[re\]bp" } } */

Jakub

Re: [PATCH,rs6000] Add -maltivec={le,be} options

2014-01-09 Thread Bill Schmidt

Thanks for the comments!  Here is a second go-round at the patch with
improved documentation.  I'm happy to change the wording if it can be
further improved.
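
To make the element numbering concrete, here is a minimal sketch of the rule the documentation describes.  lane_from_left and enum elt_order are hypothetical names for illustration only; nothing like them is added by the patch.

```c
/* Hypothetical helper, not part of GCC: map an AltiVec element number
   to its position counted from the left end of the vector register.  */
enum elt_order { ELT_ORDER_BE, ELT_ORDER_LE };

static unsigned
lane_from_left (unsigned elt, unsigned nelts, enum elt_order order)
{
  /* Big-endian element order: element 0 is the leftmost element.
     Little-endian element order: element 0 is the rightmost one.  */
  return order == ELT_ORDER_BE ? elt : nelts - 1 - elt;
}
```

So for a four-element vector, element 0 names position 0 from the left under -maltivec=be and position 3 from the left under -maltivec=le.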

Thanks,
Bill

2014-01-09  Bill Schmidt  

* doc/invoke.texi: Add -maltivec={be,le} options, and document
default element-order behavior for -maltivec.
* config/rs6000/rs6000.opt: Add -maltivec={be,le} options.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure
that -maltivec={le,be} implies -maltivec; disallow -maltivec=le
when targeting big endian, at least for now.
* config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG.


Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 206442)
+++ gcc/doc/invoke.texi (working copy)
@@ -18855,6 +18855,37 @@ the AltiVec instruction set.  You may also need to
 @option{-mabi=altivec} to adjust the current ABI with AltiVec ABI
 enhancements.
 
+When -maltivec is used, rather than -maltivec=le or -maltivec=be, the
+element order for Altivec intrinsics such as vec_splat, vec_extract,
+and vec_insert will match array element order corresponding to the
+endianness of the target.  That is, element zero identifies the
+leftmost element in a vector register when targeting a big-endian
+platform, and identifies the rightmost element in a vector register
+when targeting a little-endian platform.
+
+@item -maltivec=be
+@opindex maltivec=be
+Generate Altivec instructions using big-endian element order,
+regardless of whether the target is big- or little-endian.  This is
+the default when targeting a big-endian platform.
+
+The element order is used to interpret element numbers in Altivec
+intrinsics such as vec_splat, vec_extract, and vec_insert.  By
+default, these will match array element order corresponding to the
+endianness for the target.
+
+@item -maltivec=le
+@opindex maltivec=le
+Generate Altivec instructions using little-endian element order,
+regardless of whether the target is big- or little-endian.  This is
+the default when targeting a little-endian platform.  This option is
+currently ignored when targeting a big-endian platform.
+
+The element order is used to interpret element numbers in Altivec
+intrinsics such as vec_splat, vec_extract, and vec_insert.  By
+default, these will match array element order corresponding to the
+endianness for the target.
+
 @item -mvrsave
 @itemx -mno-vrsave
 @opindex mvrsave
Index: gcc/config/rs6000/rs6000.opt
===
--- gcc/config/rs6000/rs6000.opt(revision 206442)
+++ gcc/config/rs6000/rs6000.opt(working copy)
@@ -140,6 +140,14 @@ maltivec
 Target Report Mask(ALTIVEC) Var(rs6000_isa_flags)
 Use AltiVec instructions
 
+maltivec=le
+Target Report RejectNegative Var(rs6000_altivec_element_order, 1) Save
+Generate Altivec instructions using little-endian element order
+
+maltivec=be
+Target Report RejectNegative Var(rs6000_altivec_element_order, 2)
+Generate Altivec instructions using big-endian element order
+
 mhard-dfp
 Target Report Mask(DFP) Var(rs6000_isa_flags)
 Use decimal floating point instructions
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 206442)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3238,6 +3238,18 @@ rs6000_option_override_internal (bool global_init_
   && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
+  /* -maltivec={le,be} implies -maltivec.  */
+  if (rs6000_altivec_element_order != 0)
+rs6000_isa_flags |= OPTION_MASK_ALTIVEC;
+
+  /* Disallow -maltivec=le in big endian mode for now.  This is not
+ known to be useful for anyone.  */
+  if (BYTES_BIG_ENDIAN && rs6000_altivec_element_order == 1)
+{
+  warning (0, N_("-maltivec=le not allowed for big-endian targets"));
+  rs6000_altivec_element_order = 0;
+}
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 206442)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -468,6 +468,15 @@ extern int rs6000_vector_align[];
? rs6000_vector_align[(MODE)]   \
: (int)GET_MODE_BITSIZE ((MODE)))
 
+/* Determine the element order to use for vector instructions.  By
+   default we use big-endian element order when targeting big-endian,
+   and little-endian element order when targeting little-endian.  For
+   programs being ported from BE Power to LE Power, it can sometimes
+   be useful to use big-endian element order when targeting little-endian.
+   This is set via -maltivec=be, for example.  */
+#define VECTOR_ELT_ORDER_BIG

Re: [PATCH, go]: Skip some go tests

2014-01-09 Thread Ian Lance Taylor

On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak  wrote:
>
> There are two remaining warnings:
>
> go.test/test/nilcheck.go: unrecognized test line: // errorcheck -0 -N -d=nil
> go.test/test/nilptr3.go: unrecognized test line: // errorcheck -0 -d=nil

Thanks, not sure how I missed those.  Those tests are really testing
specific gc compiler behaviour anyhow, so we should just skip them
with gccgo.  This patch does that.  Committed to mainline.

Ian


2014-01-09  Ian Lance Taylor  

* go.test/go-test.exp (go-gc-tests): Skip nilptr tests that test
the other Go compiler.


Index: go.test/go-test.exp
===
--- go.test/go-test.exp (revision 206473)
+++ go.test/go-test.exp (working copy)
@@ -1143,6 +1143,10 @@ proc go-gc-tests { } {
   || $test_line == "// \$G \$D/pkg.go && pack grcS pkg.a pkg.\$A 2> /dev/null && rm pkg.\$A && \$G -I. -u \$D/main.go" } {
# This tests the gc -u option, which gccgo does not
# support.
+   } elseif { $test_line == "// errorcheck -0 -N -d=nil" \
+  || $test_line == "// errorcheck -0 -d=nil" } {
+   # This tests gc nil pointer checks using -d=nil, which
+   # gccgo does not support.
} else {
clone_output "$name: unrecognized test line: $test_line"
unsupported $name

[ping] Re: [patch] Pass -fuse-ld=gold to gccgo on targets supporting -fsplit-stack

2014-01-09 Thread Matthias Klose

ping patch

Am 29.11.2013 14:29, schrieb Matthias Klose:
> to get full advantage of the -fsplit-stack option, gccgo binaries have to be
> linked with gold, not the bfd linker.  When the system linker defaults to the
> bfd linker, then gccgo should explicitly use the gold linker, passing
> fuse-ld=gold, unless another -fuse-ld option is present.  Tested with and
> without having ld.gold on the system.
> 
>   Matthias
>

[C++ testcase, committed] PR 59730

2014-01-09 Thread Paolo Carlini


Hi,

this just adds the testcase to mainline.

Thanks,
Paolo.

//
2014-01-09  Paolo Carlini  

PR c++/59730
* g++.dg/cpp0x/variadic145.C: New.
Index: g++.dg/cpp0x/variadic145.C
===
--- g++.dg/cpp0x/variadic145.C  (revision 0)
+++ g++.dg/cpp0x/variadic145.C  (working copy)
@@ -0,0 +1,13 @@
+// PR c++/59730
+// { dg-do compile { target c++11 } }
+
+template  void declval();
+template  void forward();
+template  class D;
+template 
+class D <_Functor(_Bound_args...)> {
+  template )>
+  void operator()(...) {
+0(forward<_Args>...);
+  }
+};

[Patch] Remove references to non-existent tree-flow.h file

2014-01-09 Thread Steve Ellcey

While looking at PR 59335 (plugin doesn't build) I saw the comments about
tree-flow.h and tree-flow-inline.h not existing anymore.  While these
files have been removed there are still some references to them in
Makefile.in, doc/tree-ssa.texi, and a couple of source files.  This patch
removes the references to these now-nonexistent files.

OK to checkin?

Steve Ellcey
sell...@mips.com


2014-01-09  Steve Ellcey  

* Makefile.in (TREE_FLOW_H): Remove.
(TREE_SSA_H): Add files names from tree-flow.h.
* doc/tree-ssa.texi (Annotations): Remove reference to tree-flow.h
* tree.h: Remove tree-flow.h reference.
* hash-table.h: Remove tree-flow.h reference.
* tree-ssa-loop-niter.c (dump_affine_iv): Replace tree-flow.h
reference with tree-ssa-loop.h.


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 459b1ba..8eb4f68 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -929,11 +929,10 @@ CPP_ID_DATA_H = $(CPPLIB_H) $(srcdir)/../libcpp/include/cpp-id-data.h
 CPP_INTERNAL_H = $(srcdir)/../libcpp/internal.h $(CPP_ID_DATA_H)
 TREE_DUMP_H = tree-dump.h $(SPLAY_TREE_H) $(DUMPFILE_H)
 TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H)
-TREE_FLOW_H = tree-flow.h tree-flow-inline.h tree-ssa-operands.h \
+TREE_SSA_H = tree-ssa.h tree-ssa-operands.h \
$(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \
$(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
tree-ssa-alias.h
-TREE_SSA_H = tree-ssa.h $(TREE_FLOW_H)
 PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H)
 TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H)
 GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H)
diff --git a/gcc/doc/tree-ssa.texi b/gcc/doc/tree-ssa.texi
index 391dba8..e0238bd 100644
--- a/gcc/doc/tree-ssa.texi
+++ b/gcc/doc/tree-ssa.texi
@@ -53,9 +53,6 @@ variable has aliases.  All these attributes are stored in data
 structures called annotations which are then linked to the field
 @code{ann} in @code{struct tree_common}.
 
-Presently, we define annotations for variables (@code{var_ann_t}).
-Annotations are defined and documented in @file{tree-flow.h}.
-
 
 @node SSA Operands
 @section SSA Operands
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 2b04067..034385c 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -1050,10 +1050,7 @@ hash_table ::end ()
 
 /* Iterate through the elements of hash_table HTAB,
using hash_table <>::iterator ITER,
-   storing each element in RESULT, which is of type TYPE.
-
-   This macro has this form for compatibility with the
-   FOR_EACH_HTAB_ELEMENT currently defined in tree-flow.h.  */
+   storing each element in RESULT, which is of type TYPE.  */
 
 #define FOR_EACH_HASH_TABLE_ELEMENT(HTAB, RESULT, TYPE, ITER) \
   for ((ITER) = (HTAB).begin (); \
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 5a10297..7628363 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1311,7 +1311,7 @@ dump_affine_iv (FILE *file, affine_iv *iv)
if EVERY_ITERATION is true, we know the test is executed on every iteration.
 
The results (number of iterations and assumptions as described in
-   comments at struct tree_niter_desc in tree-flow.h) are stored to NITER.
+   comments at struct tree_niter_desc in tree-ssa-loop.h) are stored to NITER.
Returns false if it fails to determine number of iterations, true if it
was determined (possibly with some assumptions).  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index fa79b6f..67454b7 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1114,9 +1114,6 @@ extern void protected_set_expr_location (tree, location_t);
the given label expression.  */
 #define LABEL_EXPR_LABEL(NODE)  TREE_OPERAND (LABEL_EXPR_CHECK (NODE), 0)
 
-/* VDEF_EXPR accessors are specified in tree-flow.h, along with the other
-   accessors for SSA operands.  */
-
 /* CATCH_EXPR accessors.  */
 #define CATCH_TYPES(NODE)  TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 0)
 #define CATCH_BODY(NODE)   TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 1)

[PATCH][ARM] Get mode for rtx costs calculations for SET RTX from destination reg

2014-01-09 Thread Kyrill Tkachov


Hi all,

SET RTXs don't have a mode, so the code to calculate a reg-to-reg set in the arm 
rtx costs function needs to get the mode from one of the registers involved. We 
already did that when the source is a CONST_INT.


This patch fixes that oversight and also prevents us from falling through or 
recursing, since the cost calculated for (set (reg) (reg)) should be final at 
that point.
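
A minimal sketch of the idea, using toy stand-ins for the rtx machinery.  Every name below is hypothetical; only the rule that a SET carries no mode of its own and must borrow the destination's is taken from the patch.

```c
/* Toy stand-ins for GCC's rtx types; every name is hypothetical.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_SET };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;          /* TOY_VOIDmode for SET and CONST_INT */
  const struct toy_rtx *dest;  /* used by TOY_SET only */
  const struct toy_rtx *src;   /* used by TOY_SET only */
};

/* Mode to use when costing X: a SET has no mode of its own, so take
   the destination's mode instead.  */
static enum toy_mode
toy_cost_mode (const struct toy_rtx *x)
{
  if (x->code == TOY_SET)
    return x->dest->mode;
  return x->mode;
}
```

Costing a DImode register copy with the SET's own (void) mode would lose the fact that two core registers are involved; taking the mode from the destination up front avoids that for both the reg-to-reg and the CONST_INT case.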


Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

2014-01-09  Kyrylo Tkachov  

* config/arm/arm.c (arm_new_rtx_costs): Use destination mode
when handling a SET rtx.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c8bf7c1..4c991c2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9092,6 +9092,9 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 {
 case SET:
   *cost = 0;
+  /* SET RTXs don't have a mode so we get it from the destination.  */
+  mode = GET_MODE (SET_DEST (x));
+
   if (REG_P (SET_SRC (x))
 	  && REG_P (SET_DEST (x)))
 	{
@@ -9106,6 +9109,8 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	 in 16 bits in Thumb mode.  */
 	  if (!speed_p && TARGET_THUMB && outer_code == COND_EXEC)
 	*cost >>= 1;
+
+	  return true;
 	}
 
   if (CONST_INT_P (SET_SRC (x)))
@@ -9113,7 +9118,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  /* Handle CONST_INT here, since the value doesn't have a mode
 	 and we would otherwise be unable to work out the true cost.  */
 	  *cost = rtx_cost (SET_DEST (x), SET, 0, speed_p);
-	  mode = GET_MODE (SET_DEST (x));
 	  outer_code = SET;
 	  /* Slightly lower the cost of setting a core reg to a constant.
 	 This helps break up chains and allows for better scheduling.  */

[PATCH][ARM] Add CRC32 to the feature flags of Cortex-A53, A57

2014-01-09 Thread Kyrill Tkachov


Hi all,

The Cortex-A53 and Cortex-A57 processors support the CRC32 extensions to 
ARMv8-a, so we specify that in their definitions in arm-cores.def.
This also updates their big.LITTLE amalgamation and removes the redundant 
FL_THUMB_DIV and FL_ARM_DIV there since ARMv8-a already implies those flags.
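
A small sketch of why the two division flags are redundant, assuming a core's effective flags are the OR of its own entry and the flags its architecture implies.  The TOY_* values and toy_core_flags are hypothetical and do not match the real FL_* constants.

```c
/* Hypothetical flag values; they do not match the real FL_* bits.  */
#define TOY_FL_LDSCHED   (1u << 0)
#define TOY_FL_THUMB_DIV (1u << 1)
#define TOY_FL_ARM_DIV   (1u << 2)
#define TOY_FL_CRC32     (1u << 3)

/* Assumed for this sketch: flags contributed by the architecture.  */
#define TOY_FL_FOR_ARCH8A (TOY_FL_THUMB_DIV | TOY_FL_ARM_DIV)

/* Effective flags for an ARMv8-a core with the given core entry.  */
static unsigned
toy_core_flags (unsigned core_specific)
{
  return TOY_FL_FOR_ARCH8A | core_specific;
}
```

Under that assumption, listing the division flags in the core entry changes nothing, whereas the CRC32 flag has to be named explicitly because the architecture does not supply it here.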


Tested arm-none-eabi on a model.

Ok for trunk?

2014-01-09  Kyrylo Tkachov  

* config/arm/arm-cores.def (cortex-a53): Specify FL_CRC32.
(cortex-a57): Likewise.
(cortex-a57.cortex-a53): Likewise. Remove redundant flags.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d961e25..1e97273 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -152,8 +152,8 @@ ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4,   7A,  FL_LDSCHED, 9e)
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,  7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
 
 /* V8 Architecture Processors */
-ARM_CORE("cortex-a53", cortexa53, cortexa53,   8A, FL_LDSCHED, cortex_a53)
-ARM_CORE("cortex-a57", cortexa57, cortexa15,   8A, FL_LDSCHED, cortex_a15)
+ARM_CORE("cortex-a53", cortexa53, cortexa53,   8A, FL_LDSCHED | FL_CRC32, cortex_a53)
+ARM_CORE("cortex-a57", cortexa57, cortexa15,   8A, FL_LDSCHED | FL_CRC32, cortex_a15)
 
 /* V8 big.LITTLE implementations */
-ARM_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
+ARM_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A,  FL_LDSCHED | FL_CRC32, cortex_a15)

[PATCH][ARM] Fix arm_init_iwmmxt_builtins to handle only iwmmxt entries

2014-01-09 Thread Kyrill Tkachov


Hi all,

After my CRC32 intrinsics patch that added new entries into the bdesc_2arg 
table, the arm_init_iwmmxt_builtins function tries to iterate over them and 
blows up, causing an ICE when trying to compile with -mcpu=iwmmxt.


This patch fixes that by ignoring the non-iwmmxt entries in that table when 
initialising the iwmmxt builtins.
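
The filter can be sketched in isolation as follows.  The table layout, the TOY_* masks and toy_count_iwmmxt are hypothetical miniatures; only the skip condition follows the patch.

```c
#include <stddef.h>

/* Hypothetical miniature of a builtin description table.  */
#define TOY_FL_IWMMXT  (1u << 0)
#define TOY_FL_IWMMXT2 (1u << 1)
#define TOY_FL_CRC32   (1u << 2)

struct toy_builtin
{
  unsigned mask;     /* feature the builtin belongs to */
  const char *name;  /* NULL for unnamed entries */
};

/* Count the entries the iwmmxt initialiser should process: named
   entries whose mask is exactly the iwmmxt or iwmmxt2 feature.  */
static size_t
toy_count_iwmmxt (const struct toy_builtin *tab, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_builtin *d = &tab[i];
      if (d->name == NULL
          || !(d->mask == TOY_FL_IWMMXT || d->mask == TOY_FL_IWMMXT2))
        continue;
      count++;
    }
  return count;
}
```

With a table that mixes iwmmxt and CRC32 entries, only the iwmmxt ones reach the code that looks up operand modes, so the unrelated entries can no longer trip it.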



With this patch the gcc.target/arm/mmx-2.c comes back and PASSes.

Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill


2014-01-09  Kyrylo Tkachov  

* config/arm/arm.c (arm_init_iwmmxt_builtins): Skip
non-iwmmxt builtins.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c8bf7c1..842d67f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -24244,7 +24244,7 @@ arm_init_iwmmxt_builtins (void)
   enum machine_mode mode;
   tree type;
 
-  if (d->name == 0)
+  if (d->name == 0 || !(d->mask == FL_IWMMXT || d->mask == FL_IWMMXT2))
 	continue;
 
   mode = insn_data[d->icode].operand[1].mode;

Re: [PATCH, go]: Skip some go tests

2014-01-09 Thread Uros Bizjak

On Thu, Jan 9, 2014 at 4:01 PM, Ian Lance Taylor  wrote:
> On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak  wrote:
>>
>> 2014-01-09  Uros Bizjak  
>>
>> * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems
>> which don't support -fsplit-stack.  Skip rotate[0123]-out.go.
>
> This is OK.  Thanks.
>
> You might want to tweak the comment just under where you added
> "peano.go".  Then go ahead and commit.

Actually, we don't even have to compile/execute the generator file, and
the included rotate.go is skipped due to "// skip" in its test line.

Attached patch was committed to mainline after re-test on x86_64-pc-linux-gnu.

Uros.
Index: go.test/go-test.exp
===
--- go.test/go-test.exp (revision 206468)
+++ go.test/go-test.exp (working copy)
@@ -400,17 +400,16 @@
}
 
if { ( [file tail $test] == "select2.go" \
-  || [file tail $test] == "stack.go" ) \
+  || [file tail $test] == "stack.go" \
+  || [file tail $test] == "peano.go" ) \
 && ! [check_effective_target_split_stack] } {
-   # chan/select2.go fails on targets without split stack,
-   # because they allocate a large stack segment that blows
-   # out the memory calculations.
+   # These tests fails on targets without split stack.
untested $name
continue
}
 
-   if { [file tail $test] == "rotate.go" } {
-   # This test produces a temporary file that takes too long
+   if [string match "*go.test/test/rotate\[0123\].go" $test] {
+   # These tests produces a temporary file that takes too long
# to compile--5 minutes on my laptop without optimization.
# When compiling without optimization it tests nothing
# useful, since the point of the test is to see whether

[PATCH][testsuite][ARM] Properly figure -mfloat-abi option for crypto tests

2014-01-09 Thread Kyrill Tkachov


Hi all,

When adding the testsuite options for the crypto tests we need to make sure that 
we don't end up adding -mfloat-abi=softfp to a hard-float target like 
arm-none-linux-gnueabihf. This patch adds code to figure out which 
-mfpu/-mfloat-abi combination to use, in a similar approach to the NEON tests.
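
The selection logic amounts to trying candidate option sets in order and keeping the first one that works.  The actual change is Tcl in target-supports.exp; the C below is only a model of that loop, with a stand-in probe in place of a real compile check, and all toy_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Candidate option sets, least intrusive first, in the order the
   patch tries them.  */
static const char *const toy_candidates[] = {
  "",
  "-mfloat-abi=softfp",
  "-mfpu=crypto-neon-fp-armv8",
  "-mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp",
};

/* Return the first candidate accepted by PROBE, or NULL if none is.  */
static const char *
toy_pick_flags (int (*probe) (const char *flags))
{
  size_t n = sizeof toy_candidates / sizeof toy_candidates[0];
  for (size_t i = 0; i < n; i++)
    if (probe (toy_candidates[i]))
      return toy_candidates[i];
  return NULL;
}

/* Stand-in probe modelling a hard-float toolchain: it needs the
   crypto FPU selected and rejects -mfloat-abi=softfp.  */
static int
toy_hardfloat_probe (const char *flags)
{
  return strstr (flags, "-mfpu=crypto-neon-fp-armv8") != NULL
         && strstr (flags, "-mfloat-abi=softfp") == NULL;
}
```

For the modelled hard-float target the loop settles on the -mfpu option alone and never adds -mfloat-abi=softfp, which is the behaviour wanted on arm-none-linux-gnueabihf.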


This patch addresses the same failures that Christophe mentioned in 
http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00375.html
but with this patch we can get those tests to PASS on arm-none-linux-gnueabihf 
instead of being just UNSUPPORTED.


Tested arm-none-linux-gnueabihf and arm-none-eabi.

Ok for trunk?

Thanks,
Kyrill


2014-01-09  Kyrylo Tkachov  

* lib/target-supports.exp
(check_effective_target_arm_crypto_ok_nocache): New.
(check_effective_target_arm_crypto_ok): Use above procedure.
(add_options_for_arm_crypto): Use et_arm_crypto_flags.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5166679..f1f4024 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2301,19 +2301,37 @@ proc check_effective_target_arm_unaligned { } {
 }
 
 # Return 1 if this is an ARM target supporting -mfpu=crypto-neon-fp-armv8
-# -mfloat-abi=softfp.
-proc check_effective_target_arm_crypto_ok {} {
+# -mfloat-abi=softfp or equivalent options.  Some multilibs may be
+# incompatible with these options.  Also set et_arm_crypto_flags to the
+# best options to add.
+
+proc check_effective_target_arm_crypto_ok_nocache { } {
+global et_arm_crypto_flags
+set et_arm_crypto_flags ""
 if { [check_effective_target_arm32] } {
-	return [check_no_compiler_messages arm_crypto_ok object {
-	  int foo (void)
-	  {
-	 __asm__ volatile ("aese.8 q0, q0");
-	 return 0;
-	  }
-	} "-mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp"]
-} else {
-	return 0
+	foreach flags {"" "-mfloat-abi=softfp" "-mfpu=crypto-neon-fp-armv8" "-mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp"} {
+	if { [check_no_compiler_messages_nocache arm_crypto_ok object {
+		#include "arm_neon.h"
+		uint8x16_t
+		foo (uint8x16_t a, uint8x16_t b)
+		{
+	  return vaeseq_u8 (a, b);
+		}
+	} "$flags"] } {
+		set et_arm_crypto_flags $flags
+		return 1
+	}
+	}
 }
+
+return 0
+}
+
+# Return 1 if this is an ARM target supporting -mfpu=crypto-neon-fp-armv8
+
+proc check_effective_target_arm_crypto_ok { } {
+return [check_cached_effective_target arm_crypto_ok \
+		check_effective_target_arm_crypto_ok_nocache]
 }
 
 # Add options for crypto extensions.
@@ -2321,7 +2339,8 @@ proc add_options_for_arm_crypto { flags } {
 if { ! [check_effective_target_arm_crypto_ok] } {
 return "$flags"
 }
-return "$flags -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp"
+global et_arm_crypto_flags
+return "$flags $et_arm_crypto_flags"
 }
 
 # Add the options needed for NEON.  We need either -mfloat-abi=softfp

Re: [Patch,testsuite] Fix testcases that use bind_pic_locally

2014-01-09 Thread Vidya Praveen

On Wed, Jan 08, 2014 at 12:28:56PM +, Jakub Jelinek wrote:
> On Wed, Jan 08, 2014 at 11:49:08AM +, Vidya Praveen wrote:
> > On Tue, Jan 07, 2014 at 09:35:54PM +, Mike Stump wrote:
> > > On Dec 17, 2013, at 6:06 AM, Vidya Praveen  wrote:
> > > > bind_pic_locally is broken for targets that don't pass -fPIC/-fpic by
> > > > default [1][2].
> > > 
> > > Let's give Jakub 2 days to weigh in?  If no objections, Ok, though, do 
> > > see about adding documentation for it.  
> > 
> > Sure. I didn't respin the patch with documentation since I wanted to know
> > if the solution is acceptable. If this patch is OK, I'll respin with the
> > documentation for bind_pic_locally_ok. 
> > 
> > > I kinda would like a simpler interface for these two, but... that can be 
> > > follow on work, if someone has a bright idea and some time to implement 
> > > it.
> > > 
> > 
> > Could you explain what you mean by simpler interface here? 
> 
> The simpler interface, as I said earlier, would be just to make sure
> /* { dg-add-options bind_pic_locally } */
> does the right thing, I really don't believe you've tried hard enough.
> 
> It is true dejagnu's default_target_compile has:
> if {[board_info $dest exists multilib_flags]} {
> append add_flags " [board_info $dest multilib_flags]"
> }
> last (before just adding -o $destfile; is multilib_flags where the
> -fpic/-fPIC comes in, right?), but if say dg-add-options bind_pic_locally
> adds the necessary options not to dg-extra-tools-flags, but to some
> other variable and say gcc_target_compile (and g++_target_compile)
> around the [target_compile ...] invocation e.g. temporarily append
> that other variable (if not empty) to board_info's multilib_flags
> and afterwards remove it, I don't see why it wouldn't work.
> Tcl is quite flexible in this.

Thanks Jakub. I seem to have misunderstood your earlier email. I was able to do
this and it works fine. I'll test and post the patch.

VP.

[PATCH, committed] Fix for PR 59094

2014-01-09 Thread Iyer, Balaji V

Hello Everyone,
The following patch will fix the bug in PR 59094. The main issue was 
that version specific libraries are not stored in the correct location. The 
patch below should fix that. It is committed since the person who filed the bug 
has confirmed that the fix works.

Index: libcilkrts/Makefile.in
===
--- libcilkrts/Makefile.in  (revision 206468)
+++ libcilkrts/Makefile.in  (working copy)
@@ -401,7 +401,8 @@
-no-undefined

 # C/C++ header files for Cilk.
-cilkincludedir = $(includedir)/cilk
+# cilkincludedir = $(includedir)/cilk
+cilkincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/cilk
 cilkinclude_HEADERS = \
   include/cilk/cilk_api.h  \
   include/cilk/cilk_api_linux.h\
Index: libcilkrts/ChangeLog
===
--- libcilkrts/ChangeLog(revision 206468)
+++ libcilkrts/ChangeLog(working copy)
@@ -1,3 +1,10 @@
+2014-01-09  Balaji V. Iyer  
+
+   bootstrap/59094
+   * Makefile.am (cilkincludedir): Fixed a bug to store version-specific
+   runtime libraries in the correct place.
+   * Makefile.in: Regenerate.
+
 2013-12-13  Balaji V. Iyer  

* Makefile.am (GENERAL_FLAGS): Removed undefining of Cilk keywords.
Index: libcilkrts/Makefile.am
===
--- libcilkrts/Makefile.am  (revision 206468)
+++ libcilkrts/Makefile.am  (working copy)
@@ -108,7 +108,8 @@
 libcilkrts_la_LDFLAGS += -no-undefined

 # C/C++ header files for Cilk.
-cilkincludedir = $(includedir)/cilk
+# cilkincludedir = $(includedir)/cilk
+cilkincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/cilk
 cilkinclude_HEADERS =  \
   include/cilk/cilk_api.h  \
   include/cilk/cilk_api_linux.h\

Thanks,

Balaji V. Iyer.

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-09 Thread Richard Henderson

On 01/09/2014 08:35 AM, Jakub Jelinek wrote:
> That would be fine for 1), but would mean 2).  It is also fine to GC
> allocate each structure individually, but some (like bb_reorder) are say
> just 4 bytes long, so it might be overkill.

Hmm..  Perhaps define the whole structure as you do, but somewhere global
enough that ggc-page.c can see it, and add to the extra_order_size_table?
I don't know how much memory wastage there would be there, but I can't imagine
it's as much as 0.5MB.

r~

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 08:25:31AM -0800, Richard Henderson wrote:
> > +  p = (struct target_globals_extra *)
> > +  ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> > +  PASS_MEM_STAT);
> > +  g = (struct target_globals *) p;
> > +  g->flag_state = &p->flag_state;
> > +  g->regs = &p->regs;
> >g->rtl = ggc_alloc_cleared_target_rtl ();
> 
> So, we're relying on something pointing to G, thus keeping the whole P alive?
> I suppose that works but it's fairly ugly that's for sure.

The separate structures aren't really installed individually, they are
always installed together through restore_target_globals.  As long as any
FUNCTION_DECL with such a TARGET_OPTION_NODE exists, it will be reachable.
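
A minimal sketch of the layout trick under discussion, in plain C++ with calloc standing in for the cleared GC allocation. The names globals_header, flag_state, regs, globals_block and alloc_globals are hypothetical simplifications, not the GCC definitions. The header is the first member of one larger block and its pointers refer to siblings inside that same block, so anything that keeps the header reachable keeps the whole block alive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for two of the target_* structures.
struct flag_state { int flags[4]; };
struct regs { int hard_regs[16]; };

// Header that the rest of the program would see (plays the role of
// target_globals in the thread).
struct globals_header {
  flag_state *flag_state_p;
  regs *regs_p;
};

// One block holding the header followed by the pointed-to structures.
struct globals_block {
  globals_header g;  // first member, so block and header share an address
  flag_state fs;
  regs r;
};

// Allocate the whole block at once (calloc stands in for a cleared GC
// allocation) and point the header fields into the same block.
globals_header *alloc_globals() {
  globals_block *p =
      static_cast<globals_block *>(std::calloc(1, sizeof(globals_block)));
  if (!p)
    return nullptr;
  globals_header *g = &p->g;
  g->flag_state_p = &p->fs;
  g->regs_p = &p->r;
  return g;
}

// True if PTR points inside the block that starts at G.
bool points_into_block(const globals_header *g, const void *ptr) {
  const char *base = reinterpret_cast<const char *>(g);
  const char *q = static_cast<const char *>(ptr);
  return q >= base && q < base + sizeof(globals_block);
}

// Releasing the header releases the whole block, since they share an address.
void free_globals(globals_header *g) { std::free(g); }
```

With a heap allocator the block has to be freed explicitly, which is the leak concern raised as point 2) below; with a GC allocation the block goes away once nothing points at the header.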

The reason why it needs to be GC is:
1) in two of these target_* structures there are embedded rtxes etc. the GC
needs to see
2) if all FUNCTION_DECL with such combination of target attributes are GCed,
we'd leak memory

> As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to
> do something sensible with these high-order allocations, or we can do nearly
> this same trick only with libc.  I.e.
> 
>   struct target_globals_extra {
> struct target_flag_state flag_state;
> struct target_regs regs;
> struct target_hard_regs hard_regs;
> struct target_reload reload;
> struct target_expmed expmed;
> struct target_optabs optabs;
> struct target_cfgloop cfgloop;
> struct target_ira ira;
> struct target_ira_int ira_int;
> struct target_lra_int lra_int;
> struct target_builtins builtins;
> struct target_gcse gcse;
> struct target_bb_reorder bb_reorder;
> struct target_lower_subreg lower_subreg;
>   } *p;
> 
>   g = ggc_alloc_target_globals ();
>   p = XCNEW (target_globals_extra);
>   ...

That would be fine for 1), but would mean 2).  It is also fine to GC
allocate each structure individually, but some (like bb_reorder) are say
just 4 bytes long, so it might be overkill.

As noted by Richard S., IRA/LRA still puts pointers to heap allocated
objects into some of the structures, so there is some leak anyway, but not
as big.

Jakub

[PING^2] [PATCH]SIMD-Enabled functions for C++

2014-01-09 Thread Iyer, Balaji V

Hello Jakub,
Did you get a chance to look at this patch 
(http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00116.html)? I think I have made 
all the changes you requested. Is it OK for trunk?

Thanks,

Balaji V. Iyer.

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-09 Thread Richard Henderson

On 01/08/2014 10:34 AM, Jakub Jelinek wrote:
>struct target_globals *g;
> -
> -  g = ggc_alloc_target_globals ();
> -  g->flag_state = XCNEW (struct target_flag_state);
> -  g->regs = XCNEW (struct target_regs);
> +  struct target_globals_extra {
> +struct target_globals g;
> +struct target_flag_state flag_state;
> +struct target_regs regs;
> +struct target_hard_regs hard_regs;
> +struct target_reload reload;
> +struct target_expmed expmed;
> +struct target_optabs optabs;
> +struct target_cfgloop cfgloop;
> +struct target_ira ira;
> +struct target_ira_int ira_int;
> +struct target_lra_int lra_int;
> +struct target_builtins builtins;
> +struct target_gcse gcse;
> +struct target_bb_reorder bb_reorder;
> +struct target_lower_subreg lower_subreg;
> +  } *p;
> +  p = (struct target_globals_extra *)
> +  ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> +PASS_MEM_STAT);
> +  g = (struct target_globals *) p;
> +  g->flag_state = &p->flag_state;
> +  g->regs = &p->regs;
>g->rtl = ggc_alloc_cleared_target_rtl ();

So, we're relying on something pointing to G, thus keeping the whole P alive?
I suppose that works but it's fairly ugly that's for sure.

As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to
do something sensible with these high-order allocations, or we can do nearly
this same trick only with libc.  I.e.

  struct target_globals_extra {
struct target_flag_state flag_state;
struct target_regs regs;
struct target_hard_regs hard_regs;
struct target_reload reload;
struct target_expmed expmed;
struct target_optabs optabs;
struct target_cfgloop cfgloop;
struct target_ira ira;
struct target_ira_int ira_int;
struct target_lra_int lra_int;
struct target_builtins builtins;
struct target_gcse gcse;
struct target_bb_reorder bb_reorder;
struct target_lower_subreg lower_subreg;
  } *p;

  g = ggc_alloc_target_globals ();
  p = XCNEW (target_globals_extra);
  ...


r~

Re: a patch prototype for PR59535 (THUMB code size regression)

2014-01-09 Thread Vladimir Makarov


On 1/9/2014, 10:30 AM, Richard Earnshaw wrote:

> On 09/01/14 15:21, Vladimir Makarov wrote:
>> Hi, Richard.
>>
>> This week I've been working on THUMB code size issues.  Here is the
>> prototype of the patch for spilling into HI_REGS instead of memory.
>> The patch decreases number of generated insns and makes the code faster
>> as it removes a lot of loads/stores.
>>
>> I am sending the patch for your evaluation and for getting your
>> opinion.  If you like the code size results, I could create the real
>> patch next week (the patch here will not work correctly when a user
>> defines fixed registers by himself).
>>
>> Thanks in advance, Vlad.
>
> Do you need to take into account HARD_REGNO_NREGS (mode) when doing the
> limit check?

In this patch only SImode is permitted.  The hooks also will be 
different in the final version of the patch.
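
A small sketch of what a limit check that accounts for multi-register values could look like. This is not the patch's code: spill_hard_regno, kFirstHiRegnum and kLastSpillRegnum are hypothetical names, the value 8 for the first high register is assumed for illustration, and 12 is taken from the limit used in the prototype. The prototype only handles SImode, which corresponds to nregs == 1 here.

```cpp
#include <cassert>

// Assumed values for illustration only.
const int kFirstHiRegnum = 8;
const int kLastSpillRegnum = 12;

// Return the first hard register of the N-th spill candidate for a value
// that occupies NREGS consecutive hard registers, or -1 if the value would
// run past the last usable spill register.
int spill_hard_regno(int n, int nregs) {
  if (n < 0 || nregs < 1)
    return -1;
  int first = kFirstHiRegnum + n;
  int last = first + nregs - 1;  // last register the value would occupy
  return last > kLastSpillRegnum ? -1 : first;
}
```

With nregs == 1 this reduces to the same comparison as the prototype; with a two-register value, starting at register 12 is rejected because the second half would fall outside the range.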

Fix ipa-devirt ICE on virtual inheritance

2014-01-09 Thread Jan Hubicka

Hi,
this patch fixes an IPA-devirt testcase that has given me bad sleep for months.  The
problem turned out to be a combination of three issues (which greatly confused
me).  This patch fixes the first two.  The representation of BINFOs for multiple
inheritance actually differs from my mental model.  Consider a diamond-shaped graph:

    A
   / \
  B   C
   \ /
    D

Here A is a common virtual base of B and C.  I assumed that there would be two
binfos representing A, linked from the binfos representing B and C, both pointing to
the virtual table of A.  This did not work, so I then assumed that there is one shared
binfo.  In reality, however, we have two binfos, but only the first one has a vtable
associated.
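
For reference, a language-level sketch of the diamond described above. It only shows the C++ semantics (one shared A subobject reached over two paths); it does not model BINFOs, which are GCC-internal. The member names and the helper functions paths_share_base and call_through_base are made up for illustration.

```cpp
#include <cassert>

struct A {
  int a = 0;
  virtual int foo() { return a; }
  virtual ~A() = default;
};
struct B : virtual A { int b = 1; };
struct C : virtual A { int c = 2; };
struct D : B, C {
  int d = 3;
  int foo() override { return 42; }
};

// With virtual inheritance, the path over B and the path over C reach the
// same A subobject.
bool paths_share_base(D &obj) {
  A *via_b = static_cast<A *>(static_cast<B *>(&obj));
  A *via_c = static_cast<A *>(static_cast<C *>(&obj));
  return via_b == via_c;
}

// A call through the shared base, reached over C, still dispatches to the
// override in the most derived class.
int call_through_base(D &obj) {
  A *base = static_cast<C *>(&obj);
  return base->foo();
}
```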

The second issue, also addressed in this patch, is the lookup of the corresponding vtable.
I copied code from get_binfo_at_offset that dives into the structure and
tracks the last base that has a non-zero offset.  This is because vtables of bases
starting at the same offset are shared.

This, however, does not work with multiple inheritance.  A may have the same
offset 0, while we may reach it over C, which has a non-zero offset.
In this case we really want D's vtable instead of C's.

So instead of tracking one vtable I now maintain a stack where one
can look up the corresponding base.  An alternative is to mimic what
get_binfo_at_offset does by walking fields instead of bases.
Here the walk would bypass B/C and get directly to A, but
then I would have difficulty looking up A's binfo.

The final alternative is to use BINFO_INHERITANCE_CHAIN the same way as the C++ FE,
but we do not stream it and I would like to avoid using it unless really
necessary.

Bootstrapped/regtested ppc64-linux.

Honza

PR ipa/58252
PR ipa/59226
* ipa-devirt.c (record_target_from_binfo): Take as argument
stack of binfos and lookup matching one for virtual inheritance.
(possible_polymorphic_call_targets_1): Update.

* g++.dg/ipa/devirt-20.C: New testcase. 
* g++.dg/torture/pr58252.C: Likewise.
* g++.dg/torture/pr59226.C: Likewise.

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 206362)
+++ ipa-devirt.c(working copy)
@@ -614,10 +614,8 @@ maybe_record_node (vec  &
This match what get_binfo_at_offset does, but with offset
being unknown.
 
-   TYPE_BINFO is binfo holding an virtual table matching
-   BINFO's type.  In the case of single inheritance, this
-   is binfo of BINFO's type ancestor (vtable is shared),
-   otherwise it is binfo of BINFO's type.
+   TYPE_BINFOS is a stack of BINFOS of types with defined
+   virtual table seen on way from class type to BINFO.
 
MATCHED_VTABLES tracks virtual tables we already did lookup
for virtual function in. INSERTED tracks nodes we already
@@ -630,7 +628,7 @@ static void
 record_target_from_binfo (vec  &nodes,
  tree binfo,
  tree otr_type,
- tree type_binfo,
+ vec  &type_binfos,
  HOST_WIDE_INT otr_token,
  tree outer_type,
  HOST_WIDE_INT offset,
@@ -642,10 +640,32 @@ record_target_from_binfo (vec = 0; i--)
+   if (BINFO_OFFSET (type_binfos[i]) == BINFO_OFFSET (binfo))
+ {
+   type_binfo = type_binfos[i];
+   break;
+ }
+  if (BINFO_VTABLE (binfo))
+   type_binfos.pop ();
+  /* If this is duplicated BINFO for base shared by virtual inheritance,
+we may not have its associated vtable.  This is not a problem, since
+we will walk it on the other path.  */
+  if (!type_binfo)
+   {
+ gcc_assert (BINFO_VIRTUAL_P (binfo));
+ return;
+   }
   tree inner_binfo = get_binfo_at_offset (type_binfo,
  offset, otr_type);
   /* For types in anonymous namespace first check if the respective vtable
@@ -676,12 +696,11 @@ record_target_from_binfo (vec type);
   unsigned int i;
+  vec  type_binfos = vNULL;
 
-  record_target_from_binfo (nodes, binfo, otr_type, binfo, otr_token,
+  record_target_from_binfo (nodes, binfo, otr_type, type_binfos, otr_token,
outer_type, offset,
inserted, matched_vtables,
type->anonymous_namespace);
+  type_binfos.release ();
   for (i = 0; i < type->derived_types.length (); i++)
 possible_polymorphic_call_targets_1 (nodes, inserted, 
 matched_vtables,
Index: testsuite/g++.dg/ipa/devirt-20.C
===
--- testsuite/g++.dg/ipa/devirt-20.C(revision 0)
+++ testsuite/g++.dg/ipa/devirt-20.C(working copy)
@@ -0,0 +1,31 @@
+#include 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-release_ssa"  } */
+namespace {
+struct A
+{ int a; virtual int foo() {return a;} void bar() {a=7;} };
+struct B
+{ int b; virtual int foo2() {r

Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)

2014-01-09 Thread Richard Henderson

On 01/08/2014 04:45 AM, Jakub Jelinek wrote:
> 2014-01-07  Jakub Jelinek  
> 
>   PR target/58115
>   * tree-core.h (struct target_globals): New forward declaration.
>   (struct tree_target_option): Add globals field.
>   * tree.h (TREE_TARGET_GLOBALS): Define.
>   (prepare_target_option_nodes_for_pch): New prototype.
>   * target-globals.h (struct target_globals): Define even if
>   !SWITCHABLE_TARGET.
>   * tree.c (prepare_target_option_node_for_pch,
>   prepare_target_option_nodes_for_pch): New functions.
>   * config/i386/i386.h (SWITCHABLE_TARGET): Define.
>   * config/i386/i386.c: Include target-globals.h.
>   (ix86_set_current_function): Instead of doing target_reinit
>   unconditionally, use save_target_globals_default_opts and
>   restore_target_globals.
> c-family/
>   * c-pch.c (c_common_write_pch): Call
>   prepare_target_option_nodes_for_pch.

Ok.


r~

Re: [PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types

2014-01-09 Thread Richard Henderson

On 01/08/2014 12:39 PM, Joseph S. Myers wrote:
> On Wed, 8 Jan 2014, Richard Henderson wrote:
> 
>> diff --git a/libgcc/soft-fp/soft-fp.h b/libgcc/soft-fp/soft-fp.h
>> index 696fc86..b54b1ed 100644
>> --- a/libgcc/soft-fp/soft-fp.h
>> +++ b/libgcc/soft-fp/soft-fp.h
>> @@ -237,6 +237,11 @@ typedef int DItype __attribute__ ((mode (DI)));
>>  typedef unsigned int UQItype __attribute__ ((mode (QI)));
>>  typedef unsigned int USItype __attribute__ ((mode (SI)));
>>  typedef unsigned int UDItype __attribute__ ((mode (DI)));
>> +#if _FP_W_TYPE_SIZE == 64
>> +typedef int TItype __attribute__ ((mode (TI)));
>> +typedef unsigned int UTItype __attribute__ ((mode (TI)));
>> +#endif
> 
> This isn't the right conditional.  _FP_W_TYPE_SIZE is ultimately an 
> optimization choice and need not be related to whether any TImode 
> functions are being defined using soft-fp, or whether TImode is supported 
> at all.  I think the most you can do is have sfp-machine.h define a macro 
> to say that TImode should be supported in soft-fp, rather than actually 
> defining the types itself.

The documentation for longlong.h says we must have a double-word type defined.
Given how easy it is to support a double-word type...

> 
> (If someone were to use soft-fp on hppa64, then they might well use 
> _FP_W_TYPE_SIZE == 64, but hppa64 doesn't support TImode.)
> 

... I can't imagine that this is anything but a bug.  Not that anyone seems to
be doing any hppa work at all these past years.
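
To make the suggestion quoted above concrete, here is a hedged sketch of an opt-in macro that gates the TImode typedefs, instead of keying them off the word size. SFP_HAVE_TIMODE is a hypothetical name, not something soft-fp defines, and deriving it from __SIZEOF_INT128__ is only an assumption so that the fragment compiles on its own; in the proposal the macro would come from the target's sfp-machine.h.

```cpp
#include <cassert>

// Hypothetical opt-in: a target header would define this when TImode is
// supported.  Here it is derived from the compiler so the sketch stands alone.
#if defined(__SIZEOF_INT128__)
#define SFP_HAVE_TIMODE 1
#endif

#ifdef SFP_HAVE_TIMODE
// Same typedefs as in the quoted hunk, but guarded by the opt-in macro.
typedef int TItype __attribute__ ((mode (TI)));
typedef unsigned int UTItype __attribute__ ((mode (TI)));
#endif

// Size in bytes of the double-word type when it is available, otherwise 0.
int timode_size_or_zero() {
#ifdef SFP_HAVE_TIMODE
  return static_cast<int>(sizeof(UTItype));
#else
  return 0;
#endif
}
```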


r~

Re: [PATCH i386 4/8] [AVX512] [6/8] Add substed patterns: `sae' subst.

2014-01-09 Thread H.J. Lu

On Wed, Dec 18, 2013 at 5:02 AM, Kirill Yukhin  wrote:
> Hello,
>
> On 02 Dec 16:10, Kirill Yukhin wrote:
>> Hello,
>> On 19 Nov 12:11, Kirill Yukhin wrote:
>> > Hello,
>> > On 15 Nov 20:07, Kirill Yukhin wrote:
>> > > > Is it ok for trunk?
>> > > Ping.
>> > Ping.
>> Ping.
> Ping.
>
> Rebased patch in the bottom.
>

This patch caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59733

Now bootstrap-asan failed with 24GB RAM and is almost unusable.

H.J.

Re: [PATCH, AArch64 6/6] aarch64: Define add_ssaaaa, sub_ddmmss, umul_ppmm

2014-01-09 Thread Yufeng Zhang


Hi,

This patch and the preceding aarch64.md patches all look good to me, but 
I cannot approve them.


Thanks for adding the support for these missing patterns and defines!

Yufeng

On 01/08/14 18:13, Richard Henderson wrote:

We have good support for TImode arithmetic, so no need to do anything
with inline assembly.
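
A standalone sketch of the same idea, assuming a compiler that provides unsigned __int128 (a GCC/Clang extension) and 64-bit words. The function names add_double_word and mul_double_word are illustrative; the actual patch uses the longlong.h macros and the UDWtype/UWtype typedefs.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned __int128 u128;  // compiler extension, assumed available

// Add the two-word values (ah:al) and (bh:bl); store high and low result words.
void add_double_word(uint64_t &sh, uint64_t &sl,
                     uint64_t ah, uint64_t al, uint64_t bh, uint64_t bl) {
  u128 x = (static_cast<u128>(ah) << 64) | al;
  x += (static_cast<u128>(bh) << 64) | bl;
  sh = static_cast<uint64_t>(x >> 64);
  sl = static_cast<uint64_t>(x);
}

// Full 64x64 -> 128 bit multiply, split into high and low words.
void mul_double_word(uint64_t &ph, uint64_t &pl, uint64_t m0, uint64_t m1) {
  u128 x = static_cast<u128>(m0) * m1;
  ph = static_cast<uint64_t>(x >> 64);
  pl = static_cast<uint64_t>(x);
}
```

The carry out of the low word and the high half of the product both fall out of the wide type, which is why no inline assembly is needed.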

include/
* longlong.h [__aarch64__] (add_ssaaaa, sub_ddmmss, umul_ppmm): New.
[__aarch64__] (COUNT_LEADING_ZEROS_0): Define in terms of W_TYPE_SIZE.
---
  include/longlong.h | 28 ++--
  1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/include/longlong.h b/include/longlong.h
index b4c1f400..1b11fc7 100644
--- a/include/longlong.h
+++ b/include/longlong.h
@@ -123,19 +123,35 @@ extern const UQItype __clz_tab[256] attribute_hidden;
  #endif /* __GNUC__ < 2 */

  #if defined (__aarch64__)
+#define add_ssaaaa(sh, sl, ah, al, bh, bl) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al); \
+    __x += (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+    (sh) = __x >> W_TYPE_SIZE; \
+    (sl) = __x; \
+  } while (0)
+#define sub_ddmmss(sh, sl, ah, al, bh, bl) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al); \
+    __x -= (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+    (sh) = __x >> W_TYPE_SIZE; \
+    (sl) = __x; \
+  } while (0)
+#define umul_ppmm(ph, pl, m0, m1) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(m0) * (UWtype)(m1); \
+    (ph) = __x >> W_TYPE_SIZE; \
+    (pl) = __x; \
+  } while (0)

+#define COUNT_LEADING_ZEROS_0   W_TYPE_SIZE
  #if W_TYPE_SIZE == 32
  #define count_leading_zeros(COUNT, X) ((COUNT) = __builtin_clz (X))
  #define count_trailing_zeros(COUNT, X)   ((COUNT) = __builtin_ctz (X))
-#define COUNT_LEADING_ZEROS_0 32
-#endif /* W_TYPE_SIZE == 32 */
-
-#if W_TYPE_SIZE == 64
+#elif W_TYPE_SIZE == 64
  #define count_leading_zeros(COUNT, X) ((COUNT) = __builtin_clzll (X))
  #define count_trailing_zeros(COUNT, X)   ((COUNT) = __builtin_ctzll (X))
-#define COUNT_LEADING_ZEROS_0 64
  #endif /* W_TYPE_SIZE == 64 */
-
  #endif /* __aarch64__ */

  #if defined (__alpha) && W_TYPE_SIZE == 64

Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS

2014-01-09 Thread Richard Sandiford

Tom de Vries  writes:
> On 25/12/13 14:02, Tom de Vries wrote:
>> On 07-12-13 16:07, Tom de Vries wrote:
>>> Richard,
>>>
>>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
>>> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
>>> address the issue that $6 is sometimes used in split calls.
>>>
>>> Build and reg-tested on MIPS.
>>>
>>> OK for stage1?
>>>
>> 
>
> Richard,
>
> Ping.
>
> This patch is the only part of -fuse-caller-save that still needs approval.

Hmm, where were parts 4 and 6 approved?  Was looking for the discussion
in the hope that it would answer the question I don't really understand,
which is: this hook is only used during final, is that right?  And the
clobber that you're adding is exposed at the rtl level.  So why do we
need the hook at all?  Why not just collect the usage information at
the end of final rather than at the beginning, so that all splits during
final have been done?  For other cases (where the usage isn't explicit
at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
instead?

Thanks,
Richard

Re: a patch prototype for PR59535 (THUMB code size regression)

2014-01-09 Thread Richard Earnshaw

On 09/01/14 15:21, Vladimir Makarov wrote:
> Hi, Richard.
> 
>This week I've been working on THUMB code size issues.  Here is the
> prototype of the patch for spilling into HI_REGS instead of memory.
> The patch decreases number of generated insns and makes the code faster
> as it removes a lot of loads/stores.
> 
>I am sending the patch for your evaluation and for getting your
> opinion.  If you like the code size results, I could create the real 
> patch next week (the patch here will not work correctly when a user 
> defines fixed registers by himself).
> 
> Thanks in advance, Vlad.
> 

Do you need to take into account HARD_REGNO_NREGS (mode) when doing the
limit check?

R.

> 
> z
> 
> 
> Index: config/arm/arm.c
> ===
> --- config/arm/arm.c  (revision 206089)
> +++ config/arm/arm.c  (working copy)
> @@ -73,6 +73,8 @@ struct four_ints
>  
>  /* Forward function declarations.  */
>  static bool arm_lra_p (void);
> +static reg_class_t arm_spill_class (reg_class_t, enum machine_mode);
> +static int arm_spill_hard_regno (int, reg_class_t, enum machine_mode);
>  static bool arm_needs_doubleword_align (enum machine_mode, const_tree);
>  static int arm_compute_static_chain_stack_bytes (void);
>  static arm_stack_offsets *arm_get_frame_offsets (void);
> @@ -345,6 +347,12 @@ static const struct attribute_spec arm_a
>  #undef TARGET_LRA_P
>  #define TARGET_LRA_P arm_lra_p
>  
> +#undef TARGET_SPILL_CLASS
> +#define TARGET_SPILL_CLASS arm_spill_class
> +
> +#undef TARGET_SPILL_HARD_REGNO
> +#define TARGET_SPILL_HARD_REGNO arm_spill_hard_regno
> +
>  #undef  TARGET_ATTRIBUTE_TABLE
>  #define TARGET_ATTRIBUTE_TABLE arm_attribute_table
>  
> @@ -5597,6 +5605,28 @@ arm_lra_p (void)
>return arm_lra_flag;
>  }
>  
> +/* Return class of registers which could be used for pseudo of MODE
> +   and of class RCLASS for spilling instead of memory.  Return NO_REGS
> +   if it is not possible or non-profitable.  */
> +static reg_class_t
> +arm_spill_class (reg_class_t rclass, enum machine_mode mode)
> +{
> +  if (TARGET_THUMB1 && mode == SImode
> +  && (rclass == LO_REGS || rclass == GENERAL_REGS))
> +return HI_REGS;
> +  return NO_REGS;
> +}
> +
> +/* ???  */
> +static int
> +arm_spill_hard_regno (int n, reg_class_t spill_class, enum machine_mode mode)
> +{
> +  gcc_assert (TARGET_THUMB1 && mode == SImode && spill_class == HI_REGS
> +   && n >= 0);
> +  int hard_regno = FIRST_HI_REGNUM + n;
> +  return hard_regno > 12 ? -1 : hard_regno;
> +}
> +
>  /* Return true if mode/type need doubleword alignment.  */
>  static bool
>  arm_needs_doubleword_align (enum machine_mode mode, const_tree type)
> @@ -29236,6 +29266,7 @@ arm_conditional_register_usage (void)
>for (regno = FIRST_HI_REGNUM;
>  regno <= LAST_HI_REGNUM; ++regno)
>   fixed_regs[regno] = call_used_regs[regno] = 1;
> +  fixed_regs[12] = call_used_regs[12] = 1;
>  }
>  
>/* The link register can be clobbered by any branch insn,
> Index: doc/tm.texi
> ===
> --- doc/tm.texi   (revision 206089)
> +++ doc/tm.texi   (working copy)
> @@ -2918,6 +2918,10 @@ A target hook which returns true if an a
>  This hook defines a class of registers which could be used for spilling  
> pseudos of the given mode and class, or @code{NO_REGS} if only memory  should 
> be used.  Not defining this hook is equivalent to returning  @code{NO_REGS} 
> for all inputs.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} int TARGET_SPILL_HARD_REGNO (int, 
> @var{reg_class_t}, enum @var{machine_mode})
> +This hook defines n-th (0, ...) register which could be used for spilling  
> pseudos of the given mode and spill class, or -1 if there are no  such regs 
> anymore.  The hook should be defined with spill_class hook  and should be 
> defined only for classes returned by spill_class.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} {enum machine_mode} TARGET_CSTORE_MODE (enum 
> insn_code @var{icode})
>  This hook defines the machine mode to use for the boolean result of  
> conditional store patterns.  The ICODE argument is the instruction code  for 
> the cstore being performed.  Not definiting this hook is the same  as 
> accepting the mode encoded into operand 0 of the cstore expander  patterns.
>  @end deftypefn
> Index: doc/tm.texi.in
> ===
> --- doc/tm.texi.in(revision 206089)
> +++ doc/tm.texi.in(working copy)
> @@ -2549,6 +2549,8 @@ as below:
>  
>  @hook TARGET_SPILL_CLASS
>  
> +@hook TARGET_SPILL_HARD_REGNO
> +
>  @hook TARGET_CSTORE_MODE
>  
>  @node Old Constraints
> Index: lra-spills.c
> ===
> --- lra-spills.c  (revision 206089)
> +++ lra-spills.c  (working copy)
> @@ -252,7 +252,7 @@ pseudo_reg_slot_compare (const void *v1p
>  static int
>  assign_spill_har

[Patch, Fortran] PR 58026: Bad error recovery for allocatable component of undeclared type

2014-01-09 Thread Janus Weil

Hi all,

the attached patch started out as an ICE-on-invalid regression fix,
but after the ICE had been fixed recently by other means, it was
degraded to a mere error-recovery improvement. It removes some rather
'hackish' code that was added by Paul quite a long time ago.

Regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus


2014-01-09  Janus Weil  

PR fortran/58026
* decl.c (gfc_match_data_decl): Improve error recovery.


2014-01-09  Janus Weil  

PR fortran/58026
* gfortran.dg/alloc_comp_basics_6.f90: New.
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c  (revision 206462)
+++ gcc/fortran/decl.c  (working copy)
@@ -4287,12 +4287,10 @@ gfc_match_data_decl (void)
  || current_ts.u.derived->attr.zero_comp))
goto ok;
 
-  /* Now we have an error, which we signal, and then fix up
-because the knock-on is plain and simple confusing.  */
   gfc_error_now ("Derived type at %C has not been previously defined "
 "and so cannot appear in a derived type definition");
-  current_attr.pointer = 1;
-  goto ok;
+  m = MATCH_ERROR;
+  goto cleanup;
 }
 
 ok:
! { dg-do compile }
!
! PR 58026: Bad error recovery for allocatable component of undeclared type
!
! Contributed by Joost VandeVondele 

  type sysmtx_t
 type(ext_complex_t), allocatable :: S(:)
  end type

end

Re: [patch] regcprop fix for PR rtl-optimization/54300

2014-01-09 Thread Richard Earnshaw

On 20/11/13 13:57, Richard Earnshaw wrote:
> On 19/11/13 17:48, Jeff Law wrote:
>> On 11/19/13 10:32, Steven Bosscher wrote:
>>>
>>> Yes. In the GCC3 days it was important for sincos on i386, and on m68k
>>> it used to be important for some of the funnier patterns. Not sure if
>>> it's still useful today, though. Might be worth looking into, just to
>>> avoid the confusion in the future.
>> I doubt it's changed all that much :-)
>>
>>>
>>> There's been confusion about this before, where people assumed
>>> single_set really means "just one SET in this pattern". (ISTR fixing
>>> gcse.c's hash_scan_rtx for this at some point...?). But that's not the
>>> semantics of single_set.
>> Yes.  And I'd expect confusion to continue :(  Not sure if creating 
>> renaming to capture the actual semantics would help here.
>>
>>>
>>> The proper test for "just one SET" is (!multiple_sets && single_set).
>>> At least, that's how I've always coded it...
>> Seems reasonable for those cases where you have to ensure there really 
>> is just one set.
>>
>>
>> jeff
>>
> 
> Provided we correctly note the other values that are killed, we can
> handle multiple sets safely.  The one restriction we have to watch is
> where the dead set operations kill input values to the live set operation.
> 
> I've committed my patch to trunk.
> 
> I'll leave it to gestate a couple of days, but this is also needed on
> the active release branches as well.
> 

Well, a bit more than a few days...

4.8 backport has now been applied.  4.7 should follow shortly.

R.

a patch prototype for PR59535 (THUMB code size regression)

2014-01-09 Thread Vladimir Makarov


Hi, Richard.

  This week I've been working on THUMB code size issues.  Here is the
prototype of the patch for spilling into HI_REGS instead of memory.
The patch decreases number of generated insns and makes the code faster
as it removes a lot of loads/stores.

  I am sending the patch for your evaluation and for getting your
opinion.  If you like the code size results, I could create the real 
patch next week (the patch here will not work correctly when a user 
defines fixed registers by himself).


Thanks in advance, Vlad.
Index: config/arm/arm.c
===
--- config/arm/arm.c(revision 206089)
+++ config/arm/arm.c(working copy)
@@ -73,6 +73,8 @@ struct four_ints
 
 /* Forward function declarations.  */
 static bool arm_lra_p (void);
+static reg_class_t arm_spill_class (reg_class_t, enum machine_mode);
+static int arm_spill_hard_regno (int, reg_class_t, enum machine_mode);
 static bool arm_needs_doubleword_align (enum machine_mode, const_tree);
 static int arm_compute_static_chain_stack_bytes (void);
 static arm_stack_offsets *arm_get_frame_offsets (void);
@@ -345,6 +347,12 @@ static const struct attribute_spec arm_a
 #undef TARGET_LRA_P
 #define TARGET_LRA_P arm_lra_p
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS arm_spill_class
+
+#undef TARGET_SPILL_HARD_REGNO
+#define TARGET_SPILL_HARD_REGNO arm_spill_hard_regno
+
 #undef  TARGET_ATTRIBUTE_TABLE
 #define TARGET_ATTRIBUTE_TABLE arm_attribute_table
 
@@ -5597,6 +5605,28 @@ arm_lra_p (void)
   return arm_lra_flag;
 }
 
+/* Return class of registers which could be used for pseudo of MODE
+   and of class RCLASS for spilling instead of memory.  Return NO_REGS
+   if it is not possible or non-profitable.  */
+static reg_class_t
+arm_spill_class (reg_class_t rclass, enum machine_mode mode)
+{
+  if (TARGET_THUMB1 && mode == SImode
+  && (rclass == LO_REGS || rclass == GENERAL_REGS))
+return HI_REGS;
+  return NO_REGS;
+}
+
+/* ???  */
+static int
+arm_spill_hard_regno (int n, reg_class_t spill_class, enum machine_mode mode)
+{
+  gcc_assert (TARGET_THUMB1 && mode == SImode && spill_class == HI_REGS
+ && n >= 0);
+  int hard_regno = FIRST_HI_REGNUM + n;
+  return hard_regno > 12 ? -1 : hard_regno;
+}
+
 /* Return true if mode/type need doubleword alignment.  */
 static bool
 arm_needs_doubleword_align (enum machine_mode mode, const_tree type)
@@ -29236,6 +29266,7 @@ arm_conditional_register_usage (void)
   for (regno = FIRST_HI_REGNUM;
   regno <= LAST_HI_REGNUM; ++regno)
fixed_regs[regno] = call_used_regs[regno] = 1;
+  fixed_regs[12] = call_used_regs[12] = 1;
 }
 
   /* The link register can be clobbered by any branch insn,
Index: doc/tm.texi
===
--- doc/tm.texi (revision 206089)
+++ doc/tm.texi (working copy)
@@ -2918,6 +2918,10 @@ A target hook which returns true if an a
 This hook defines a class of registers which could be used for spilling  
pseudos of the given mode and class, or @code{NO_REGS} if only memory  should 
be used.  Not defining this hook is equivalent to returning  @code{NO_REGS} for 
all inputs.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_SPILL_HARD_REGNO (int, @var{reg_class_t}, 
enum @var{machine_mode})
+This hook defines n-th (0, ...) register which could be used for spilling  
pseudos of the given mode and spill class, or -1 if there are no  such regs 
anymore.  The hook should be defined with spill_class hook  and should be 
defined only for classes returned by spill_class.
+@end deftypefn
+
 @deftypefn {Target Hook} {enum machine_mode} TARGET_CSTORE_MODE (enum 
insn_code @var{icode})
 This hook defines the machine mode to use for the boolean result of  
conditional store patterns.  The ICODE argument is the instruction code  for 
the cstore being performed.  Not definiting this hook is the same  as accepting 
the mode encoded into operand 0 of the cstore expander  patterns.
 @end deftypefn
Index: doc/tm.texi.in
===
--- doc/tm.texi.in  (revision 206089)
+++ doc/tm.texi.in  (working copy)
@@ -2549,6 +2549,8 @@ as below:
 
 @hook TARGET_SPILL_CLASS
 
+@hook TARGET_SPILL_HARD_REGNO
+
 @hook TARGET_CSTORE_MODE
 
 @node Old Constraints
Index: lra-spills.c
===
--- lra-spills.c(revision 206089)
+++ lra-spills.c(working copy)
@@ -252,7 +252,7 @@ pseudo_reg_slot_compare (const void *v1p
 static int
 assign_spill_hard_regs (int *pseudo_regnos, int n)
 {
-  int i, k, p, regno, res, spill_class_size, hard_regno, nr;
+  int i, k, p, regno, res, hard_regno, nr;
   enum reg_class rclass, spill_class;
   enum machine_mode mode;
   lra_live_range_t r;
@@ -271,7 +271,7 @@ assign_spill_hard_regs (int *pseudo_regn
   /* Set up reserved hard regs for every program point. */
   reserved_hard_regs = XN

Re: wide-int, loop

2014-01-09 Thread Richard Biener

On Thu, Jan 2, 2014 at 5:27 AM, Mike Stump  wrote:
> On Nov 26, 2013, at 1:14 AM, Richard Biener  
> wrote:
 @@ -2662,8 +2661,8 @@ iv_number_of_iterations (struct loop *loop, rtx
 insn, rtx condition,
iv1.step = const0_rtx;
if (INTVAL (iv0.step) < 0)
 {
 - iv0.step = simplify_gen_unary (NEG, comp_mode, iv0.step, mode);
 - iv1.base = simplify_gen_unary (NEG, comp_mode, iv1.base, mode);
 + iv0.step = simplify_gen_unary (NEG, comp_mode, iv0.step,
 comp_mode);
 + iv1.base = simplify_gen_unary (NEG, comp_mode, iv1.base,
 comp_mode);
 }
iv0.step = lowpart_subreg (mode, iv0.step, comp_mode);

 separate bugfix?
>>>
>>> most likely.  I will submit separately.
>>>
 @@ -1378,7 +1368,8 @@ decide_peel_simple (struct loop *loop, int flags)
/* If we have realistic estimate on number of iterations, use it.  */
if (get_estimated_loop_iterations (loop, &iterations))
  {
 -  if (double_int::from_shwi (npeel).ule (iterations))
 +  /* TODO: unsigned/signed confusion */
 +  if (wi::leu_p (npeel, iterations))
 {
   if (dump_file)
 {

 what does this refer to?  npeel is unsigned.
>>>
>>>
>>> it was the fact that they were doing the from_shwi and then using an
>>> unsigned test.
>>
>> Ah - probably a typo.  Please just remove the "TODO".
>
> Done:
>
> Index: loop-unroll.c
> ===
> --- loop-unroll.c   (revision 206183)
> +++ loop-unroll.c   (working copy)
> @@ -1371,7 +1371,6 @@ decide_peel_simple (struct loop *loop, i
>/* If we have realistic estimate on number of iterations, use it.  */
>if (get_estimated_loop_iterations (loop, &iterations))
>  {
> -  /* TODO: unsigned/signed confusion */
>if (wi::leu_p (npeel, iterations))
> {
>   if (dump_file)
>
 Otherwise looks good to me.
>
> Kenny hasn't yet integrated the first into trunk, but I'd like to ask anyway:
>
> Ok?

Ok.

Richard.
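
For readers following the "unsigned/signed confusion" remark above, here is a minimal sketch of why building a value with a signed constructor and then comparing it unsigned can surprise. The struct and helper names are hypothetical and only loosely modelled on double_int; they are not GCC's API. In the thread npeel is unsigned, so the hazard did not actually apply there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word integer, loosely modelled on double_int.  */
struct two_word { uint64_t low; int64_t high; };

/* Build from a signed host integer: the high word is a sign extension.  */
static struct two_word from_signed (int64_t x)
{
  struct two_word r = { (uint64_t) x, x < 0 ? -1 : 0 };
  return r;
}

/* Build from an unsigned host integer: the high word is zero.  */
static struct two_word from_unsigned (uint64_t x)
{
  struct two_word r = { x, 0 };
  return r;
}

/* Unsigned "less than or equal" over both words.  */
static bool ule (struct two_word a, struct two_word b)
{
  if ((uint64_t) a.high != (uint64_t) b.high)
    return (uint64_t) a.high < (uint64_t) b.high;
  return a.low <= b.low;
}
```

With these helpers a small non-negative count compares the same either way, while a negative input passed through from_signed turns into a very large unsigned value and fails the test against any small bound.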

Re: wide-int, sched

2014-01-09 Thread Richard Biener

On Thu, Jan 2, 2014 at 5:53 AM, Mike Stump  wrote:
> On Nov 23, 2013, at 11:22 AM, Mike Stump  wrote:
>> Richi has asked that we break the wide-int patch so that the individual port 
>> and front end maintainers can review their parts without having to go through 
>> the entire patch.  This patch covers the scheduler code.
>>
>> Ok?
>
> Ping?
>
> I promise, this one is easy…
>

Ok.

Richard.

Re: wide-int, ipa

2014-01-09 Thread Richard Biener

On Thu, Jan 2, 2014 at 5:12 AM, Mike Stump  wrote:
> On Nov 23, 2013, at 11:22 AM, Mike Stump  wrote:
>> Richi has asked that we break the wide-int patch so that the individual port 
>> and front end maintainers can review their parts without having to go through 
>> the entire patch.  This patch covers the ipa code.
>>
>> Ok?
>
> Ping?
>
> I promise, this patch isn't frightening.  Small, easy to read and understand, 
> doesn't require an ipa expert.

Why

@@ -968,7 +968,7 @@ get_polymorphic_call_info (tree fndecl,
{
  base_pointer = TREE_OPERAND (base, 0);
  context->offset
-+= offset2 + mem_ref_offset (base).low * BITS_PER_UNIT;
+   += offset2 + mem_ref_offset (base).ulow () * BITS_PER_UNIT;
  context->outer_type = NULL;
}
  /* We found base object.  In this case the outer_type

but then

@@ -1063,7 +1063,7 @@ compute_complex_assign_jump_func (struct
ipa_node_params *info,
   || max_size == -1
   || max_size != size)
 return;
-  offset += mem_ref_offset (base).low * BITS_PER_UNIT;
+  offset += mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
   ssa = TREE_OPERAND (base, 0);
   if (TREE_CODE (ssa) != SSA_NAME
   || !SSA_NAME_IS_DEFAULT_DEF (ssa)

?  I think it should be to_short_addr () in the first case as well.

Ok with that change.

Richard.

>

Re: wide-int, gimple

2014-01-09 Thread Richard Biener

On Thu, Jan 2, 2014 at 5:10 AM, Mike Stump  wrote:
> On Nov 28, 2013, at 6:20 AM, Richard Biener  
> wrote:
>> On Thu, Nov 28, 2013 at 12:58 PM, Richard Sandiford
>>  wrote:
>>> Jakub Jelinek  writes:
 On Mon, Nov 25, 2013 at 12:24:30PM +0100, Richard Biener wrote:
> On Sat, Nov 23, 2013 at 8:21 PM, Mike Stump  wrote:
>> Richi has asked that we break the wide-int patch so that the
>> individual port and front end maintainers can review their parts
>> without having to go through the entire patch.  This patch covers the
>> gimple code.
>
> @@ -1754,7 +1754,7 @@ dump_ssaname_info (pretty_printer *buffer, tree
> node, int spc)
>   if (!POINTER_TYPE_P (TREE_TYPE (node))
>   && SSA_NAME_RANGE_INFO (node))
> {
> -  double_int min, max, nonzero_bits;
> +  widest_int min, max, nonzero_bits;
>   value_range_type range_type = get_range_info (node, &min, &max);
>
>   if (range_type == VR_VARYING)
>
> this makes me suspect you are changing SSA_NAME_RANGE_INFO
> to embed two max wide_ints.  That's a no-no.

 Well, the range_info_def struct right now contains 3 double_ints, which is
 unnecessary overhead for the most of the cases where the SSA_NAME's type
 has just at most HOST_BITS_PER_WIDE_INT bits and thus we could fit all 3 of
 them into 3 HOST_WIDE_INTs rather than 3 double_ints.  So supposedly struct
 range_info_def could be a template on the type's precision rounded up to 
 HWI
 bits, or say have 3 alternatives there, use
 FIXED_WIDE_INT (HOST_BITS_PER_WIDE_INT) for the smallest types,
 FIXED_WIDE_INT (2 * HOST_BITS_PER_WIDE_INT) aka double_int for the larger
 but still common ones, and widest_int for the rest, then the API to set/get
 it could use widest_int everywhere, and just what storage we'd use would
 depend on the precision of the type.
>>>
>>> This patch adds a trailing_wide_ints <N> that can be used at the end of
>>> a variable-length structure to store N wide_ints.  There's also a macro
>>> to declare get/set methods for each of the N elements.
>>>
>>> At the moment I've only defined non-const operator[].  It'd be possible
>>> to add a const version later if necessary.
>>>
>>> The size of range_info_def for precisions that fit in M HWIs is then
>>> 1 + 3 * M, so 4 for the common case (down from 6 on trunk).  The maximum
>>> is 7 for current x86_64 types (up from 6 on trunk).
>>>
>>> I wondered whether to keep the interface using widest_int, but I think
>>> wide_int works out more naturally.  The only caller that wants to extend
>>> beyond the precision is CCP, but that's already special because the upper
>>> bits are supposed to be set (i.e. it's not a normal sign or zero extension).
>>>
>>> This relies on the SSA_NAME_ANTI_RANGE_P patch I just posted.
>>>
>>> If this is OK I'll look at using the same structure elsewhere.
>>
>> Looks good to me.
>
> So, is that an Ok for the gimple patch and the follow on work?  Just double 
> checking.

Yes.
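
The 1 + 3 * M size figure quoted above can be checked with a small worked sketch. It assumes a 64-bit HOST_WIDE_INT, and the helper names are made up for illustration; they do not come from the patch.

```c
#include <assert.h>

/* Assumed width of a HOST_WIDE_INT in bits (64 on x86_64 hosts).  */
#define HWI_BITS 64

/* Number of HWIs needed to hold PRECISION bits (hypothetical helper).  */
static unsigned hwis_for_precision (unsigned precision)
{
  return (precision + HWI_BITS - 1) / HWI_BITS;
}

/* Size in HWIs of a range record following the quoted formula:
   the constant 1 plus three values (min, max, nonzero bits) of
   M HWIs each.  */
static unsigned range_info_size_in_hwis (unsigned precision)
{
  return 1 + 3 * hwis_for_precision (precision);
}
```

Under these assumptions a 32-bit or 64-bit type needs 4 HWIs (the common case mentioned above) and a 128-bit type needs 7, which matches the stated x86_64 maximum.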

Re: wide-int, doc

2014-01-09 Thread Richard Biener

On Sat, Nov 23, 2013 at 8:21 PM, Mike Stump  wrote:
> Richi has asked that we break the wide-int patch so that the individual port 
> and front end maintainers can review their parts without having to go through 
> the entire patch.  This patch covers the documentation.
>
> Ok?

Ok.

Thanks,
Richard.

Re: wide-int, build system

2014-01-09 Thread Richard Biener

On Sat, Nov 23, 2013 at 8:20 PM, Mike Stump  wrote:
> Richi has asked that we break the wide-int patch so that the individual port 
> and front end maintainers can review their parts without having to go through 
> the entire patch.  This patch covers the build system (make).
>
> Ok?

Needs updating (no explicit dependences for wide-int.h) but ok.

Richard.

Re: [PATCH] Add zero-overhead looping for xtensa backend

2014-01-09 Thread Felix Yang

Hi Sterling,

Attached please find version 2 of the patch.

I applied this updated patch (with small adaptations) to gcc-4.8.2
and carried out some tests.
I can execute the testcases in a simulator, which supports
zero-overhead looping instructions.

First of all, I can successfully build libgcc, libstdc++ and
newlib for xtensa with this patch.
The newly built xtensa gcc also passed the testsuite that comes with newlib.
I also tested the cases under the gcc/testsuite/gcc.c-torture/execute/
directory. More than 800 cases were tested.
The test results show no new failures with this patch, compared
with the original gcc version.
Is that OK?

I also double checked the loop relaxation issue with binutils-2.24
(the latest version).
The result shows that the assembler can do loop relaxation when the
loop target is too far (> 256 bytes).
This is the reason why I don't check the size of the loop.


Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 206463)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,18 @@
+2014-01-09  Felix Yang  
+
+* config/xtensa/xtensa.c (xtensa_reorg): New.
+(xtensa_reorg_loops): New.
+(xtensa_can_use_doloop_p): New.
+(xtensa_invalid_within_doloop): New.
+(hwloop_optimize): New.
+(hwloop_fail): New.
+(hwloop_pattern_reg): New.
+(xtensa_emit_loop_end): Modified to emit the zero-overhead loop end label.
+(xtensa_doloop_hooks): Define.
+* config/xtensa/xtensa.md (doloop_end): New.
+(zero_cost_loop_start): Rewritten.
+(zero_cost_loop_end): Rewritten.
+
 2014-01-09  Richard Biener  

 PR tree-optimization/59715
Index: gcc/config/xtensa/xtensa.md
===
--- gcc/config/xtensa/xtensa.md(revision 206463)
+++ gcc/config/xtensa/xtensa.md(working copy)
@@ -1,6 +1,7 @@
 ;; GCC machine description for Tensilica's Xtensa architecture.
 ;; Copyright (C) 2001-2014 Free Software Foundation, Inc.
 ;; Contributed by Bob Wilson (bwil...@tensilica.com) at Tensilica.
+;; Zero-overhead looping support by Felix Yang (fei.yang0...@gmail.com).

 ;; This file is part of GCC.

@@ -35,6 +36,8 @@
   (UNSPEC_TLS_CALL9)
   (UNSPEC_TP10)
   (UNSPEC_MEMW11)
+  (UNSPEC_LSETUP_START  12)
+  (UNSPEC_LSETUP_END13)

   (UNSPECV_SET_FP1)
   (UNSPECV_ENTRY2)
@@ -1289,41 +1292,67 @@
(set_attr "length""3")])


+;; Hardware loop support.
+
 ;; Define the loop insns used by bct optimization to represent the
-;; start and end of a zero-overhead loop (in loop.c).  This start
-;; template generates the loop insn; the end template doesn't generate
-;; any instructions since loop end is handled in hardware.
+;; start and end of a zero-overhead loop.  This start template generates
+;; the loop insn; the end template doesn't generate any instructions since
+;; loop end is handled in hardware.

 (define_insn "zero_cost_loop_start"
   [(set (pc)
-(if_then_else (eq (match_operand:SI 0 "register_operand" "a")
-  (const_int 0))
-  (label_ref (match_operand 1 "" ""))
-  (pc)))
-   (set (reg:SI 19)
-(plus:SI (match_dup 0) (const_int -1)))]
+(if_then_else (ne (match_operand:SI 0 "register_operand" "a")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+   (set (match_operand:SI 2 "register_operand" "+a0")
+(plus (match_dup 2)
+  (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_LSETUP_START)]
   ""
-  "loopnez\t%0, %l1"
+  "loop\t%0, %l1_LEND"
   [(set_attr "type""jump")
(set_attr "mode""none")
(set_attr "length""3")])

 (define_insn "zero_cost_loop_end"
   [(set (pc)
-(if_then_else (ne (reg:SI 19) (const_int 0))
-  (label_ref (match_operand 0 "" ""))
-  (pc)))
-   (set (reg:SI 19)
-(plus:SI (reg:SI 19) (const_int -1)))]
+(if_then_else (ne (match_operand:SI 0 "register_operand" "a")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+   (set (match_operand:SI 2 "register_operand" "+a0")
+(plus (match_dup 2)
+  (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_LSETUP_END)]
   ""
 {
-xtensa_emit_loop_end (insn, operands);
-return "";
+  xtensa_emit_loop_end (insn, operands);
+  return "";
 }
   [(set_attr "type""jump")
(set_attr "mode""none")
(set_attr "length""0")])

+; operand 0 is the loop count pseudo register
+; operand 1 is the label to jump to at the top of the loop
+(define_expand "doloop_end"
+  [(parallel [(set (pc) (if_then_else
+  (ne (match_operand:SI 0 "" "")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+

Re: [ARM] add armv7ve support

2014-01-09 Thread Renlin Li


Hi Gerald,

Sorry for the late reply!
We're working on a list of all the ARM-related changes in 4.9. This will 
also be included.


Kind regards,
Renlin

On 03/01/14 13:24, Gerald Pfeifer wrote:

Renlin Li  wrote:

Hi all,

This patch will add armv7ve support to gcc. Armv7ve is basically an
armv7-a architecture profile with Virtualization Extensions.

Mind adding this to the release notes?

Gerald

Re: [PATCH, go]: Skip some go tests

2014-01-09 Thread Ian Lance Taylor

On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak  wrote:
>
> 2014-01-09  Uros Bizjak  
>
> * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems
> which don't support -fsplit-stack.  Skip rotate[0123]-out.go.

This is OK.  Thanks.

You might want to tweak the comment just under where you added
"peano.go".  Then go ahead and commit.

Ian

Re: [Patch] Avoid gcc_assert in libgcov

2014-01-09 Thread Jan Hubicka

> As suggested by Honza, avoid bloating libgcov from gcc_assert by using
> a new macro gcov_nonruntime_assert in gcov-io.c that is only mapped to
> gcc_assert when not in libgcov.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
> 
> Thanks,
> Teresa
> 
> 2014-01-09  Teresa Johnson  
> 
> * gcov-io.c (gcov_position): Use gcov_nonruntime_assert.
> (gcov_is_error): Ditto.
> (gcov_rewrite): Ditto.
> (gcov_open): Ditto.
> (gcov_write_words): Ditto.
> (gcov_write_length): Ditto.
> (gcov_read_words): Ditto.
> (gcov_read_summary): Ditto.
> (gcov_sync): Ditto.
> (gcov_seek): Ditto.
> (gcov_histo_index): Ditto.
> (static void gcov_histogram_merge): Ditto.
> (compute_working_sets): Ditto.
> * gcov-io.h (gcov_nonruntime_assert): Define.
> 

> @@ -481,14 +481,14 @@ gcov_read_words (unsigned words)
>const gcov_unsigned_t *result;
>unsigned excess = gcov_var.length - gcov_var.offset;
> 
> -  gcc_assert (gcov_var.mode > 0);
> +  gcov_nonruntime_assert (gcov_var.mode > 0);
>if (excess < words)
>  {
>gcov_var.start += gcov_var.offset;
>  #if IN_LIBGCOV
>if (excess)
> {
> - gcc_assert (excess == 1);
> + gcov_nonruntime_assert (excess == 1);

It probably makes no sense to put nonruntime asserts into IN_LIBGCOV defines.

>   memcpy (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, 4);
> }
>  #else
> @@ -497,7 +497,7 @@ gcov_read_words (unsigned words)
>gcov_var.offset = 0;
>gcov_var.length = excess;
>  #if IN_LIBGCOV
> -  gcc_assert (!gcov_var.length || gcov_var.length == 1);
> +  gcov_nonruntime_assert (!gcov_var.length || gcov_var.length == 1);
>excess = GCOV_BLOCK_SIZE;
>  #else
>if (gcov_var.length + words > gcov_var.alloc)
> @@ -614,7 +614,7 @@ gcov_read_summary (struct gcov_summary *summary)
>while (!cur_bitvector)
>  {
>h_ix = bv_ix * 32;
> -  gcc_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
> +  gcov_nonruntime_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
>cur_bitvector = histo_bitvector[bv_ix++];
>  }
>while (!(cur_bitvector & 0x1))
> @@ -622,7 +622,7 @@ gcov_read_summary (struct gcov_summary *summary)
>h_ix++;
>cur_bitvector >>= 1;
>  }
> -  gcc_assert (h_ix < GCOV_HISTOGRAM_SIZE);
> +  gcov_nonruntime_assert (h_ix < GCOV_HISTOGRAM_SIZE);

How many of those asserts can be triggered by a corrupted gcda file?
I would like to make libgcov more safe WRT file corruptions, too, so in that
case we should produce an error message.

The rest of changes seems OK.

Honza
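
A minimal sketch of the macro idea discussed above, since the gcov-io.h hunk itself is not shown here. It assumes IN_LIBGCOV is 1 in the runtime library build, uses plain assert as a stand-in for gcc_assert, and invents read_position and mode purely for illustration; the actual definition in the patch may differ.

```c
#include <assert.h>

/* Assumption: IN_LIBGCOV is 1 when building the runtime library and 0
   otherwise; it is defined here only so the sketch stands alone.  */
#ifndef IN_LIBGCOV
#define IN_LIBGCOV 1
#endif

#if IN_LIBGCOV
/* Runtime build: check nothing, but keep EXPR syntactically used.  */
#define gcov_nonruntime_assert(EXPR) ((void) (0 && (EXPR)))
#else
/* Compiler and tools: plain assert stands in for gcc_assert here.  */
#define gcov_nonruntime_assert(EXPR) assert (EXPR)
#endif

/* Hypothetical reader state, standing in for gcov_var.mode.  */
static int mode = 0;

/* Hypothetical analogue of gcov_position.  */
static int read_position (int start, int offset)
{
  gcov_nonruntime_assert (mode > 0);
  return start + offset;
}
```

With IN_LIBGCOV set to 1 the check costs nothing and the call returns normally even though mode is 0; with IN_LIBGCOV set to 0 the same call would abort. This is also why a nonruntime assert placed inside an #if IN_LIBGCOV region has no effect.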

Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.

2014-01-09 Thread Richard Earnshaw

On 30/03/13 16:10, Tom de Vries wrote:
> On 29/03/13 13:54, Tom de Vries wrote:
>> I split the patch up into 10 patches, to facilitate further review:
>> ...
>> 0001-Add-command-line-option.patch
>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>> 0006-Collect-register-usage-information.patch
>> 0007-Use-collected-register-usage-information.patch
>> 0008-Enable-by-default-at-O2-and-higher.patch
>> 0009-Add-documentation.patch
>> 0010-Add-test-case.patch
>> ...
>> I'll post these in reply to this email.
>>
> 
> Something went wrong with those emails, which were generated.
> 
> I tested the emails by sending them to my work email, where they looked fine.
> I managed to reproduce the problem by sending them to my private email.
> It seems the problem was inconsistent EOL format.
> 
> I've written a python script to handle composing the email, and posted it here
> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
> Given that that email looks ok, I think I've addressed the problems now.
> 
> I'll repost the patches. Sorry about the noise.
> 
> Thanks,
> - Tom
> 
> 

It's unfortunate that this feature doesn't fail safe when a port has not
explicitly defined what should happen.

Consequently, you'll need to add a patch for AArch64 which has two
registers clobbered by PLT-based calls.

R.

[Patch] Avoid gcc_assert in libgcov

2014-01-09 Thread Teresa Johnson

As suggested by Honza, avoid bloating libgcov from gcc_assert by using
a new macro gcov_nonruntime_assert in gcov-io.c that is only mapped to
gcc_assert when not in libgcov.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2014-01-09  Teresa Johnson  

* gcov-io.c (gcov_position): Use gcov_nonruntime_assert.
(gcov_is_error): Ditto.
(gcov_rewrite): Ditto.
(gcov_open): Ditto.
(gcov_write_words): Ditto.
(gcov_write_length): Ditto.
(gcov_read_words): Ditto.
(gcov_read_summary): Ditto.
(gcov_sync): Ditto.
(gcov_seek): Ditto.
(gcov_histo_index): Ditto.
(static void gcov_histogram_merge): Ditto.
(compute_working_sets): Ditto.
* gcov-io.h (gcov_nonruntime_assert): Define.

Index: gcov-io.c
===
--- gcov-io.c   (revision 206435)
+++ gcov-io.c   (working copy)
@@ -67,7 +67,7 @@ GCOV_LINKAGE struct gcov_var
 static inline gcov_position_t
 gcov_position (void)
 {
-  gcc_assert (gcov_var.mode > 0);
+  gcov_nonruntime_assert (gcov_var.mode > 0);
   return gcov_var.start + gcov_var.offset;
 }

@@ -83,7 +83,7 @@ gcov_is_error (void)
 GCOV_LINKAGE inline void
 gcov_rewrite (void)
 {
-  gcc_assert (gcov_var.mode > 0);
+  gcov_nonruntime_assert (gcov_var.mode > 0);
   gcov_var.mode = -1;
   gcov_var.start = 0;
   gcov_var.offset = 0;
@@ -133,7 +133,7 @@ gcov_open (const char *name, int mode)
   s_flock.l_pid = getpid ();
 #endif

-  gcc_assert (!gcov_var.file);
+  gcov_nonruntime_assert (!gcov_var.file);
   gcov_var.start = 0;
   gcov_var.offset = gcov_var.length = 0;
   gcov_var.overread = -1u;
@@ -291,14 +291,14 @@ gcov_write_words (unsigned words)
 {
   gcov_unsigned_t *result;

-  gcc_assert (gcov_var.mode < 0);
+  gcov_nonruntime_assert (gcov_var.mode < 0);
 #if IN_LIBGCOV
   if (gcov_var.offset >= GCOV_BLOCK_SIZE)
 {
   gcov_write_block (GCOV_BLOCK_SIZE);
   if (gcov_var.offset)
{
- gcc_assert (gcov_var.offset == 1);
+ gcov_nonruntime_assert (gcov_var.offset == 1);
  memcpy (gcov_var.buffer, gcov_var.buffer + GCOV_BLOCK_SIZE, 4);
}
 }
@@ -393,9 +393,9 @@ gcov_write_length (gcov_position_t position)
   gcov_unsigned_t length;
   gcov_unsigned_t *buffer;

-  gcc_assert (gcov_var.mode < 0);
-  gcc_assert (position + 2 <= gcov_var.start + gcov_var.offset);
-  gcc_assert (position >= gcov_var.start);
+  gcov_nonruntime_assert (gcov_var.mode < 0);
+  gcov_nonruntime_assert (position + 2 <= gcov_var.start + gcov_var.offset);
+  gcov_nonruntime_assert (position >= gcov_var.start);
   offset = position - gcov_var.start;
   length = gcov_var.offset - offset - 2;
   buffer = (gcov_unsigned_t *) &gcov_var.buffer[offset];
@@ -481,14 +481,14 @@ gcov_read_words (unsigned words)
   const gcov_unsigned_t *result;
   unsigned excess = gcov_var.length - gcov_var.offset;

-  gcc_assert (gcov_var.mode > 0);
+  gcov_nonruntime_assert (gcov_var.mode > 0);
   if (excess < words)
 {
   gcov_var.start += gcov_var.offset;
 #if IN_LIBGCOV
   if (excess)
{
- gcc_assert (excess == 1);
+ gcov_nonruntime_assert (excess == 1);
  memcpy (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, 4);
}
 #else
@@ -497,7 +497,7 @@ gcov_read_words (unsigned words)
   gcov_var.offset = 0;
   gcov_var.length = excess;
 #if IN_LIBGCOV
-  gcc_assert (!gcov_var.length || gcov_var.length == 1);
+  gcov_nonruntime_assert (!gcov_var.length || gcov_var.length == 1);
   excess = GCOV_BLOCK_SIZE;
 #else
   if (gcov_var.length + words > gcov_var.alloc)
@@ -614,7 +614,7 @@ gcov_read_summary (struct gcov_summary *summary)
   while (!cur_bitvector)
 {
   h_ix = bv_ix * 32;
-  gcc_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
+  gcov_nonruntime_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
   cur_bitvector = histo_bitvector[bv_ix++];
 }
   while (!(cur_bitvector & 0x1))
@@ -622,7 +622,7 @@ gcov_read_summary (struct gcov_summary *summary)
   h_ix++;
   cur_bitvector >>= 1;
 }
-  gcc_assert (h_ix < GCOV_HISTOGRAM_SIZE);
+  gcov_nonruntime_assert (h_ix < GCOV_HISTOGRAM_SIZE);

   csum->histogram[h_ix].num_counters = gcov_read_unsigned ();
   csum->histogram[h_ix].min_value = gcov_read_counter ();
@@ -642,7 +642,7 @@ gcov_read_summary (struct gcov_summary *summary)
 GCOV_LINKAGE void
 gcov_sync (gcov_position_t base, gcov_unsigned_t length)
 {
-  gcc_assert (gcov_var.mode > 0);
+  gcov_nonruntime_assert (gcov_var.mode > 0);
   base += length;
   if (base - gcov_var.start <= gcov_var.length)
 gcov_var.offset = base - gcov_var.start;
@@ -661,7 +661,7 @@ gcov_sync (gcov_position_t base, gcov_unsigned_t l
 GCOV_LINKAGE void
 gcov_seek (gcov_position_t base)
 {
-  gcc_assert (gcov_var.mode

A question about forward_addr.

2014-01-09 Thread Peter Xu

Hi all,
I'm confused by the comment in should_replace_address.
Here is the code in fwprop.c: 

/* OLD is a memory address.  Return whether it is good to use NEW instead, 
   for a memory access in the given MODE.  */ 

static bool 
should_replace_address (rtx old_rtx, rtx new_rtx, enum machine_mode mode, 
addr_space_t as, bool speed) 
{ 
  int gain; 

  if (rtx_equal_p (old_rtx, new_rtx) 
  || !memory_address_addr_space_p (mode, new_rtx, as)) 
return false; 

  /* Copy propagation is always ok.  */ 
  if (REG_P (old_rtx) && REG_P (new_rtx)) 
return true; 

  /* Prefer the new address if it is less expensive.  */ 
  gain = (address_cost (old_rtx, mode, as, speed) 
  - address_cost (new_rtx, mode, as, speed)); 

  /* If the addresses have equivalent cost, prefer the new address 
 if it has the highest `set_src_cost'.  That has the potential of 
 eliminating the most insns without additional costs, and it 
 is the same that cse.c used to do.  */ 
  if (gain == 0) 
gain = set_src_cost (new_rtx, speed) - set_src_cost (old_rtx, speed); 

  return (gain > 0);
} 

According to the comment, shouldn't 'return (gain > 0)' be 'return
(gain >= 0)'? 

Here is the case for forward_addr. 
insn set r155 
   plus r167 + 32 
insn set mem (155) 
   r188 
insn set mem (plus r155 + 8) 
   r189 
.. 

If it is handled by the original code, 
the result will be: 
insn set r155 
   plus r167 + 32 
insn set mem (r167 + 32) 
   r188 
insn set mem (plus r155 + 8) 
   r189 

However it is expected to be: 
insn set mem (r167 + 32) 
   r188 
insn set mem (plus r167 + 40) 
   r189 

Since the cost of the address 'r155 + 8' is equal to that of 'r167 + 40', I
think we should prefer the new address, as it should be profitable here.
Is that right? 


Brs, 
   Peter Xu. 
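
The tie case in the question above can be reduced to a small sketch. Costs are passed in as plain integers instead of being computed from RTL, and both should_replace and its accept_ties parameter are hypothetical names for illustration, not part of fwprop.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the decision in should_replace_address.
   ACCEPT_TIES selects between 'gain >= 0' and the current 'gain > 0'.  */
static bool should_replace (int old_addr_cost, int new_addr_cost,
                            int old_src_cost, int new_src_cost,
                            bool accept_ties)
{
  /* Prefer the new address if it is less expensive.  */
  int gain = old_addr_cost - new_addr_cost;

  /* On equal address cost, fall back to the set_src_cost difference.  */
  if (gain == 0)
    gain = new_src_cost - old_src_cost;

  return accept_ties ? gain >= 0 : gain > 0;
}
```

When both the address costs and the source costs are equal, as assumed for 'r155 + 8' versus 'r167 + 40', gain stays 0, so the current test rejects the replacement and the '>=' variant would accept it.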





Re: [PATCH] Fix PR49718 : allow no_instrument_function attribute in class member definition/declaration

2014-01-09 Thread Laurent Alfonsi


On 01/09/14 06:02, Jeff Law wrote:

On 01/08/14 02:05, Laurent Alfonsi wrote:

  All,

I was looking at PR49718. I have enclosed a simple fix for this bug report.

2014-01-07  Laurent Alfonsi 

  * c-family/c-common.c (handle_no_instrument_function_attribute): Allow
no_instrument_function attribute in class member
definition/declaration.


Looking at the implementation of the function attributes, I see no
reason anymore to keep this error message.
Let me know if I missed something.
I have also added a testcase in the enclosed patch.

2014-01-07  Laurent Alfonsi 

  PR c++/49718
  * g++.dg/pr49718.C: New

Isn't the idea here that if we've already generated the function body
(presumably with instrumentation) that a no-instrument attribute
appearing on a later declaration won't do anything useful?

jeff



Jeff,

You are right. That's probably the reason.
From what I can see, the code instrumentation is performed in the 
gimplification pass (gimplify_function_tree), and the function attribute 
is handled and attached earlier in the parsing phase.


I've checked with an example like:
---8<--8<--8<--8<--8<---
int foo () {
  return 2;
}

int bar () {
  return 1;
}

int foo () __attribute__((no_instrument_function));
---8<--8<--8<--8<--8<---
The attribute is correctly honored on the foo function.
I might need to add this test case too.

Let me know if the fix is OK.

Thanks
Laurent

Re: [PATCH] libsanitizer demangling using cp-demangle.c

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 05:51:05PM +0400, Konstantin Serebryany wrote:
> On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek  wrote:
> > On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote:
> >> There was a recent buggy patch to the demangler that added calls to
> >> malloc and realloc (2013-10-25 Gary Benson ).
> >> That patch must be fixed or reverted before the 4.9 release.  The main
> >> code in the demangler must not call malloc/realloc.
> >>
> >> When that patch is fixed, you can use the cplus_demangle_v3_callback
> >> function to get a demangler that never calls malloc.
> >
> > AFAIK Gary is working on a fix, when that is fixed, with the following
> > patch libsanitizer (when using libbacktrace for symbolization) will not
> > use system malloc/realloc/free for the demangling at all.
> >
> > Tested on x86_64-linux (-m64/-m32).  Note that the changes for the 3 files
> > unfortunately will need to be applied upstream to compiler-rt, is that
> > possible?
> >
> > 2013-12-10  Jakub Jelinek  
> >
> > * sanitizer_common/sanitizer_symbolizer_libbacktrace.h
> > (LibbacktraceSymbolizer::Demangle): New declaration.
> > * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
> 
> sanitizer_symbolizer_posix_libcdep.cc is the file from upstream.
> If it gets any change in the GCC variant, I will not be able to do
> merges from upstream until the same code is applied upstream.

Sure, but we are nearing GCC 4.9 stage3 finish and really need to demangle
the libbacktrace provided output.  Has the compiler-rt situation been
cleared up?  Haven't seen any follow-ups after Chandler's reversion.
So, this change is meant to be temporary, with hope that in upstream this
will be resolved, either with the same patch or something similar.

Jakub

[PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS

2014-01-09 Thread Tom de Vries

On 25/12/13 14:02, Tom de Vries wrote:
> On 07-12-13 16:07, Tom de Vries wrote:
>> Richard,
>>
>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
>> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
>> address the issue that $6 is sometimes used in split calls.
>>
>> Build and reg-tested on MIPS.
>>
>> OK for stage1?
>>
> 

Richard,

Ping.

This patch is the only part of -fuse-caller-save that still needs approval.

> This patch was submitted here ( 
> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00771.html ) and is required for 
> the -fuse-caller-save optimization which was submitted here ( 
> http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
> 
> The patch fixes a correctness issue with -fuse-caller-save for MIPS.
> 
> OK for stage1?
> 

Thanks,
- Tom

Re: [PATCH] libsanitizer demangling using cp-demangle.c

2014-01-09 Thread Konstantin Serebryany

On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek  wrote:
> On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote:
>> There was a recent buggy patch to the demangler that added calls to
>> malloc and realloc (2013-10-25 Gary Benson ).
>> That patch must be fixed or reverted before the 4.9 release.  The main
>> code in the demangler must not call malloc/realloc.
>>
>> When that patch is fixed, you can use the cplus_demangle_v3_callback
>> function to get a demangler that never calls malloc.
>
> AFAIK Gary is working on a fix, when that is fixed, with the following
> patch libsanitizer (when using libbacktrace for symbolization) will not
> use system malloc/realloc/free for the demangling at all.
>
> Tested on x86_64-linux (-m64/-m32).  Note that the changes for the 3 files
> unfortunately will need to be applied upstream to compiler-rt, is that
> possible?
>
> 2013-12-10  Jakub Jelinek  
>
> * sanitizer_common/sanitizer_symbolizer_libbacktrace.h
> (LibbacktraceSymbolizer::Demangle): New declaration.
> * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc

sanitizer_symbolizer_posix_libcdep.cc is the file from upstream.
If it gets any change in the GCC variant, I will not be able to do
merges from upstream until the same code is applied upstream.


> (POSIXSymbolizer::Demangle): Use libbacktrace_symbolizer_'s Demangle
> method if possible.
> * sanitizer_common/sanitizer_symbolizer_libbacktrace.cc: Include
> "demangle.h" if SANITIZE_CP_DEMANGLE is defined.
> (struct CplusV3DemangleData): New type.
> (CplusV3DemangleCallback, CplusV3Demangle): New functions.
> (SymbolizeCodePCInfoCallback, SymbolizeCodeCallback,
> SymbolizeDataCallback): Use CplusV3Demangle.
> * sanitizer_common/Makefile.am (AM_CXXFLAGS): Add
> -DSANITIZE_CP_DEMANGLE and -I $(top_srcdir)/../include.
> * libbacktrace/backtrace-rename.h (cplus_demangle_builtin_types,
> cplus_demangle_fill_ctor, cplus_demangle_fill_dtor,
> cplus_demangle_fill_extended_operator, cplus_demangle_fill_name,
> cplus_demangle_init_info, cplus_demangle_mangled_name,
> cplus_demangle_operators, cplus_demangle_print,
> cplus_demangle_print_callback, cplus_demangle_type, cplus_demangle_v3,
> cplus_demangle_v3_callback, is_gnu_v3_mangled_ctor,
> is_gnu_v3_mangled_dtor, java_demangle_v3, java_demangle_v3_callback):
> Define.
> (__asan_internal_memcmp, __asan_internal_strncmp): New prototypes.
> (memcmp, strncmp): Redefine.
> * libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add
> ../../libiberty/cp-demangle.c.
> * libbacktrace/bridge.cc (__asan_internal_memcmp,
> __asan_internal_strncmp): New functions.
> * sanitizer_common/Makefile.in: Regenerated.
> * libbacktrace/Makefile.in: Regenerated.
> * configure: Regenerated.
> * configure.ac: Regenerated.
> * config.h.in: Regenerated.
>
> --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h.jj  
>   2013-12-05 12:04:28.0 +0100
> +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h   
> 2013-12-10 11:01:26.777371566 +0100
> @@ -29,6 +29,8 @@ class LibbacktraceSymbolizer {
>
>bool SymbolizeData(DataInfo *info);
>
> +  const char *Demangle(const char *name);
> +
>   private:
>explicit LibbacktraceSymbolizer(void *state) : state_(state) {}
>
> --- libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc.jj
>   2013-12-05 12:04:28.0 +0100
> +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc 
> 2013-12-10 11:03:02.971876505 +0100
> @@ -513,6 +513,11 @@ class POSIXSymbolizer : public Symbolize
>  SymbolizerScope sym_scope(this);
>  if (internal_symbolizer_ != 0)
>return internal_symbolizer_->Demangle(name);
> +if (libbacktrace_symbolizer_ != 0) {
> +  const char *demangled = libbacktrace_symbolizer_->Demangle(name);
> +  if (demangled)
> +   return demangled;
> +}
>  return DemangleCXXABI(name);
>}
>
> --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc.jj 
>   2013-12-09 14:32:06.0 +0100
> +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc  
> 2013-12-10 11:48:19.803830291 +0100
> @@ -20,6 +20,10 @@
>  # include "backtrace-supported.h"
>  # if SANITIZER_POSIX && BACKTRACE_SUPPORTED && !BACKTRACE_USES_MALLOC
>  #  include "backtrace.h"
> +#  if SANITIZER_CP_DEMANGLE
> +#   undef ARRAY_SIZE
> +#   include "demangle.h"
> +#  endif
>  # else
>  #  define SANITIZER_LIBBACKTRACE 0
>  # endif
> @@ -31,6 +35,60 @@ namespace __sanitizer {
>
>  namespace {
>
> +#if SANITIZER_CP_DEMANGLE
> +struct CplusV3DemangleData {
> +  char *buf;
> +  uptr size, allocated;
> +};
> +
> +extern "C" {
> +static void CplusV3DemangleCallback(const char *s, size_t l, void *vdata) {
> +  Cp

Re: [Patch, Fortran, committed] PR 59612: iso_fortran_env segfaults with -fdump-fortran-original

2014-01-09 Thread Janus Weil

After noticing that the bug is actually a regression (see PR 57042):
Ok to backport to 4.7 and 4.8?

Cheers,
Janus



2013/12/29 Janus Weil :
> Hi all,
>
> I have just committed an obvious patch for a segfault with
> -fdump-fortran-original (plus a small documentation fix):
>
> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=206237
>
> Cheers,
> Janus

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-09 Thread Richard Biener

On Thu, 9 Jan 2014, Jakub Jelinek wrote:

> On Thu, Jan 09, 2014 at 02:13:39PM +0100, Richard Biener wrote:
> > On Thu, 9 Jan 2014, Jakub Jelinek wrote:
> > 
> > > On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > > > gimplify_modify_expr has:
> > > > > 
> > > > >   if (!gimple_call_noreturn_p (assign))
> > > > > gimple_call_set_lhs (assign, *to_p);
> > > > 
> > > > Ok, it seems to be too early then - move it after the folding.
> > > 
> > > That wouldn't help all the other early calls of fold_stmt though.
> > > E.g. lower_omp.  Plus, even in gimplify_modify_expr, doing it
> > > after fold_stmt would mean having to walk all stmts created by the folding?,
> > > check if they are calls (because a call can fold into nothing or something
> > > completely different).  Isn't it better then fold_stmt does that instead?
> > 
> > Hmm, maybe.  Not sure why we are this anal about requiring noreturn
> > calls not to have a LHS.  But if we require callers in SSA form
> > to update the stmt and properly cleanup the cfg if fold_stmt returns
> > true then it's reasonable to require at least "something" for callers
> > from non-SSA/CFG code.
> > 
> > That is, I don't like this special-casing.  If so, then rather
> > don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) 
> > ...) (or rather if we have a CFG - cfun && cfun->curr_properties & 
> > PROP_cfg).
> 
> But, isn't right now gimplification the only guaranteed folding of all
> stmts?

Actually gimplification doesn't fold all stmts either, but yes.

> I mean, other passes fold_stmt only if they propagate something into
> them, don't they?  Also, in most cases the call actually isn't noreturn,
> so stopping all the devirtualization just for the unlikely case doesn't look
> like a good idea to me.
> 
> Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition
> we can just drop the lhs always in that case, just doing what we do for
> __builtin_unreachable if lhs is SSA_NAME:
>   tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
>   tree def = get_or_create_ssa_default_def (cfun, var);
>   gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT);

That works for me.

Richard.

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 02:13:39PM +0100, Richard Biener wrote:
> On Thu, 9 Jan 2014, Jakub Jelinek wrote:
> 
> > On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > > gimplify_modify_expr has:
> > > > 
> > > >   if (!gimple_call_noreturn_p (assign))
> > > > gimple_call_set_lhs (assign, *to_p);
> > > 
> > > Ok, it seems to be too early then - move it after the folding.
> > 
> > That wouldn't help all the other early calls of fold_stmt though.
> > E.g. lower_omp.  Plus, even in gimplify_modify_expr, doing it
> > after fold_stmt would mean having to walk all stmts created by the folding?,
> > check if they are calls (because a call can fold into nothing or something
> > completely different).  Isn't it better then fold_stmt does that instead?
> 
> Hmm, maybe.  Not sure why we are this anal about requiring noreturn
> calls not to have a LHS.  But if we require callers in SSA form
> to update the stmt and properly cleanup the cfg if fold_stmt returns
> true then it's reasonable to require at least "something" for callers
> from non-SSA/CFG code.
> 
> That is, I don't like this special-casing.  If so, then rather
> don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) 
> ...) (or rather if we have a CFG - cfun && cfun->curr_properties & 
> PROP_cfg).

But, isn't right now gimplification the only guaranteed folding of all
stmts?  I mean, other passes fold_stmt only if they propagate something into
them, don't they?  Also, in most cases the call actually isn't noreturn,
so stopping all the devirtualization just for the unlikely case doesn't look
like a good idea to me.

Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition
we can just drop the lhs always in that case, just doing what we do for
__builtin_unreachable if lhs is SSA_NAME:
  tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
  tree def = get_or_create_ssa_default_def (cfun, var);
  gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT);

Jakub
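
For readers not familiar with the GIMPLE API, the proposal above (drop the
LHS of a call that turned out to be noreturn, and give the old LHS a
definition from a fresh default def right after the call) can be modelled
outside of GCC.  The sketch below is only a toy: Stmt and drop_noreturn_lhs
are hypothetical names, the "tmp_...(D)" string merely stands in for what
create_tmp_var plus get_or_create_ssa_default_def would produce, and none
of it is the code of the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy statement: either a call "lhs = callee ()" or a copy "lhs = rhs".
// Illustrative only; this is not GCC's gimple representation.
struct Stmt {
  enum Kind { Call, Assign } kind;
  std::string lhs;  // empty when the statement defines nothing
  std::string rhs;  // callee name for Call, source operand for Assign
  bool noreturn;    // meaningful for Call only
};

// If stmts[i] is a noreturn call that still has an LHS, drop the LHS and
// insert "lhs = <default def of a fresh temporary>" right after the call,
// so that the name keeps a definition.  Returns true if anything changed.
bool drop_noreturn_lhs(std::vector<Stmt> &stmts, std::size_t i) {
  assert(i < stmts.size());
  Stmt &call = stmts[i];
  if (call.kind != Stmt::Call || !call.noreturn || call.lhs.empty())
    return false;

  const std::string lhs = call.lhs;
  call.lhs.clear();

  // Assumption: "tmp_<lhs>(D)" stands in for the default definition of a
  // newly created temporary of the same type as the old LHS.
  const Stmt def = {Stmt::Assign, lhs, "tmp_" + lhs + "(D)", false};
  stmts.insert(stmts.begin() + static_cast<std::ptrdiff_t>(i) + 1, def);
  return true;
}
```

In this model the caller does not have to walk the statements produced by
folding: the fix-up happens at the point where the call is rewritten, which
is the argument made above for doing it inside fold_stmt.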

Improving mklog [was: Re: RFC Asan instrumentation control]

2014-01-09 Thread Tatiana Udalova

Hello,

I have reproduced the problem with mklog mentioned by Jakub:

> In my experience mklog is pretty much useless, e.g. if you add a new 
> function, it will list the previous function as being modified rather 
> than the new one, etc.

I focused on the function names that mklog takes from the headers of diff
hunks.

I hacked a simple addition to mklog which skips functions that the hunk
does not actually change when adding function names to the final ChangeLog.

The new mklog results were verified by a testsuite which compares reference
ChangeLogs of patches from gcc trunk with the logs generated by mklog.

The patched mklog considerably reduces the number of unchanged functions
listed in the ChangeLog.

Is it OK for trunk?

Thank you,
Tatiana Udalova




mklog_patch.diff
Description: Binary data
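
The attachment is not readable in the archive, so the following is only a
sketch of the general idea, not the submitted patch (mklog itself is a
script in contrib/, while this sketch uses C++ to match the rest of the
thread).  It assumes unified diffs of GNU-style C sources, where a function
name starts in column 0 and a "}" in column 0 ends the body.  The name
changed_functions and both regular expressions are hypothetical.  A
function named in a hunk header is reported only if a "+" or "-" line is
seen before that function ends; a function defined inside the hunk takes
over from there.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: list the functions a unified diff really touches,
// in order of first appearance.
std::vector<std::string> changed_functions(const std::string &diff) {
  // Captures the text printed after the second "@@" of a hunk header.
  static const std::regex hunk_re(R"(@@ .* @@ ?(.*))");
  // An identifier followed by "(" or " (": a crude stand-in for a definition.
  static const std::regex name_re(R"(([A-Za-z_][A-Za-z0-9_]*) ?\()");

  std::vector<std::string> result;
  std::string current;  // function the current line belongs to, "" if none
  std::istringstream in(diff);
  std::string line;
  std::smatch m;

  while (std::getline(in, line)) {
    if (line.empty())
      continue;
    // File headers start a new file.  (Heuristic: a removed line whose
    // text begins with "-- " would be misread as a header.)
    if (line.rfind("--- ", 0) == 0 || line.rfind("+++ ", 0) == 0) {
      current.clear();
      continue;
    }
    if (std::regex_match(line, m, hunk_re)) {
      const std::string header = m[1].str();
      current = std::regex_search(header, m, name_re) ? m[1].str()
                                                      : std::string();
      continue;
    }
    const char marker = line[0];
    if (marker != ' ' && marker != '+' && marker != '-')
      continue;  // "diff ...", "index ...", "\ No newline at end of file"

    const std::string body = line.substr(1);
    // A definition starting in column 0 switches the enclosing function.
    if (std::regex_search(body, m, name_re,
                          std::regex_constants::match_continuous))
      current = m[1].str();

    const bool changed = marker != ' ';
    if (changed && !current.empty() &&
        std::find(result.begin(), result.end(), current) == result.end())
      result.push_back(current);

    // A closing brace in column 0 ends the function body.
    if (!body.empty() && body[0] == '}')
      current.clear();
  }
  return result;
}
```

With this rule a hunk whose header names the previous function, but whose
only changes add a new function after that function's closing brace, yields
just the new function.  That is the case described in the quoted complaint.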

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-09 Thread Richard Biener

On Thu, 9 Jan 2014, Jakub Jelinek wrote:

> On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > gimplify_modify_expr has:
> > > 
> > >   if (!gimple_call_noreturn_p (assign))
> > > gimple_call_set_lhs (assign, *to_p);
> > 
> > Ok, it seems to be too early then - move it after the folding.
> 
> That wouldn't help all the other early calls of fold_stmt though.
> E.g. lower_omp.  Plus, even in gimplify_modify_expr, doing it
> after fold_stmt would mean having to walk all stmts created by the folding?,
> check if they are calls (because a call can fold into nothing or something
> completely different).  Isn't it better then fold_stmt does that instead?

Hmm, maybe.  Not sure why we are this anal about requiring noreturn
calls not to have a LHS.  But if we require callers in SSA form
to update the stmt and properly cleanup the cfg if fold_stmt returns
true then it's reasonable to require at least "something" for callers
from non-SSA/CFG code.

That is, I don't like this special-casing.  If so, then rather
don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) 
...) (or rather if we have a CFG - cfun && cfun->curr_properties & 
PROP_cfg).

Richard.

Re: [PATCH] Fix PR45586

2014-01-09 Thread Jakub Jelinek

On Thu, Jan 09, 2014 at 12:48:49PM +0100, Richard Biener wrote:
> *** gimple_canonical_types_compatible_p (tre
> *** 458,465 ****
>   return true;
>   
> /* Can't be the same type if they have different alignment, or mode.  */
> !   if (TYPE_ALIGN (t1) != TYPE_ALIGN (t2)
> !   || TYPE_MODE (t1) != TYPE_MODE (t2))
>   return false;
>   
> /* Non-aggregate types can be handled cheaply.  */
> --- 451,457 ----
>   return true;
>   
> /* Can't be the same type if they have different alignment, or mode.  */
> !   if (TYPE_MODE (t1) != TYPE_MODE (t2))

The comment needs updating then.

Jakub
