date:20140520

can_refer_decl_in_current_unit_p TLC

2014-05-20 Thread Jan Hubicka

Hi,
can_refer_decl_in_current_unit_p is a black magic that carefuly papers around
a mine field created by combination of C++ ABI and ELF visibility attributes.
As such it is not pretty and grown up into quite a maze of rules.

Well, I think it is time to clean it up: for example it does twice very similar
check - for statics and for comdats it checks that definition is still around,
but the code used is very different (one written before symtab, other more
modern). It is also wrong in some cases that apparently don't happen in 
pracitce;
for example it check presence of symtab node for static symbol but it does not
check that the definition is actually around to be output.

This patch cleans it up and should make it more robust and also less
conservative for WHOPR and COMDATs put into other partition. It also fixes
tree-optimization/60899 (and updates devirt-11.C testcase that tests the
wrong devirtualization happens), I will send release branch fix shortly.

Bootstrapped/regtested x86_64-linux, will comitted.

Honza

PR tree-optimization/60899
* gimple-fold.c (can_refer_decl_in_current_unit_p): Cleanup;
assume all static symbols will have definition wile parsing and
check the do have definition later in compilation; check that
variable referring symbol will be output before concluding that
reference is safe; be conservative for referring local statics;
be more precise about when comdat is output in other partition.

g++.dg/ipa/devirt-11.C: Update template.

Index: gimple-fold.c
===
--- gimple-fold.c   (revision 210653)
+++ gimple-fold.c   (working copy)
@@ -93,8 +93,12 @@ can_refer_decl_in_current_unit_p (tree d
   /* Static objects can be referred only if they was not optimized out yet.  */
   if (!TREE_PUBLIC (decl) && !DECL_EXTERNAL (decl))
 {
+  /* Before we start optimizing unreachable code we can be sure all
+static objects are defined.  */
+  if (cgraph_function_flags_ready)
+   return true;
   snode = symtab_get_node (decl);
-  if (!snode)
+  if (!snode || !snode->definition)
return false;
   node = dyn_cast  (snode);
   return !node || !node->global.inlined_to;
@@ -102,10 +106,12 @@ can_refer_decl_in_current_unit_p (tree d
 
   /* We will later output the initializer, so we can refer to it.
  So we are concerned only when DECL comes from initializer of
- external var.  */
+ external var or var that has been optimized out.  */
   if (!from_decl
   || TREE_CODE (from_decl) != VAR_DECL
-  || !DECL_EXTERNAL (from_decl)
+  || (!DECL_EXTERNAL (from_decl)
+ && (vnode = varpool_get_node (from_decl)) != NULL
+ && vnode->definition)
   || (flag_ltrans
  && symtab_get_node (from_decl)->in_other_partition))
 return true;
@@ -122,9 +128,9 @@ can_refer_decl_in_current_unit_p (tree d
  reference imply need to include function body in the curren tunit.  */
   if (TREE_PUBLIC (decl) && !DECL_COMDAT (decl))
 return true;
-  /* We are not at ltrans stage; so don't worry about WHOPR.
- Also when still gimplifying all referred comdat functions will be
- produced.
+  /* We have COMDAT.  We are going to check if we still have definition
+ or if the definition is going to be output in other partition.
+ Bypass this when gimplifying; all needed functions will be produced.
 
  As observed in PR20991 for already optimized out comdat virtual functions
  it may be tempting to not necessarily give up because the copy will be
@@ -133,35 +139,17 @@ can_refer_decl_in_current_unit_p (tree d
  units where they are used and when the other unit was compiled with LTO
  it is possible that vtable was kept public while the function itself
  was privatized. */
-  if (!flag_ltrans && (!DECL_COMDAT (decl) || !cgraph_function_flags_ready))
+  if (!cgraph_function_flags_ready)
 return true;
 
-  /* OK we are seeing either COMDAT or static variable.  In this case we must
- check that the definition is still around so we can refer it.  */
-  if (TREE_CODE (decl) == FUNCTION_DECL)
-{
-  node = cgraph_get_node (decl);
-  /* Check that we still have function body and that we didn't took
- the decision to eliminate offline copy of the function yet.
- The second is important when devirtualization happens during final
- compilation stage when making a new reference no longer makes callee
- to be compiled.  */
-  if (!node || !node->definition || node->global.inlined_to)
-   {
- gcc_checking_assert (!TREE_ASM_WRITTEN (decl));
- return false;
-   }
-}
-  else if (TREE_CODE (decl) == VAR_DECL)
-{
-  vnode = varpool_get_node (decl);
-  if (!vnode || !vnode->definition)
-   {
- gcc_checking_assert (!TREE_ASM_WRITTEN (decl));
- return false;
-   }
-}

Silence devirt-11.C testcase on GCC 4.9

2014-05-20 Thread Jan Hubicka

Hi,
devirt-11.C is somewhat fragile testcase that originally was testing a specific 
series of events
in inliner that are no longer happening anyway. It tests for a specific number 
of devirutalization that
depends on several aspects.  It seems more practical to relax this check, 
because it is kind of
pointless.

Regtested and commited to gcc-4.9 branch.

Honza

Index: g++.dg/ipa/devirt-11.C
===
--- g++.dg/ipa/devirt-11.C  (revision 210672)
+++ g++.dg/ipa/devirt-11.C  (working copy)
@@ -45,5 +45,5 @@ bar ()
 /* While inlining function called once we should devirtualize a new call to fn2
and two to fn3. While doing so the new symbol for fn2 needs to be
introduced.  */
-/* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known 
target" 1 "inline"  } } */
+/* { dg-final { scan-ipa-dump "Discovered a virtual call to a known target" 
"inline"  } } */
 /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: ChangeLog
===
--- ChangeLog   (revision 210672)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2014-05-18  Jan Hubicka  
+
+   PR middle-end/58094
+   * g++.dg/ipa/devirt-11.C: Be lax about number of devirtualizations.
+
 2014-05-18  Eric Botcazou  
 
* gnat.dg/enum3.adb: New test.

Fix AIX bootstrap failure

2014-05-20 Thread Jan Hubicka

Hi,
on AIX GCC 4.9 as well as mainline dies on callgraph corruption problem.  THis 
is caused
by ipa-inline walking node that is removed by ipa-inline-transform (because it 
is a local
alias that become unreachable).
This patch solves it by adding a hook to inform user of inline_call when this 
happen.

Bootstrapped ppc-aix, bootstrapped/regtested x86_64-linux, commited to mainline 
and branch.

PR bootstrap/60984
* ipa-inline-transform.c (inline_call): Use add CALLEE_REMOVED 
parameter.
* ipa-inline.c (inline_to_all_callers): If callee was removed; return.
(ipa_inline): Loop inline_to_all_callers until no more aliases are 
removed.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 210624)
+++ ipa-inline-transform.c  (working copy)
@@ -214,6 +214,7 @@
it is NULL. If UPDATE_OVERALL_SUMMARY is false, do not bother to recompute 
overall
size of caller after inlining. Caller is required to eventually do it via
inline_update_overall_summary.
+   If callee_removed is non-NULL, set it to true if we removed callee node.
 
Return true iff any new callgraph edges were discovered as a
result of inlining.  */
@@ -221,7 +222,8 @@
 bool
 inline_call (struct cgraph_edge *e, bool update_original,
 vec *new_edges,
-int *overall_size, bool update_overall_summary)
+int *overall_size, bool update_overall_summary,
+bool *callee_removed)
 {
   int old_size = 0, new_size = 0;
   struct cgraph_node *to = NULL;
@@ -260,6 +262,8 @@
{
  next_alias = cgraph_alias_target (alias);
  cgraph_remove_node (alias);
+ if (callee_removed)
+   *callee_removed = true;
  alias = next_alias;
}
  else
Index: ipa-inline.c
===
--- ipa-inline.c(revision 210624)
+++ ipa-inline.c(working copy)
@@ -1971,6 +1971,8 @@
 inline_to_all_callers (struct cgraph_node *node, void *data)
 {
   int *num_calls = (int *)data;
+  bool callee_removed = false;
+
   while (node->callers && !node->global.inlined_to)
 {
   struct cgraph_node *caller = node->callers->caller;
@@ -1987,7 +1989,7 @@
   inline_summary (node->callers->caller)->size);
}
 
-  inline_call (node->callers, true, NULL, NULL, true);
+  inline_call (node->callers, true, NULL, NULL, true, &callee_removed);
   if (dump_file)
fprintf (dump_file,
 " Inlined into %s which now has %i size\n",
@@ -1997,8 +1999,10 @@
{
  if (dump_file)
fprintf (dump_file, "New calls found; giving up.\n");
- return true;
+ return callee_removed;
}
+  if (callee_removed)
+   return true;
 }
   return false;
 }
@@ -2244,8 +2248,9 @@
  int num_calls = 0;
  cgraph_for_node_and_aliases (node, sum_callers,
   &num_calls, true);
- cgraph_for_node_and_aliases (node, inline_to_all_callers,
-  &num_calls, true);
+ while (cgraph_for_node_and_aliases (node, inline_to_all_callers,
+ &num_calls, true))
+   ;
  remove_functions = true;
}
}
Index: ipa-inline.h
===
--- ipa-inline.h(revision 210624)
+++ ipa-inline.h(working copy)
@@ -236,7 +236,8 @@
 bool speculation_useful_p (struct cgraph_edge *e, bool anticipate_inlining);
 
 /* In ipa-inline-transform.c  */
-bool inline_call (struct cgraph_edge *, bool, vec *, int *, 
bool);
+bool inline_call (struct cgraph_edge *, bool, vec *, int *, 
bool,
+ bool *callee_removed = NULL);
 unsigned int inline_transform (struct cgraph_node *);
 void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *,
  int freq_scale);

Re: [PATCH] Install sanitizer public headers (fix for PR sanitizer/61100)

2014-05-20 Thread Yury Gribov


> The new test fails on x86_64 Ubuntu 12.04 native build:

Yes, this should be fixed once kcc's patches get into trunk 
(https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00756.html). Current 
asan_interface.h is indeed C++-only.


-Y

Re: Add flag to optionally ignore ELF interposition

2014-05-20 Thread Jan Hubicka

> On Tue, 20 May 2014, Jan Hubicka wrote:
> 
> > Hi,
> > as disucssed some time ago, our assumption that every symbol of shared 
> > library can
> > be interposed at runtime is expensive and prevents a lot of useful 
> > optimizations,
> > including inlining or IPA propagation.
> > 
> > While this is useful feature, it is rather incommon to use it for bigger C++
> > projects, like firefox and at least clang seems to ignore the ELF 
> > interposition
> > rules and always inline/propagate. This patch adds flag to control the 
> > behaviour.
> > Symbols explicitly delcared WEAK are still considered as interposable.
> 
> I could see a use for an option listing the symbols that can be 
> interposed.  (For example, glibc supports interposition of a limited 
> number of malloc-related symbols, but if it were made to support building 
> with LTO in future that it would make sense to be able to optimize calls 
> to most other functions.)

I tought glibc uses handcoded local aliases for all its exported symbols except
for those where interposition is allowed.
But yes, this mechanism does kind of similar thing - in addition to enabling 
optimizations
ipa.c knows how to create local aliases for those symbols. Compiling:
$ cat ~/t.c
__attribute__
((noinline))
t()
{
  printf ("test\n");
}
q()
{
  t();
}
$ ./xgcc -B ./ -O2 ~/t.c -fno-semantic-interposition -S -fPIC

will get you:
.sett.localalias.0,t
.globl  q
.type   q, @function
q:
xorl%eax, %eax
jmp t.localalias.0

which is sort of what many libraries does by hand.

I wonder if this can't be best modeled as yet another visibility setting.  I.e. 
having visibility=default with
interposition assumed and visibility=default-no-semantic-interposition with 
visibility not allowed.
This can be both for -fvisibility flag and for attributes?

Honza
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com

Re: Eliminate write-only variables

2014-05-20 Thread Jan Hubicka

> 
> Unfortunately, this commit has caused the following ICE for me when
> LTO building 471.omnetpp from SPEC 2006 or Firefox (but not libxul,
> something that gets built earlier):
> 
> lto1: internal compiler error: in gimple_get_virt_method_for_vtable,
> at gimple-fold.c:3276
> 0x7437a3 gimple_get_virt_method_for_vtable(long, tree_node*, unsigned
> long, bool*)  
> /home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3276  
> 0x743993 gimple_get_virt_method_for_binfo(long, tree_node*, bool*)
> /home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3377  
> 0x7913f2 possible_polymorphic_call_targets(tree_node*, long,
> ipa_polymorphic_call_context, bool*, void**, int*)  
> /home/mjambor/gcc/bisect/src/gcc/ipa-devirt.c:1697  
> 0x7b73a9 possible_polymorphic_call_targets  
> /home/mjambor/gcc/bisect/src/gcc/ipa-utils.h:121  
> 0x7b73a9 walk_polymorphic_call_targets  
> /home/mjambor/gcc/bisect/src/gcc/ipa.c:177  
> 0x7b73a9 symtab_remove_unreachable_nodes(bool, _IO_FILE*)  
> /home/mjambor/gcc/bisect/src/gcc/ipa.c:407  
> 0x86bf47 execute_todo  
> /home/mjambor/gcc/bisect/src/gcc/passes.c:1843  
> Please submit a full bug report...
>  
> I compile omnetpp with -Ofast -g -flto=8 -fwhole-program
> -funroll-loops -fpeel-loops -march=native -mtune=native
> 
> I'm afraid I won't be able to prepare a more reduced testcase very
> soon.

Actually I think where is a problem: when the variable is completely dead
(that is no writes, reads and address taken), we still set the writeonly
flag and clear DECL_INITIAL.  This may happen for vtables still in use
by the type based devirt machinery. The following patch should fix it.

Bootstrapped/regtested and comitted to mainline. 
It also fixes the debug output issue you pointed out.

Index: ipa.c
===
--- ipa.c   (revision 210653)
+++ ipa.c   (working copy)
@@ -730,7 +730,7 @@ ipa_discover_readonly_nonaddressable_var
if (!address_taken)
  {
if (TREE_ADDRESSABLE (vnode->decl) && dump_file)
- fprintf (dump_file, " %s (addressable)", vnode->name ());
+ fprintf (dump_file, " %s (non-addressable)", vnode->name ());
varpool_for_node_and_aliases (vnode, clear_addressable_bit, NULL, 
true);
  }
if (!address_taken && !written
@@ -743,7 +743,7 @@ ipa_discover_readonly_nonaddressable_var
  fprintf (dump_file, " %s (read-only)", vnode->name ());
varpool_for_node_and_aliases (vnode, set_readonly_bit, NULL, true);
  }
-   if (!vnode->writeonly && !read && !address_taken)
+   if (!vnode->writeonly && !read && !address_taken && written)
  {
if (dump_file)
  fprintf (dump_file, " %s (write-only)", vnode->name ());

Re: [PATCH] Fix ARM NAN fraction bits

2014-05-20 Thread Maciej W. Rozycki

On Tue, 20 May 2014, Richard Earnshaw wrote:

> >>> If it's not broken in 4.8.2 but broken on the branch head then it's OK 
> >>> for the branch.
> >>
> >>  I thought I'd double-check with you that it is fine to push this change 
> >> to trunk first.  OK to apply?
> > 
> > Of course it should go to trunk (and 4.9 branch) first, but I'm not
> > the one to approve that.
> > 
> 
> I can...  Based on Joseph's review, OK.

 I have now applied it to trunk and backported to 4.9 and 4.8.  Thank you 
both for the review.

  Maciej

Re: [PATCH] Fix ICE in rtl-optimization/PR61220, PR61225

2014-05-20 Thread Zhenqiang Chen

On 21 May 2014 00:54, Jeff Law  wrote:
> On 05/20/14 01:11, Zhenqiang Chen wrote:
>>
>> Hi,
>>
>> The patch fix ICE issue triggered by shrink-wrapping enhancement.
>>
>> Bootstrap and no make check regression on X86-64.
>>
>> OK for trunk?
>>
>> Thanks!
>> -Zhenqiang
>>
>>
>> 2014-05-20  Zhenqiang Chen  
>>
>>  PR rtl-optimization/61220
>>  Part of PR rtl-optimization/61225
>>  * shrink-wrap.c (move_insn_for_shrink_wrap): Skip SP and FP
>> adjustment
>>  insn; skip split_edge for a block without only on successor.
>>
>> testsuite/ChangeLog:
>> 2014-05-20  Zhenqiang Chen  
>>
>>  * gcc.dg/pr61220.c: New test.
>>  * gcc.dg/shrink-wrap-loop.c: Disable for x86_64 -m32 mode.
>>
> [ ... ]
>
>
>
>
>> /* Make sure that the source register isn't defined later in BB.  */
>> @@ -204,6 +209,10 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
>> /* Create a new basic block on the edge.  */
>> if (EDGE_COUNT (next_block->preds) == 2)
>>   {
>> +  /* split_edge for a block with only one successor is meaningless.
>> */
>> +  if (EDGE_COUNT (bb->succs) == 1)
>> +   return false;
>> +
>> next_block = split_edge (live_edge);
>
> So don't you just want to use:
>
> if (EDGE_CRITICAL_P (live_edge))
>
> to replace the two explicit checks of EDGE_COUNT on the preds & succs?

Thanks for the comment.

In the code, there are 4 combinations of EDGE_COUNT: <1, 1>, <1, 2>,
<2, 1> and <2, 2>.  <2, 1> is "illegal". <2, 2> is legal, but need
split_edge. <1, *> can bypass the second check. EDGE_CRITICAL_P can
only distinguish <2, 2> from others.

So I think two explicit checks is more efficient than EDGE_CRITICAL_P.

Thanks!
-Zhenqiang

Re: [PATCH] Install sanitizer public headers (fix for PR sanitizer/61100)

2014-05-20 Thread Maxim Kuvyrkov

On May 14, 2014, at 4:13 AM, Yury Gribov  wrote:

> Hi,
> 
> Asan and Tsan allow sanitized applications to tweak runtime behavior via API 
> defined in headers in libsanitizer/include/sanitizer. This patch adds 
> installation code for these headers and a small test.
> 
> Bootstrapped and regtested on x64.
> 
> -Y
> 

The new test fails on x86_64 Ubuntu 12.04 native build:

/home/maxim-kuvyrkov/build/gcc-native-e-m-s-f/gcc/xgcc 
-B/home/maxim-kuvyrkov/build/gcc-native-e-m-s-f/gcc/ 
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/c-c++-common/asan/asan-interface-1.c
 
-B/home/maxim-kuvyrkov/build/gcc-native-e-m-s-f/x86_64-unknown-linux-gnu/./libsanitizer/
 
-B/home/maxim-kuvyrkov/build/gcc-native-e-m-s-f/x86_64-unknown-linux-gnu/./libsanitizer/asan/
 
-L/home/maxim-kuvyrkov/build/gcc-native-e-m-s-f/x86_64-unknown-linux-gnu/./libsanitizer/asan/.libs
 -fsanitize=address -g 
-I/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/../../libsanitizer/include
 -fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -lm -o 
./asan-interface-1.exe
In file included from 
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/c-c++-common/asan/asan-interface-1.c:5:0:
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/../../libsanitizer/include/sanitizer/asan_interface.h:53:3:
 error: unknown type name 'bool'
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/../../libsanitizer/include/sanitizer/asan_interface.h:66:40:
 error: unknown type name 'bool'
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/../../libsanitizer/include/sanitizer/asan_interface.h:88:3:
 error: unknown type name 'bool'
/home/maxim-kuvyrkov/build/src/gcc-e-m-s-f/gcc/testsuite/../../libsanitizer/include/sanitizer/asan_interface.h:98:3:
 error: unknown type name 'bool'
compiler exited with status 1

--
Maxim Kuvyrkov
www.linaro.org

[PATCH] proposed fix for bug # 61144

2014-05-20 Thread Rich Felker

Bug # 61144 is a regression in 4.9.0 that breaks building of musl libc
due to aggressive and semantically-incorrect constant folding of weak
aliases. The attached patch seems to fix the issue. A weak alias
should never be a candidate for constant folding because it may always
be replaced by a strong definition from another translation unit.

For details see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61144

I do not have a copyright assignment on file but this patch should be
sufficiently trivial not to require it.

Rich
diff --git a/gcc/varpool.c b/gcc/varpool.c
index b426757..905047e 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -195,6 +195,8 @@ ctor_for_folding (tree decl)
 {
   gcc_assert (!DECL_INITIAL (decl)
  || DECL_INITIAL (decl) == error_mark_node);
+  if (DECL_WEAK (decl))
+return error_mark_node;
   if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
{
  node = varpool_alias_target (node);

[patch, lto] add testcase for PR60179

2014-05-20 Thread Sandra Loosemore

One of the consequences of the (now-fixed) bug in PR60179 is that Nios 
II code using target pragmas to specify custom instructions failed to 
generate those instructions with -flto.  We came up with this test case 
for it, but there didn't seem to be an existing hook to scan LTO output. 
 Is this OK, or is there a better way to do it?


-Sandra


2014-05-20  Cesar Philippidis  
Sandra Loosemore  

gcc/testsuite/
* lib/scanasm.exp (scan-lto-assembler): New procedure.
* gcc.target/nios2/custom-fp-lto.c: New test.

Index: gcc/testsuite/lib/scanasm.exp
===
--- gcc/testsuite/lib/scanasm.exp	(revision 210659)
+++ gcc/testsuite/lib/scanasm.exp	(working copy)
@@ -500,3 +500,13 @@ proc dg-function-on-line { args } {
 
 append final-code "$cmd\n"
 }
+
+# Look for a pattern in the .exe.ltrans0.s file produced by the
+# compiler.  See dg-scan for details.
+
+proc scan-lto-assembler { args } {
+set testcase [testname-for-summary]
+set output_file "[file rootname [file tail $testcase]].exe.ltrans0.s"
+verbose "output_file: $output_file"
+dg-scan "scan-assembler" 1 $testcase $output_file $args
+}
Index: gcc/testsuite/gcc.target/nios2/custom-fp-lto.c
===
--- gcc/testsuite/gcc.target/nios2/custom-fp-lto.c	(revision 0)
+++ gcc/testsuite/gcc.target/nios2/custom-fp-lto.c	(revision 0)
@@ -0,0 +1,29 @@
+/* Test specification of custom instructions via pragma in the presence
+   of LTO.  This test case formerly failed due to PR60179.  */
+
+/* { dg-do link } */
+/* { dg-require-effective-target lto } */
+/* { dg-options "-O1 -flto -save-temps" } */
+
+/* -O1 in the options is significant.  Without it FP operations may not be
+   optimized to custom instructions.  */
+
+#include  
+#include 
+
+#pragma GCC target ("custom-fabss=224")
+
+float
+custom_fp (float operand_a)
+{
+  return fabsf (operand_a);
+}
+
+int
+main (int argc, char *argv[])
+{
+  return custom_fp ((float)argc) > 1.0;
+}
+
+/* { dg-final { scan-lto-assembler "custom\\t224, " } } */
+/* { dg-final { cleanup-saved-temps } } */

RE: [PATCH] Fix PR54733 Optimize endian independent load/store

2014-05-20 Thread Thomas Preud'homme

> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> 
> I'll send the new patch as soon as all the tests are done.

Here you are. I also simplified the tests a bit having the reference as a 
function
parameter instead of a local one.

Updated ChangeLogs:

*** gcc/ChangeLog ***

2014-05-20  Thomas Preud'homme  

PR tree-optimization/54733
* tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure.
(CMPNOP): Define.
(find_bswap_or_nop_load): New.
(find_bswap_1): Renamed to ...
(find_bswap_or_nop_1): This. Also add support for memory source.
(find_bswap): Renamed to ...
(find_bswap_or_nop): This. Also add support for memory source and
detection of bitwise operations equivalent to load in host endianness.
(execute_optimize_bswap): Likewise. Also move its leading comment back
in place and split statement transformation into ...
(bswap_replace): This.

*** gcc/testsuite/ChangeLog ***

2014-05-20  Thomas Preud'homme  

PR tree-optimization/54733
* gcc.dg/optimize-bswapdi-3.c: New test to check extension of bswap
optimization to support memory sources and bitwise operations
equivalent to load in host endianness.
* gcc.dg/optimize-bswaphi-1.c: Likewise.
* gcc.dg/optimize-bswapsi-2.c: Likewise.
* gcc.c-torture/execute/bswap-2.c: Likewise. 

Best regards,

Thomas

gcc32rm-84.6.0.diff
Description: Binary data

Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard

2014-05-20 Thread David Edelsohn

On Tue, May 20, 2014 at 5:46 PM, Pat Haugen  wrote:
> On 05/19/2014 11:27 PM, Maxim Kuvyrkov wrote:
>>
>> Changes to ia64 and rs6000 are mostly mechanical to update hook return
>> values, but port maintainers may wish to review those.
>
> I'm not a maintainer, but the rs6000 changes look good to me.

The rs6000 changes are okay with me too.

Thanks, David

Re: patch8.diff updated Was: Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-05-20 Thread Samuel Thibault

Svante Signell, le Fri 16 May 2014 10:03:05 +0200, a écrit :
> is used in gcc-4.9-4.9.0/src/libgo/go/net/fd_unix.go:
> func dupCloseOnExec(fd int) (newfd int, err error) {
> if atomic.LoadInt32(&tryDupCloexec) == 1 && syscall.F_DUPFD_CLOEXEC!=0 {
> r0, _, e1 := syscall.Syscall(syscall.SYS_FCNTL, uintptr(fd),
> syscall.F_DUPFD_CLOEXEC, 0)

That code can not work as it is, fcntl is not a system call on
GNU/Hurd. Why isn't gccgo just using the C fcntl function?  That one
will just work and be portable.

> +# Special treatment of EWOULDBLOCK for GNU/Hurd
> +# /usr/include/bits/errno.h: #define EWOULDBLOCK EAGAIN
> +if egrep 'define EWOULDBLOCK EAGAIN' gen-sysinfo.go > /dev/null 2>&1; then
> +  egrep '^const EWOULDBLOCK = Errno(_EWOULDBLOCK)' ${OUT} | \
> +sed -i.bak -e 's/_EWOULDBLOCK/_EAGAIN/' ${OUT}

I don't understand why you both pass the output of egrep to sed, and you
give the -i option to sed. AIUI, the
egrep '^const EWOULDBLOCK = Errno(_EWOULDBLOCK)'
part is completely unused, so you can just drop it.

> @@ -528,6 +540,12 @@
> +# Special treatment of st_dev for GNU/Hurd
> +# /usr/include/i386-gnu/bits/stat.h: #define st_dev st_fsid
> +if grep 'define st_dev st_fsid' gen-sysinfo.go > /dev/null 2>&1; then
> +  egrep '^type _stat ' gen-sysinfo.go > /dev/null 2>&1| \
> +  sed -i.bak -e 's/; st_fsid/; st_dev/' gen-sysinfo.go
> +fi

The same remark about egrep | sed -i applies here.

And anyway, why not simply using the very first patch you proposed,
which was:

@@ -536,6 +548,7 @@
 fi | sed -e 's/type _stat64/type Stat_t/' \
  -e 's/type _stat/type Stat_t/' \  
  -e 's/st_dev/Dev/' \
+ -e 's/st_fsid/Dev/' \
  -e 's/st_ino/Ino/g' \
  -e 's/st_nlink/Nlink/' \
  -e 's/st_mode/Mode/' \

which I said several times that it should be completely fine.

Samuel

RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-20 Thread Matthew Fortune

Hi Richard,

Apologies for appearing to be pushy on this. I appreciate all of this work
is done in your spare time, I doubt I could manage that. I missed the
important part which was to ask if you had thoughts on the user visible
parts of the patch but you have been through the code now anyway...

Richard Sandiford  writes:
> Matthew Fortune  writes:
> > *) Dwarf debug for 64-bit values in floating point values for FPXX can't
> >be strictly correct for both 32-bit and 64-bit registers but opts to
> >describe one 64-bit register as that is what the FPXX ABI is
> emulating.
> >I have not yet checked what exactly happens in GDB when confronted
> with
> >this and 32-bit registers. This also impacts frame information
> described
> >via mips_save_reg and mips_restore_reg. Advice on this would be
> >appreciated.
> 
> I'm not sure what's best either.  Clearly it's something that needs
> to be spelled out in the ABI, but I can imagine it would be dictated
> by what consumers like the unwinder and gdb find easiest to handle.

I'll be looking into this in detail this week.

> > *) Because GCC can be built to have mfpxx or mfp64 as the default option
> >the ASM_SPEC has to handle these specially such that they are not
> >passed in conjunction with -msingle-float. Depending on how all this
> >option handling pans out then this may also need to account for
> >msoft-float as well. It is an error to have -msoft-float and -mfp64 in
> >the assembler.
> 
> The assembler and GCC shouldn't treat the options differently though.
> Either it should be OK for both or neither.

I wasn't sure if you were applying this rule to options set by --with- at
configure time as well as user supplied options. If I move this logic
to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
-msingle-float is given then that will fix the issue too. Is that OK?

> > @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info,
> const CUMULATIVE_ARGS *cum,
> >  || SCALAR_FLOAT_TYPE_P (type)
> >  || VECTOR_FLOAT_TYPE_P (type))
> >  && (GET_MODE_CLASS (mode) == MODE_FLOAT
> > -|| mode == V2SFmode)
> > +|| (TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode))
> >  && GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
> >break;
> 
> This looks odd.  We shouldn't have V2SF values if there's no ISA support
> for them.

True. This is a safety measure against future vector support. I/We wish to
completely restrict this ABI extension to the paired single-float
hardware feature. This detail is easy to forget, I would like to keep
the check but you get final call of course.
 
> > @@ -5636,7 +5636,7 @@ mips_return_fpr_pair (enum machine_mode mode,
> >  {
> >int inc;
> >
> > -  inc = (TARGET_NEWABI ? 2 : MAX_FPRS_PER_FMT);
> > +  inc = ((TARGET_NEWABI || mips_abi == ABI_32) ? 2 : MAX_FPRS_PER_FMT);
> 
> Formatting nit: no extra brackets here.

Is this a no brackets at all case, or brackets on the condition? There are
both styles in GCC but it is difficult to tell which is most common/correct.

> > @@ -6508,13 +6508,27 @@ mips_output_64bit_xfer (char direction, unsigned
> int gpreg, unsigned int fpreg)
> >if (TARGET_64BIT)
> >  fprintf (asm_out_file, "\tdm%cc1\t%s,%s\n", direction,
> >  reg_names[gpreg], reg_names[fpreg]);
> > -  else if (TARGET_FLOAT64)
> > +  else if (ISA_HAS_MXHC1)
> >  {
> >fprintf (asm_out_file, "\tm%cc1\t%s,%s\n", direction,
> >reg_names[gpreg + TARGET_BIG_ENDIAN], reg_names[fpreg]);
> >fprintf (asm_out_file, "\tm%chc1\t%s,%s\n", direction,
> >reg_names[gpreg + TARGET_LITTLE_ENDIAN], reg_names[fpreg]);
> >  }
> > +  else if (TARGET_FLOATXX && direction == 't')
> > +{
> > +  /* Use the argument save area to move via memory.  */
> > +  fprintf (asm_out_file, "\tsw\t%s,0($sp)\n", reg_names[gpreg]);
> > +  fprintf (asm_out_file, "\tsw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> > +  fprintf (asm_out_file, "\tldc1\t%s,0($sp)\n", reg_names[fpreg]);
> > +}
> > +  else if (TARGET_FLOATXX && direction == 'f')
> > +{
> > +  /* Use the argument save area to move via memory.  */
> > +  fprintf (asm_out_file, "\tsdc1\t%s,0($sp)\n", reg_names[fpreg]);
> > +  fprintf (asm_out_file, "\tlw\t%s,0($sp)\n", reg_names[gpreg]);
> > +  fprintf (asm_out_file, "\tlw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> > +}
> 
> The argument save area might be in use.  E.g. if an argument register
> gets spilled, we'll generally try to spill it to the save area rather
> than create a new stack slot for it.
> 
> This case should always be handled via SECONDARY_MEMORY_NEEDED.

It is handled via SECONDARY_MEMORY_NEEDED for normal code generation. This
function needs a better name! It is solely used as part of generating a
mips16 stub and as such the argument save area is fair game as far as I can
see,

Re: Ping2: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration.

2014-05-20 Thread Eric Botcazou

> Yes, but why is that better than providing them directly?  The latter
> would seem to work better with non-C-family languages like Ada.

That's correct, enumeration types don't have base types in Ada, i.e. they 
are their own base types.  But if the attributes cannot be expressed in DWARF, 
then synthesizing a base type on the fly would probably be OK.

-- 
Eric Botcazou

Re: [PATCH] Implement -fsanitize=float-cast-overflow (take 2)

2014-05-20 Thread Joseph S. Myers

On Tue, 20 May 2014, Marek Polacek wrote:

> * is missing tests for long doubles/-mlong-double-128,

Also missing tests for float - as far as I can see, only double is tested.  
Ideally all of float, double, long double, __float128 (where supported), 
__float80 (where supported) would be tested (the functionality supported 
for __fp16 (ARM) is a bit more restricted) - hopefully using some shared 
macros to avoid too much duplication between tests.

> * doesn't instrument _Decimal to integer conversions yet.

So the code

> +  else if (REAL_MODE_FORMAT (mode)->b == 10)
> +{
> +  /* For _Decimal128 up to 34 decimal digits, - sign,
> +  dot, e, exponent.  */

isn't actually being used yet?

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard

2014-05-20 Thread Pat Haugen


On 05/19/2014 11:27 PM, Maxim Kuvyrkov wrote:

Changes to ia64 and rs6000 are mostly mechanical to update hook return values, 
but port maintainers may wish to review those.

I'm not a maintainer, but the rs6000 changes look good to me.

-Pat

Re: Add flag to optionally ignore ELF interposition

2014-05-20 Thread Joseph S. Myers

On Tue, 20 May 2014, Jan Hubicka wrote:

> Hi,
> as disucssed some time ago, our assumption that every symbol of shared 
> library can
> be interposed at runtime is expensive and prevents a lot of useful 
> optimizations,
> including inlining or IPA propagation.
> 
> While this is useful feature, it is rather incommon to use it for bigger C++
> projects, like firefox and at least clang seems to ignore the ELF 
> interposition
> rules and always inline/propagate. This patch adds flag to control the 
> behaviour.
> Symbols explicitly delcared WEAK are still considered as interposable.

I could see a use for an option listing the symbols that can be 
interposed.  (For example, glibc supports interposition of a limited 
number of malloc-related symbols, but if it were made to support building 
with LTO in future that it would make sense to be able to optimize calls 
to most other functions.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFA: cache enabled attribute by insn code

2014-05-20 Thread Richard Sandiford

Jeff Law  writes:
> On 05/20/14 02:16, Richard Sandiford wrote:
>> get_attr_enabled was showing up high in a -O0 compile of fold-const.ii.
>> At the moment, extract_insn calls this function for every alternative
>> on each extraction, which can be expensive for instructions like
>> moves that have many alternatives.
>>
>> The attribute is only supposed to depend on the insn code and the
>> current target.  It isn't allowed to depend on the values of operands.
>> LRA already makes use of this to cache the enabled attributes based on code.
>>
>> This patch makes the caching compiler-wide.  It also converts the bool
>> array into a plain int bitfield, since at the moment we don't allow more
>> than 30 alternatives.  (It's possible we might want to increase it
>> beyond 32 at some point, but then we can change the type to uint64_t
>> instead.  Wanting to go beyond 64 would suggest deeper problems
>> somewhere. :-))
>>
>> The patch gives a consistent compile-time improvement of about ~3.5%
>> on the -O0 fold-const.ii testcase.
>>
>> There were a few instances of the construct:
>>
>>cl = NO_REGS;
>>for (ignore_p = false, curr_alt = 0;
>> (c = *constraints);
>> constraints += CONSTRAINT_LEN (c, constraints))
>>  if (c == '#' || !recog_data.alternative_enabled_p[curr_alt])
>>ignore_p = true;
>>  else if (c == ',')
>>{
>>  curr_alt++;
>>  ignore_p = false;
>>}
>>  else if (! ignore_p)
>>
>> This had the effect of ignoring all alternatives after the first
>> attribute-disabled one, since once we found a disabled alternative we'd
>> always enter the first arm of the "if" and never increment curr_alt.
>> I don't think that was intentional.
>>
>> Tested on x86_64-linux-gnu.  OK to install?
>>
>> I notice ira_setup_alts is using a HARD_REG_SET to store a mask
>> of alternatives.  If this patch is OK, I'll follow up with a patch
>> to use alternative_mask there too.
>>
>> Thanks,
>> Richard
>>
>>
>> gcc/
>>  * system.h (TEST_BIT): New macro.
>>  * recog.h (alternative_mask): New type.
>>  (ALL_ALTERNATIVES, ALTERNATIVE_BIT): New macros.
>>  (recog_data_d): Replace alternative_enabled_p array with
>>  enabled_alternatives.
>>  (target_recog): New structure.
>>  (default_target_recog, this_target_recog): Declare.
>>  (get_enabled_alternatives): Likewise.
>>  * recog.c (default_target_recog, this_target_recog): New variables.
>>  (get_enabled_alternatives): New function.
>>  (extract_insn): Use it.
>>  (preprocess_constraints, constrain_operands): Adjust for change to
>>  recog_data.
>>  * postreload.c (reload_cse_simplify_operands): Likewise.
>>  * reload.c (find_reloads): Likewise.
>>  * ira-costs.c (record_reg_classes): Likewise.
>>  * ira-lives.c (single_reg_class): Likewise.  Fix bug in which
>>  all alternatives after a disabled one would be skipped.
>>  (ira_implicitly_set_insn_hard_regs): Likewise.
>>  * ira.c (commutative_constraint_p): Likewise.
>>  (ira_setup_alts): Adjust for change to recog_data.
>>  * lra-int.h (lra_insn_recog_data): Replace alternative_enabled_p
>>  with enabled_alternatives.
>>  * lra.c (free_insn_recog_data): Update accordingly.
>>  (lra_update_insn_recog_data): Likewise.
>>  (lra_set_insn_recog_data): Likewise.  Use get_enabled_alternatives.
>>  * lra-constraints.c (process_alt_operands): Likewise.  Handle
>>  only_alternative as part of the enabled mask.
>>  * target-globals.h (this_target_recog): Declare.
>>  (target_globals): Add a recog field.
>>  (restore_target_globals): Restore this_target_recog.
>>  * target-globals.c: Include recog.h.
>>  (default_target_globals): Initialize recog field.
>>  (save_target_globals): Likewise.
> This is OK for the trunk (referring to the follow-up message which fixed 
> EWRONGPATCH.

Sorry, while working on the follow-up LRA patch, I realised I hadn't
accounted for target changes that happen directly via target_reinit
(rather than SWITCHABLE_TARGETS) and cases where reinit_regs is called
to change just the register information.  Both could potentially affect
the enabled attribute.

This version adds a recog_init function that clears the data if necessary.
There are no other changes from first time.  Is this still OK?

Thanks,
Richard


gcc/
* system.h (TEST_BIT): New macro.
* recog.h (alternative_mask): New type.
(ALL_ALTERNATIVES, ALTERNATIVE_BIT): New macros.
(recog_data_d): Replace alternative_enabled_p array with
enabled_alternatives.
(target_recog): New structure.
(default_target_recog, this_target_recog): Declare.
(get_enabled_alternatives, recog_init): Likewise.
* recog.c (default_target_recog, this_target_recog): New variables.
(get_enabled_alternatives): New function.
(extract_insn): Use it.
(recog_init): New function.
(preprocess_constrai

[committed] PR 61243: Missing copy of CROSSING_JUMP_P

2014-05-20 Thread Richard Sandiford

My CROSSING_JUMP_P patch didn't handle insns that were copied by
emit_copy_of_insn_after, which previously would copy REG_CROSSING_JUMP
notes by virtue of them being != REG_LABEL_OPERAND.

This patch adds the missing copy.  Tested with profiledbootstrap on
x86_64-linux-gnu.  Committed as obvious.

Thanks,
Richard


gcc/
PR rtl-optimization/61243
* emit-rtl.c (emit_copy_of_insn_after): Copy CROSSING_JUMP_P.

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2014-05-20 20:20:24.525295969 +0100
+++ gcc/emit-rtl.c  2014-05-20 22:18:18.894580150 +0100
@@ -6027,6 +6027,7 @@ emit_copy_of_insn_after (rtx insn, rtx a
 
 case JUMP_INSN:
   new_rtx = emit_jump_insn_after (copy_insn (PATTERN (insn)), after);
+  CROSSING_JUMP_P (new_rtx) = CROSSING_JUMP_P (insn);
   break;
 
 case DEBUG_INSN:

Re: [PATCH] Fix PR middle-end/61141

2014-05-20 Thread Jeff Law


On 05/19/14 15:20, John David Anglin wrote:

The problem compiling c-common.c is very hard to debug.  These routines
are called intensively and the ICE occurs after several million calls.
It takes a couple of hours of running under gdb to reach the failing
call to reset_insn_used_flags just counting calls up to ICE.
I'd suggest getting a suitable .i/.ii file and debugging it with a 
cross-debugger ;-)  Then again, if you've got a breakpoint that is 
triggering a million times, it may still take hours to hit when 
debugging with a cross.





The problem is more apparent with 64-bit compiler due to way PIC
register is saved and restored.

Yea, that makes sense.





In c#2 you indicate we can get this stuff when an insn in a delay slot
sequence is deleted.  Where/when are those insns being deleted?
Presumably the backends know enough to handle deleted insns in delay
slot sequences?!?  It's been a long time since I looked at that code
in detail, so if you already know it's safe, just say "it's safe" and
I'll believe you :-)  Otherwise a bit of digging may be needed.


I believe that the backend must handle the deleted insns in the delay
slot as there are are no compilation errors or regressions.
However, like you, I'm not 100% certain this done correctly.
I'm pretty sure we're getting this wrong in the backend.   In fact, the 
more I think about it, a NOTE_INSN_DELETED in a delay slot is just 
asking for all kinds of trouble.


In general, if we have an insn with a filled delay slot, then we will 
emit the two insns independently of each other (most of the time, there 
are exceptions).


A NOTE_INSN_DELETED in a delay slot still looks like a filled slot.  So 
the target code isn't going to emit a NOP or anything like that.  It's 
going to leave it up to the generic code to emit the insn with the delay 
slot and the delay slot insn itself (the NOTE_INSN_DELETED in this case).


Of course we don't output anything for a NOTE_INSN_DELETED.  So the net 
result is we, in essence, fill the delay slot with whatever random 
instruction happens to fall next in the insn chain.


Amazingly, nothing seems to be failing, but I've seen far worse bugs go 
unnoticed for a long time.


Sadly,I think we need to start digging deeper to find out what's 
deleting those insns and take corrective action.


Perhaps writing a little routine to peek at all the filled delay slots 
and squawk if it finds a NOTE_INSN_DELETED.  Then call it after each RTL 
pass that follows reorg in the pass manager.  That'd at least narrow it 
down to a pass that's mucking things up.



Something has changed.  Either these insns aren't being removed from RTL
chain as they were before or they are being deleted
at a much later stage.  The same code is present in 4.9 and the problem
doesn't seem to occur there.  Possibly, a regression search
would pin point this.

Probably just gone latent.



Still, we shouldn't call reset_insn_used_flags  or verify_insn_sharing
with a note.
Agreed, but at least in this instance, it's the symptom of a deeper 
problem I think.


jeff

Re: Eliminate write-only variables

2014-05-20 Thread Martin Jambor

Hi,

On Fri, May 16, 2014 at 07:25:59PM +0200, Jan Hubicka wrote:
> Hi,
> this patch adds code to remove write only static variables.  While analyzing
> effectivity of LTO on firefox, I noticed that surprisingly large part of
> binary's data segment is occupied by these.  Fixed thus.
> (this is quite trivial transformation, I just never considered it important
> enough to work on it).
> 
> The patch goes by marking write only variables in ipa.c (at same time we
> discover addressable flag) and also fixes handling of the flags for
> aliases. References to variables are then removed by fixup_cfg.
> As first cut, I only remove stores without side effects, so copies from
> volatile variables are preserved. I also kill LHS of function calls.
> I do not attempt to remove asm statements.  This means that some references
> may be left in the code and therefore the IPA code does not eliminate the
> referneces after discovering write only variable and instead it relies
> on dead variable elimination to do the job later.  Consequently not all write
> only variables are removed with WHOPR in the case the references ends up
> in different partitions. Something I can address incrementally.
> 
> Also I think dwarf2out should be updated to mark value of the write only
> variables as optimized out.  Jakub, can you help me with this?
> (I do not think it is valid to output the optimized out value of constructor)
> 
> Bootstrapped/regtested x86_64-linux, will commit it later today.
> 
> Honza
> 
>   * varpool.c (dump_varpool_node): Dump write-only flag.
>   * lto-cgraph.c (lto_output_varpool_node, input_varpool_node): Stream
>   write-only flag.
>   * tree-cfg.c (execute_fixup_cfg): Remove statements setting write-only 
> variables.
> 
> 
>   * gcc.c-torture/execute/20101011-1.c: Update testcase.
>   * gcc.dg/ira-shrinkwrap-prep-1.c: Update testcase.
>   * gcc.dg/tree-ssa/writeonly.c: New testcase.
>   * gcc.dg/tree-ssa/ssa-dse-6.c: Update testcase.
>   * gcc.dg/tree-ssa/pr21559.c: Update testcase.
>   * gcc.dg/debug/pr35154.c: Update testcase.
>   * gcc.target/i386/vectorize1.c: Update testcase.
>   * ipa.c (process_references): New function.
>   (set_readonly_bit): New function.
>   (set_writeonly_bit): New function.
>   (clear_addressable_bit): New function.
>   (ipa_discover_readonly_nonaddressable_var): Mark write only variables; 
> fix
>   handling of aliases.
>   * cgraph.h (struct varpool_node): Add writeonly flag.
> 

Unfortunately, this commit has caused the following ICE for me when
LTO building 471.omnetpp from SPEC 2006 or Firefox (but not libxul,
something that gets built earlier):

lto1: internal compiler error: in gimple_get_virt_method_for_vtable,
at gimple-fold.c:3276
0x7437a3 gimple_get_virt_method_for_vtable(long, tree_node*, unsigned
long, bool*)  
/home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3276  
0x743993 gimple_get_virt_method_for_binfo(long, tree_node*, bool*)
/home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3377  
0x7913f2 possible_polymorphic_call_targets(tree_node*, long,
ipa_polymorphic_call_context, bool*, void**, int*)  
/home/mjambor/gcc/bisect/src/gcc/ipa-devirt.c:1697  
0x7b73a9 possible_polymorphic_call_targets  
/home/mjambor/gcc/bisect/src/gcc/ipa-utils.h:121  
0x7b73a9 walk_polymorphic_call_targets  
/home/mjambor/gcc/bisect/src/gcc/ipa.c:177  
0x7b73a9 symtab_remove_unreachable_nodes(bool, _IO_FILE*)  
/home/mjambor/gcc/bisect/src/gcc/ipa.c:407  
0x86bf47 execute_todo  
/home/mjambor/gcc/bisect/src/gcc/passes.c:1843  
Please submit a full bug report...
 
I compile omnetpp with -Ofast -g -flto=8 -fwhole-program
-funroll-loops -fpeel-loops -march=native -mtune=native

I'm afraid I won't be able to prepare a more reduced testcase very
soon.

Thanks,

Martin

[msp430] fix addneghi and add split

2014-05-20 Thread DJ Delorie


Minor bugfixes.  Committed to head and 4.9 branch.

* config/msp430/msp430.md (split): Don't allow subregs when
splitting SImode adds.
(andneghi): Fix subtraction logic.
* config/msp430/predicates.md (msp430_nonsubreg_or_imm_operand): New.

Index: config/msp430/predicates.md
===
--- config/msp430/predicates.md (revision 210652)
+++ config/msp430/predicates.md (working copy)
@@ -70,11 +70,15 @@
 || INTVAL (op) == ~8
 || INTVAL (op) == ~(-1) "
 
 (define_predicate "msp430_nonsubreg_operand"
   (match_code "reg,mem"))
 
+(define_predicate "msp430_nonsubreg_or_imm_operand"
+  (ior (match_operand 0 "msp430_nonsubreg_operand")
+   (match_operand 0 "immediate_operand")))
+
 ; TRUE for constants which are bit positions for zero_extract
 (define_predicate "msp430_bitpos"
   (and (match_code "const_int")
(match_test ("   INTVAL (op) >= 0
 && INTVAL (op) <= 15 "
Index: config/msp430/msp430.md
===
--- config/msp430/msp430.md (revision 210652)
+++ config/msp430/msp430.md (working copy)
@@ -359,14 +359,14 @@
 
 ; Split an SImode add into two HImode adds, keeping track of the carry
 ; so that gcc knows when it can and can't optimize away the two
 ; halves.
 (define_split
   [(set (match_operand:SI  0 "msp430_nonsubreg_operand")
-   (plus:SI (match_operand:SI 1 "nonimmediate_operand")
-(match_operand:SI 2 "general_operand")))
+   (plus:SI (match_operand:SI 1 "msp430_nonsubreg_operand")
+(match_operand:SI 2 "msp430_nonsubreg_or_imm_operand")))
]
   ""
   [(parallel [(set (match_operand:HI 3 "nonimmediate_operand" "=&rm")
   (plus:HI (match_dup 4)
(match_dup 5)))
  (set (reg:BI CARRY)
@@ -1314,15 +1314,15 @@
   [(set (match_operand:HI 0 "register_operand" "=r")
(and:HI (neg:HI (match_operand:HI 1 "register_operand"  "r"))
(match_operand2 "immediate_operand" "n")))]
   ""
   "*
 if (REGNO (operands[0]) != REGNO (operands[1]))
-  return \"MOV.W\t%1, %0 { SUB.W\t#0, %0 { AND.W\t%2, %0\";
+  return \"MOV.W\t%1, %0 { INV.W\t%0 { INC.W\t%0 { AND.W\t%2, %0\";
 else
-  return \"SUB.W\t#0, %0 { AND.W\t%2, %0\";
+  return \"INV.W\t%0 { INC.W\t%0 { AND.W\t%2, %0\";
   "
   )
 
 (define_insn "mulhisi3"
   [(set (match_operand:SI  0 "register_operand" "=r")
(mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "%0"))

Re: [PATCH, PR C++/61038] - g++ -E is unusable with UDL strings

2014-05-20 Thread Jason Merrill


On 05/13/2014 08:59 PM, Ed Smith-Rowland wrote:

+  escape_it = escape_it || cpp_userdef_string_p (token->type)
+   || cpp_userdef_char_p (token->type);


Let's add the new cases to the previous statement instead of a new one. 
 OK with that change.


Jason

Re: [PATCH] dump_case_nodes: Treat unsigned as unsigned, don't ICE

2014-05-20 Thread Mike Stump

On May 20, 2014, at 10:58 AM, Segher Boessenkool  
wrote:
> The current code converts every tree to signed hwi; this ICEs with
> values not representable as shwi, like 999ULL

Looks nice to me…  Looks like the type of change I would have done for 
wide-int, if I had tripped on this.

Re: [patch] libstdc++/61143 make unordered containers usable after move

2014-05-20 Thread François Dumont


On 20/05/2014 21:36, Jonathan Wakely wrote:


OK. My sketch above avoided calling _M_moved_from() more than once per
object, but the compiler should be able to optimise your version to
avoid multiple calls anyway.


Here is the new patch limited to what I really want to commit this time.


Great. Please commit to trunk and the 4.9 branch - thanks!


As I have integrated your remarks I won't have time to commit this 
evening. So I submit this update here, just in case you want to see it 
again and I will commit tomorrow.


I finally use a simplified version of your sketch in the swap 
implementation.


François

Index: include/bits/hashtable.h
===
--- include/bits/hashtable.h	(revision 210657)
+++ include/bits/hashtable.h	(working copy)
@@ -316,14 +316,49 @@
   size_type			_M_element_count;
   _RehashPolicy		_M_rehash_policy;
 
+  // A single bucket used when only need for 1 bucket. Especially
+  // interesting in move semantic to leave hashtable with only 1 buckets
+  // which is not allocated so that we can have those operations noexcept
+  // qualified.
+  // Note that we can't leave hashtable with 0 bucket without adding
+  // numerous checks in the code to avoid 0 modulus.
+  __bucket_type		_M_single_bucket;
+
+  bool
+  _M_uses_single_bucket(__bucket_type* __bkts) const
+  { return __builtin_expect(_M_buckets == &_M_single_bucket, false); }
+
+  bool
+  _M_uses_single_bucket() const
+  { return _M_uses_single_bucket(_M_buckets); }
+
   __hashtable_alloc&
   _M_base_alloc() { return *this; }
 
-  using __hashtable_alloc::_M_deallocate_buckets;
+  __bucket_type*
+  _M_allocate_buckets(size_type __n)
+  {
+	if (__builtin_expect(__n == 1, false))
+	  {
+	_M_single_bucket = nullptr;
+	return &_M_single_bucket;
+	  }
 
+	return __hashtable_alloc::_M_allocate_buckets(__n);
+  }
+
   void
+  _M_deallocate_buckets(__bucket_type* __bkts, size_type __n)
+  {
+	if (_M_uses_single_bucket(__bkts))
+	  return;
+
+	__hashtable_alloc::_M_deallocate_buckets(__bkts, __n);
+  }
+
+  void
   _M_deallocate_buckets()
-  { this->_M_deallocate_buckets(_M_buckets, _M_bucket_count); }
+  { _M_deallocate_buckets(_M_buckets, _M_bucket_count); }
 
   // Gets bucket begin, deals with the fact that non-empty buckets contain
   // their before begin node.
@@ -703,11 +738,7 @@
 
   size_type
   erase(const key_type& __k)
-  {
-	if (__builtin_expect(_M_bucket_count == 0, false))
-	  return 0;
-	return _M_erase(__unique_keys(), __k);
-  }
+  { return _M_erase(__unique_keys(), __k); }
 
   iterator
   erase(const_iterator, const_iterator);
@@ -768,7 +799,7 @@
   _M_rehash_policy()
 {
   _M_bucket_count = _M_rehash_policy._M_next_bkt(__bucket_hint);
-  _M_buckets = this->_M_allocate_buckets(_M_bucket_count);
+  _M_buckets = _M_allocate_buckets(_M_bucket_count);
 }
 
   template_M_allocate_buckets(_M_bucket_count);
+	_M_buckets = _M_allocate_buckets(_M_bucket_count);
 	__try
 	  {
 	for (; __f != __l; ++__f)
@@ -833,9 +864,9 @@
 	  {
 		// Replacement allocator cannot free existing storage.
 		this->_M_deallocate_nodes(_M_begin());
-		if (__builtin_expect(_M_bucket_count != 0, true))
-		  _M_deallocate_buckets();
-		_M_reset();
+		_M_before_begin._M_nxt = nullptr;
+		_M_deallocate_buckets();
+		_M_buckets = nullptr;
 		std::__alloc_on_copy(__this_alloc, __that_alloc);
 		__hashtable_base::operator=(__ht);
 		_M_bucket_count = __ht._M_bucket_count;
@@ -867,7 +898,7 @@
 	if (_M_bucket_count != __ht._M_bucket_count)
 	  {
 	__former_buckets = _M_buckets;
-	_M_buckets = this->_M_allocate_buckets(__ht._M_bucket_count);
+	_M_buckets = _M_allocate_buckets(__ht._M_bucket_count);
 	_M_bucket_count = __ht._M_bucket_count;
 	  }
 	else
@@ -885,8 +916,7 @@
 		  [&__roan](const __node_type* __n)
 		  { return __roan(__n->_M_v()); });
 	if (__former_buckets)
-	  this->_M_deallocate_buckets(__former_buckets,
-	  __former_bucket_count);
+	  _M_deallocate_buckets(__former_buckets, __former_bucket_count);
 	  }
 	__catch(...)
 	  {
@@ -917,7 +947,7 @@
   {
 	__bucket_type* __buckets = nullptr;
 	if (!_M_buckets)
-	  _M_buckets = __buckets = this->_M_allocate_buckets(_M_bucket_count);
+	  _M_buckets = __buckets = _M_allocate_buckets(_M_bucket_count);
 
 	__try
 	  {
@@ -964,8 +994,9 @@
 _M_reset() noexcept
 {
   _M_rehash_policy._M_reset();
-  _M_bucket_count = 0;
-  _M_buckets = nullptr;
+  _M_bucket_count = 1;
+  _M_single_bucket = nullptr;
+  _M_buckets = &_M_single_bucket;
   _M_before_begin._M_nxt = nullptr;
   _M_element_count = 0;
 }
@@ -980,12 +1011,16 @@
 _M_move_assign(_Hashtable&& __ht, std::true_type)
 {
   this->_M_deallocate_nodes(_M_begin());
-  if (__builtin_expect(_M_bucket_c

Re: [PATCH, rs6000] Fix HTM __builtin_ttest rtl expansion

2014-05-20 Thread David Edelsohn

On Tue, May 20, 2014 at 3:28 PM, Peter Bergner  wrote:
> The following patch fixes a semi-latent bug for the HTM pattern used with
> the __builtin_ttest() builtin.  This is supposed to expand to a tabortwci.
> instruction which sets cr0 and then some code that copies the cr0 value
> into a gpr and then shifts and masks it into the lowest 2 bits in the gpr.
>
> With newish -mcpu targets, we generate a "mfocrf rX,128" to copy the cr0
> value into bits 32-35 of the gpr.  However, in some cases, we will instead
> generate a "mfcr rX" instruction, which copies all 8 CR fields into the
> gpr, with cr0 again being in bits 32-35.  The difference is that the
> mfocrf instruction seems to duplicate the 4-bit CR field in the gpr,
> so bits 36-39 are a duplicate of bits 32-35.  The error in the ttest
> pattern is that we only shift the value down 24 bits.  This "works"
> when a mfocrf copied the cr0 into the rgeister due the duplicated CR
> field, but is wrong when a mfcr instruction is used.
>
> The following passed bootstrap and regtesting on powerpc64-linux.
> Ok for trunk?  Is this also ok for the 4.9 and 4.8 branches once
> my bootstraps and regtesting is done there?
>
> Peter
>
>
> gcc/
> * config/rs6000/htm.md (ttest): Use correct shift value to get CR0.
>
> gcc/testsuite/
> * gcc.target/powerpc/htm-ttest.c: New test.

Okay.

Thanks, David

Re: [PATCH] Implement -fsanitize=float-cast-overflow (take 2)

2014-05-20 Thread Marek Polacek

Thanks for all your help.  This is updated patch which:

* uses mix/min handling that Jakub kindly wrote,
* limited a test to ilp32 || lp64 + used sse2_runtime (but it's likely
  I screwed it up as usually),
* adds a testcase for (unsigned) __int128,
* adds testcases for C and C++ bit-fields,
* makes -fsanitize=float-cast-overflow be not enabled by default,
* is tested with -m32/-m64 on x86_64/ppc64,
* is missing tests for long doubles/-mlong-double-128,
* doesn't instrument _Decimal to integer conversions yet.

Comments?

2014-05-20  Marek Polacek  
Jakub Jelinek  

* builtins.def: Change SANITIZE_FLOAT_DIVIDE to SANITIZE_NONDEFAULT.
* gcc.c (sanitize_spec_function): Likewise.
* convert.c (convert_to_integer): Include "ubsan.h".  Add floating-point
to integer instrumentation.
* doc/invoke.texi: Document -fsanitize=float-cast-overflow.
* flag-types.h (enum sanitize_code): Add SANITIZE_FLOAT_CAST and
SANITIZE_NONDEFAULT.
* opts.c (common_handle_option): Handle -fsanitize=float-cast-overflow.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW,
BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW_ABORT): Add.
* ubsan.c: Include "realmpfr.h" and "dfp.h".
(get_ubsan_type_info_for_type): Handle REAL_TYPEs.
(ubsan_instrument_float_cast): New function.
* ubsan.h (ubsan_instrument_float_cast): Declare.

* c-c++-common/ubsan/float-cast-overflow-1.c: New test.
* c-c++-common/ubsan/float-cast-overflow-2.c: New test.
* g++.dg/ubsan/float-cast-overflow-bf.C: New test.
* gcc.dg/ubsan/float-cast-overflow-bf.c: New test.

diff --git gcc/builtins.def gcc/builtins.def
index d400ecb..cd823a3 100644
--- gcc/builtins.def
+++ gcc/builtins.def
@@ -176,7 +176,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
   true, true, true, ATTRS, true, \
  (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
-   | SANITIZE_UNDEFINED | SANITIZE_FLOAT_DIVIDE)))
+   | SANITIZE_UNDEFINED | SANITIZE_NONDEFAULT)))
 
 #undef DEF_CILKPLUS_BUILTIN
 #define DEF_CILKPLUS_BUILTIN(ENUM, NAME, TYPE, ATTRS)  \
diff --git gcc/convert.c gcc/convert.c
index 91c1da2..b8f3671 100644
--- gcc/convert.c
+++ gcc/convert.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic-core.h"
 #include "target.h"
 #include "langhooks.h"
+#include "ubsan.h"
 
 /* Convert EXPR to some pointer or reference type TYPE.
EXPR must be pointer, reference, integer, enumeral, or literal zero;
@@ -394,6 +395,7 @@ convert_to_integer (tree type, tree expr)
   tree intype = TREE_TYPE (expr);
   unsigned int inprec = element_precision (intype);
   unsigned int outprec = element_precision (type);
+  location_t loc = EXPR_LOCATION (expr);
 
   /* An INTEGER_TYPE cannot be incomplete, but an ENUMERAL_TYPE can
  be.  Consider `enum E = { a, b = (enum E) 3 };'.  */
@@ -844,7 +846,17 @@ convert_to_integer (tree type, tree expr)
   return build1 (CONVERT_EXPR, type, expr);
 
 case REAL_TYPE:
-  return build1 (FIX_TRUNC_EXPR, type, expr);
+  if (flag_sanitize & SANITIZE_FLOAT_CAST)
+   {
+ expr = save_expr (expr);
+ tree check = ubsan_instrument_float_cast (loc, type, expr);
+ expr = build1 (FIX_TRUNC_EXPR, type, expr);
+ if (check == NULL)
+   return expr;
+ return fold_build2 (COMPOUND_EXPR, TREE_TYPE (expr), check, expr);
+   }
+  else
+   return build1 (FIX_TRUNC_EXPR, type, expr);
 
 case FIXED_POINT_TYPE:
   return build1 (FIXED_CONVERT_EXPR, type, expr);
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 5b1b0f1..659c265 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -5427,6 +5427,13 @@ Detect floating-point division by zero.  Unlike other 
similar options,
 @option{-fsanitize=undefined}, since floating-point division by zero can
 be a legitimate way of obtaining infinities and NaNs.
 
+@item -fsanitize=float-cast-overflow
+@opindex fsanitize=float-cast-overflow
+
+This option enables floating-point type to integer conversion checking.
+We check that the result of the conversion does not overflow.
+This option does not work well with @code{FE_INVALID} exceptions enabled.
+
 @item -fsanitize-recover
 @opindex fsanitize-recover
 By default @option{-fsanitize=undefined} sanitization (and its suboptions
diff --git gcc/flag-types.h gcc/flag-types.h
index caf4039..ed00046 100644
--- gcc/flag-types.h
+++ gcc/flag-types.h
@@ -229,9 +229,11 @@ enum sanitize_code {
   SANITIZE_BOOL = 1 << 10,
   SANITIZE_ENUM = 1 << 11,
   SANITIZE_FLOAT_DIVIDE = 1 << 12,
+  SANITIZE_FLOAT_CAST = 1 << 13,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
   | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
-

Re: add dbgcnt and opt-info support for devirtualization

2014-05-20 Thread Xinliang David Li

On Tue, May 20, 2014 at 4:32 AM, Richard Biener
 wrote:
> On Mon, May 19, 2014 at 5:24 PM, Xinliang David Li  wrote:
>> Sorry about it. Here is the patch.  There is one remaining case where
>> cgraph_dump_file and dump_enable_p are checked separately --
>> cgraph_dump_file is set up differently from 'dump_file'.
>
> But there you check with an else if, so if you do -fdump-ipa-cgraph
> then suddenly -fopt-info will stop reporting?  At least in the cgraphunit.c
> part of the patch.

Right. Fixed.

>
> I'm ok with the rest of the patch.

I checked in the patch with the addition fix.

thanks,

David

>
> Thanks,
> Richard.
>
>> David
>>
>>
>>
>> On Mon, May 19, 2014 at 2:21 AM, Richard Biener
>>  wrote:
>>> On Fri, May 16, 2014 at 11:19 PM, Xinliang David Li  
>>> wrote:
 Modified the patch according to yours and Richard's feedback. PTAL.
>>>
>>> ENOPATCH.
>>>
>>> Btw, I don't see any issue with leaking node order to opt-report.
>>>
>>> Richard.
>>>
 thanks,

 David

 On Fri, May 16, 2014 at 9:03 AM, Jan Hubicka  wrote:
>> Hi, debugging runtime bugs due to devirtualization can be hard for
>> very large C++ programs with complicated class hierarchy. This patch
>> adds the support to report this high level transformation via
>> -fopt-info (not hidden inside dump file) and the ability the do binary
>> search with cutoff.
>>
>> Ok for trunk after build and test?
>
> Seems resonable to me.
>>
>> thanks,
>>
>> David
>
>> Index: ChangeLog
>> ===
>> --- ChangeLog (revision 210479)
>> +++ ChangeLog (working copy)
>> @@ -1,3 +1,18 @@
>> +2014-05-15  Xinliang David Li  
>> +
>> + * cgraphunit.c (walk_polymorphic_call_targets): Add
>> + dbgcnt and fopt-info support.
>> + 2014-05-15  Xinliang David Li  
>> +
>> + * cgraphunit.c (walk_polymorphic_call_targets): Add
>> + dbgcnt and fopt-info support.
>> + * ipa-prop.c (ipa_make_edge_direct_to_target): Ditto.
>> + * ipa-devirt.c (ipa_devirt): Ditto.
>> + * ipa.c (walk_polymorphic_call_targets): Ditto.
>> + * gimple-fold.c (fold_gimple_assign): Ditto.
>> + (gimple_fold_call): Ditto.
>> + * dbgcnt.def: New counter.
>> +
>>  2014-05-15  Martin Jambor  
>>
>>   PR ipa/61085
>> Index: ipa-prop.c
>> ===
>> --- ipa-prop.c(revision 210479)
>> +++ ipa-prop.c(working copy)
>> @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.
>>  #include "ipa-utils.h"
>>  #include "stringpool.h"
>>  #include "tree-ssanames.h"
>> +#include "dbgcnt.h"
>>
>>  /* Intermediate information about a parameter that is only useful 
>> during the
>> run of ipa_analyze_node and is not kept afterwards.  */
>> @@ -2494,6 +2495,13 @@ ipa_make_edge_direct_to_target (struct c
>>   fprintf (dump_file, "ipa-prop: Discovered direct call to 
>> non-function"
>>   " in %s/%i, making it unreachable.\n",
>>ie->caller->name (), ie->caller->order);
>> +  else if (dump_enabled_p ())
>> + {
>> +   location_t loc = gimple_location (ie->call_stmt);
>> +   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
>> +"Discovered direct call to non-function in 
>> %s, "
>> +"making it unreachable\n", ie->caller->name 
>> ());
>
> Perhaps "turning it to __builtin_unreachable call" and similarly in the 
> other cases
> we introduce __builtin_unreachable? I think that could be easier for user 
> to work
> out.
>
> What king of problems in devirtualizatoin you are seeing?
>
>
> Honza

Add flag to optionally ignore ELF interposition

2014-05-20 Thread Jan Hubicka

Hi,
as disucssed some time ago, our assumption that every symbol of shared library 
can
be interposed at runtime is expensive and prevents a lot of useful 
optimizations,
including inlining or IPA propagation.

While this is useful feature, it is rather incommon to use it for bigger C++
projects, like firefox and at least clang seems to ignore the ELF interposition
rules and always inline/propagate. This patch adds flag to control the 
behaviour.
Symbols explicitly delcared WEAK are still considered as interposable.

Bootstrapped/regtested x86_64-linux, will commit it tomorrow if there are
no complains. (Feedback is welcome!)

Honza

* doc/invoke.texi (-fsemantic-interposition): New flag.
* common.opt (fsemantic-interposition): Use it.
* varasm.c (decl_replaceable_p): Use it.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 210653)
+++ doc/invoke.texi (working copy)
@@ -411,6 +411,7 @@ Objective-C and Objective-C++ Dialects}.
 -fschedule-insns -fschedule-insns2 -fsection-anchors @gol
 -fselective-scheduling -fselective-scheduling2 @gol
 -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops @gol
+-fsemantic-interposition @gol
 -fshrink-wrap -fsignaling-nans -fsingle-precision-constant @gol
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
@@ -7709,6 +7710,22 @@ This option has no effect unless one of
 When pipelining loops during selective scheduling, also pipeline outer loops.
 This option has no effect unless @option{-fsel-sched-pipelining} is turned on.
 
+@item -fsemantic-interposition
+@opindex fsemantic-interposition
+Some object formats, like ELF, allow interposing of symbols by dynamic linker.
+This means that for symbols exported from the DSO compiler can not perform
+inter-procedural propagation, inlining and other optimizations in anticipation
+that the function or variable in question may change. While this feature is
+useful, for example, to rewrite memory allocation functions by a debugging
+implementation, it is expensive in the terms of code quality.
+With @option{-fno-semantic-inteposition} compiler assumest that if 
interposition
+happens for functions the overwritting function will have
+precisely same semantics (and side effects). Similarly if interposition happens
+for variables, the constructor of the variable will be the same. The flag
+has no effect for functions explicitly declared inline, where
+interposition changing semantic is never allowed and for symbols explicitly
+declared weak.
+
 @item -fshrink-wrap
 @opindex fshrink-wrap
 Emit function prologues only before parts of the function that need it,
Index: common.opt
===
--- common.opt  (revision 210653)
+++ common.opt  (working copy)
@@ -1854,6 +1854,10 @@ fsel-sched-reschedule-pipelined
 Common Report Var(flag_sel_sched_reschedule_pipelined) Init(0) Optimization
 Reschedule pipelined regions without pipelining
 
+fsemantic-interposition
+Common Report Var(flag_semantic_interposition) Init(1)
+Allow interposing function (or variables) by ones with different semantics (or 
initializer) respectively by dynamic linker
+
 ; sched_stalled_insns means that insns can be moved prematurely from the queue
 ; of stalled insns into the ready list.
 fsched-stalled-insns
Index: varasm.c
===
--- varasm.c(revision 210654)
+++ varasm.c(working copy)
@@ -6870,6 +6870,9 @@ decl_replaceable_p (tree decl)
   gcc_assert (DECL_P (decl));
   if (!TREE_PUBLIC (decl) || DECL_COMDAT (decl))
 return false;
+  if (!flag_semantic_interposition
+  && !DECL_WEAK (decl))
+return false;
   return !decl_binds_to_current_def_p (decl);
 }

Re: Ping2: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration.

2014-05-20 Thread Jason Merrill


On 05/20/2014 01:51 PM, Mark Wielaard wrote:

The DWARF part isn't what this patch is blocked on. That has already
been discussed on the DWARF standard list, coordinated with the gdb
hackers and approved some months ago.


Fair enough.


The part that hasn't been reviewed
and approved yet is the frontend changes that implement the lang-hook.


Rather than define the hook for C, let's have a default version that 
uses the type_for_size langhook; that should work better for Ada.


Jason

Re: [patch ping] libstdc++ testsuite cxxflags

2014-05-20 Thread Cesar Philippidis

On 05/20/2014 02:11 AM, Jonathan Wakely wrote:
> On 19/05/14 14:57 -0600, Sandra Loosemore wrote:
>> On 05/17/2014 04:07 AM, Jonathan Wakely wrote:
>>> On 17 May 2014 10:50, Jonathan Wakely wrote:
 On 17 May 2014 01:16, Sandra Loosemore wrote:
> It appears that this patch from last fall never got reviewed.
>
> https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02340.html
>
> Can someone take a look?  I'll commit the patch on Cesar's behalf
> if it's
> approved.

 Libstdc++ patches need to go to the libstdc++ list, which this did, in
 a separate mail that broke the threading:
 https://gcc.gnu.org/ml/libstdc++/2013-10/msg00224.html
 Then archives's inability to thread betweem months broke it again:
 https://gcc.gnu.org/ml/libstdc++/2013-11/msg00113.html

 I approved it then withdrew that approval:
 https://gcc.gnu.org/ml/libstdc++/2013-11/msg00120.html
 then the patch got revised:
 https://gcc.gnu.org/ml/libstdc++/2013-11/msg00122.html

 I'll have to refresh my memory about it.
>>
>> Whoops, I totally missed that there was already a separate thread on
>> the libstdc++ mailing list only.  My bad.  :-(
>>
>>> I think I'm happiest with the second version of the patch, in
>>> https://gcc.gnu.org/ml/libstdc++/2013-11/msg00114.html
>>>
>>> It does mean a change that might affect people using CXXFLAGS when
>>> running the tests, so we might want to update
>>> https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html where it says
>>> "Or, just run the testsuites with CXXFLAGS set to -D_GLIBCXX_DEBUG or
>>> -D_GLIBCXX_PARALLEL."
>>
>> I came up with the attached patch for the wording change.  I'm having
>> trouble regenerating the HTML version of the manual, though; it looks
>> like I have a different version of the DocBook stylesheets around that
>> are introducing lots of extraneous changes, and I'm not sure what the
>> "right" version is.  :-S  Any suggestions?
> 
> You always get hundreds of changes, DocBook generates unique numeric
> id attributes, which are different every run. Don't worr yabout the
> docs, I can sort them out. If you and Cesar are happy with the patch
> in https://gcc.gnu.org/ml/libstdc++/2013-11/msg00114.html then please
> go ahead and commit that version, thanks.

Looking back at my notes, this patch addresses the libstdc++ atomics
test failures when using a custom site.exp. Without the -O2 flag, those
tests would fail to link because of the dependency on libatomic.

I'm happy with the second patch. Sandra please commit it.

Thanks,
Cesar

Re: [patch] libstdc++/61143 make unordered containers usable after move

2014-05-20 Thread Jonathan Wakely


On 19/05/14 22:27 +0200, François Dumont wrote:

On 15/05/2014 22:52, Jonathan Wakely wrote:

Does this get initialized in the constructors?
Would it make sense to give it an initializer?

__bucket_type_M_single_bucket = nullptr;


   This bucket is replacing those normally allocated and when they 
are allocated they are 0 initialised. So, you were right, there were 
one place where this initialization was missing which is fixed in this 
new patch. So I don't think this additional initialization is 
necessary.


OK


@@ -980,12 +999,16 @@
   _M_move_assign(_Hashtable&& __ht, std::true_type)
   {
 this->_M_deallocate_nodes(_M_begin());
-  if (__builtin_expect(_M_bucket_count != 0, true))
-_M_deallocate_buckets();
-
+  _M_deallocate_buckets();
 __hashtable_base::operator=(std::move(__ht));
 _M_rehash_policy = __ht._M_rehash_policy;
-  _M_buckets = __ht._M_buckets;
+  if (__builtin_expect(__ht._M_buckets != 
&__ht._M_single_bucket, true))

+_M_buckets = __ht._M_buckets;


What is the value of this->_M_single_bucket now?

Should it be set to nullptr, if only to help debugging?


We are not resetting buckets to null when rehashing so unless I add 
more checks I won't be able to reset it each time.


OK



+  if (__builtin_expect(__ht._M_buckets == 
&__ht._M_single_bucket, false))


This check appears in a few places, I wonder if it is worth creating a
private member function to hide the details:

bool _M_moved_from() const noexcept
{
  return __builtin_expect(_M_buckets == &_M_single_bucket, false);
}

Then just test if (__ht._M_moved_from())

Usually I would think the __builtin_expect should not be inside the
member function, so the caller decides what the expected result is,
but I think in all cases the result is expected to be false. That
matches how move semantics are designed: the object that gets moved
from is expected to be going out of scope, and so will be reused in a
minority of cases.


@@ -1139,7 +1170,14 @@
   {
 if (__ht._M_node_allocator() == this->_M_node_allocator())
   {
-  _M_buckets = __ht._M_buckets;
+  if (__builtin_expect(__ht._M_buckets == 
&__ht._M_single_bucket, false))


This could be if (__ht._M_moved_from())


I hesitated in doing so and finally do so. I only prefer 
_M_use_single_bucket as we might not limit its usage to moved 
instances.


Good point.

I think it should be called _M_uses_single_bucket() or
_M_using_single_bucket() though, otherwise it sounds more like it
answers the question "should I use a single bucket?" rather than "am I
using the single bucket?"


How about removing the std::swap(_M_buckets, __x._M_buckets) above and
doing (untested):

if (this->_M_moved_from())
  {
if (__x._M_moved_from())
  _M_buckets = &_M_single_bucket;
else
  _M_buckets = __x._M_buckets;
__x._M_buckets = &__x._M_single_bucket;
  }
else if (__x._M_moved_from())
  {
__x._M_buckets = _M_buckets;
_M_buckets = &_M_single_bucket;
  }
else
  std::swap(_M_buckets, __x._M_buckets);

Is that worth it?  I'm not sure.


Yes, with the newly introduced _M_use_single_bucket it worth it I 
think and is what I have done.


OK. My sketch above avoided calling _M_moved_from() more than once per
object, but the compiler should be able to optimise your version to
avoid multiple calls anyway.


Here is the new patch limited to what I really want to commit this time.


Great. Please commit to trunk and the 4.9 branch - thanks!

[PATCH, rs6000] Fix HTM __builtin_ttest rtl expansion

2014-05-20 Thread Peter Bergner

The following patch fixes a semi-latent bug for the HTM pattern used with
the __builtin_ttest() builtin.  This is supposed to expand to a tabortwci.
instruction which sets cr0 and then some code that copies the cr0 value
into a gpr and then shifts and masks it into the lowest 2 bits in the gpr.

With newish -mcpu targets, we generate a "mfocrf rX,128" to copy the cr0
value into bits 32-35 of the gpr.  However, in some cases, we will instead
generate a "mfcr rX" instruction, which copies all 8 CR fields into the
gpr, with cr0 again being in bits 32-35.  The difference is that the
mfocrf instruction seems to duplicate the 4-bit CR field in the gpr,
so bits 36-39 are a duplicate of bits 32-35.  The error in the ttest
pattern is that we only shift the value down 24 bits.  This "works"
when a mfocrf copied the cr0 into the rgeister due the duplicated CR
field, but is wrong when a mfcr instruction is used.

The following passed bootstrap and regtesting on powerpc64-linux.
Ok for trunk?  Is this also ok for the 4.9 and 4.8 branches once
my bootstraps and regtesting is done there?

Peter


gcc/
* config/rs6000/htm.md (ttest): Use correct shift value to get CR0.

gcc/testsuite/
* gcc.target/powerpc/htm-ttest.c: New test.

Index: gcc/config/rs6000/htm.md
===
--- gcc/config/rs6000/htm.md(revision 210518)
+++ gcc/config/rs6000/htm.md(working copy)
@@ -179,7 +179,7 @@ (define_expand "ttest"
 (const_int 0)]
UNSPECV_HTM_TABORTWCI))
(set (subreg:CC (match_dup 2) 0) (match_dup 1))
-   (set (match_dup 3) (lshiftrt:SI (match_dup 2) (const_int 24)))
+   (set (match_dup 3) (lshiftrt:SI (match_dup 2) (const_int 28)))
(parallel [(set (match_operand:SI 0 "int_reg_operand" "")
   (and:SI (match_dup 3) (const_int 15)))
   (clobber (scratch:CC))])]
Index: gcc/testsuite/gcc.target/powerpc/htm-ttest.c
===
--- gcc/testsuite/gcc.target/powerpc/htm-ttest.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/htm-ttest.c(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mhtm" } */
+
+/* { dg-final { scan-assembler "rlwinm r?\[0-9\]+,r?\[0-9\]+,3,30,31" { target 
{ ilp32 } } } } */
+/* { dg-final { scan-assembler "rldicl r?\[0-9\]+,r?\[0-9\]+,35,62" { target { 
lp64 } } } } */
+
+#include 
+long
+ttest (void)
+{
+  return _HTM_STATE(__builtin_ttest());
+}

Re: [PATCH] Fix PR61221

2014-05-20 Thread Jeff Law


On 05/20/14 12:13, Richard Biener wrote:


I think it's still important for things like keeping SSA names
densely allocated and for reuse on SSA rewrite of virtual operands
for example. So I'm thinking of an explicit stop-/restart-reuse
interface.
Fair enough.  WRT dense allocation, we could probably do a lot better if 
we allocated some N nodes at a time rather than one at a time on demand 
when the freelist is empty.  When we allocate those N nodes, obviously 
N-1 go onto the freelist.


It'd be interesting to see the cache effects of doing something like 
that ;-)


jeff

Re: [PATCH] Fix PR61221

2014-05-20 Thread Richard Biener

On May 20, 2014 6:47:44 PM CEST, Jeff Law  wrote:
>On 05/20/14 02:06, Richard Biener wrote:
>>
>> We're still going to have problems if SSA names are re-used
>> (as even released SSA names have to remain valid lattice
>> entries after this).  But currently nothing will create new
>> SSA names (fingers crossing) during eliminate.  A followup
>> patch of mine will though, thus I'll do that two-staged
>> SSA name release/reuse list you thought would maybe fix this
>> issue.
>Sounds good (the two-staged release/reuse).
>
>The other thing I had pondered was two modes.  One with immediate
>re-use 
>and one with the two-staged release/reuse.
>
>But before doing anything my plan was to revisit the need for the 
>release/reuse code.  That code was written when we had to go completely
>
>out of SSA, then back into SSA in the first incarnation of the SSA jump
>
>threading support.
>
>With the incremental updating of the SSA graph, we may find that we 
>really just don't need this code anymore.  I wouldn't lose any sleep if
>
>after a bit of poking you came to that conclusion and just ripped out 
>the release/reuse code completely.

I think it's still important for things like keeping SSA names densely 
allocated and for reuse on SSA rewrite of virtual operands for example. So I'm 
thinking of an explicit stop-/restart-reuse interface.

Richard.

>jeff

Re: RFA: cache recog_op_alt by insn code

2014-05-20 Thread Jeff Law


On 05/20/14 02:19, Richard Sandiford wrote:

Following on from (and depending on) the last patch, process_constraints
also shows up high in the profile.  This patch caches the recog_op_alt
information by insn code too.  It also shrinks the size of the structure
from 1 pointer + 5 ints to 1 pointer + 2 ints:

- no target should have more than 65536 register classes :-)

That could probably be much lower if we needed more bits for other things.


- "reject" is based on a cost of 600 for '!', so 16 bits should be plenty

OK.  Makes sense.



- "matched" and "matches" are operand numbers and so fit in 8 bits
OK.  This could also be smaller, don't we have an upper limit of 32 (or 
30?) on the number of operands appearing in an insn.  It'd be a way to 
get a few more bits if we need them someday.



Since this code is creating cached data and should only be run once
per insn code or asm statement, I don't think the extra in-loop
lra_static_insn_data assignments really justify having two copies
of such a complicated function.  I think it would be better for LRA
to use the generic routine and then fill in the lra_static_insn_data
fields on the result (basically just an OR of "is_address" and
"early_clobber" for each alternative, plus setting up "commutative").
We could then avoid having two caches of the same data.  I'll do
that as a follow-up if it sounds OK.
Seems reasonable.  I suspect Vlad found the same code to be hot, hence 
the caching.  Once he had a copy adding to it wasn't a big deal.  But 
yes, I think that after the cache is globalized that having IRA walk 
over things another time shouldn't be a problem.





On the subject of commutativity, we have:

 static bool
 commutative_constraint_p (const char *str)
 {
   int curr_alt, c;
   bool ignore_p;

   for (ignore_p = false, curr_alt = 0;;)
 {
   c = *str;
   if (c == '\0')
 break;
   str += CONSTRAINT_LEN (c, str);
   if (c == '#' || !recog_data.alternative_enabled_p[curr_alt])
 ignore_p = true;
   else if (c == ',')
 {
   curr_alt++;
   ignore_p = false;
 }
   else if (! ignore_p)
 {
   /* Usually `%' is the first constraint character but the
  documentation does not require this.  */
   if (c == '%')
 return true;
 }
 }
   return false;
 }

Any objections to requiring `%' to be the first constraint character?
Seems wasteful to be searching the constraint string just to support
an odd case.
If we're going to change it, then clearly the docs need to change and 
ideally we'd statically check the port's constraints during the build 
process to ensure they meet the tighter definition.


I'm sure David will be oh-so-happy to see state appearing.  But that's 
something he'll have to deal with in the JIT side.




The patch gives a further ~3.5% improvement in compile time for
-O0 fold-const.ii, on top of the other patch.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* recog.h (operand_alternative): Convert reg_class, reject,
matched and matches into bitfields.
(preprocess_constraints): Take the insn as parameter.
(recog_op_alt): Change into an array of pointers.
(target_recog): Add x_op_alt.
* recog.c (asm_op_alt_1, asm_op_alt): New variables
(recog_op_alt): Change into an array of pointers.
(preprocess_constraints): Update accordingly.  Take the insn as
parameter.  Use asm_op_alt_1 and asm_op_alt for asms.  Cache other
instructions in this_target_recog.
* ira-lives.c (process_bb_node_lives): Pass the insn to
process_constraints.
* reg-stack.c (check_asm_stack_operands): Likewise.
(subst_asm_stack_regs): Likewise.
* regcprop.c (copyprop_hardreg_forward_1): Likewise.
* regrename.c (build_def_use): Likewise.
* sched-deps.c (sched_analyze_insn): Likewise.
* sel-sched.c (get_reg_class, implicit_clobber_conflict_p): Likewise.
* config/arm/arm.c (xscale_sched_adjust_cost): Likewise.
(note_invalid_constants): Likewise.
* config/i386/i386.c (ix86_legitimate_combined_insn): Likewise.


OK.

Thanks!

Jeff

[PATCH] dump_case_nodes: Treat unsigned as unsigned, don't ICE

2014-05-20 Thread Segher Boessenkool

The current code converts every tree to signed hwi; this ICEs with
values not representable as shwi, like 999ULL in
pr34154.c (and if it didn't ICE, it would print the wrong value).

This fixes it.  Bootstrapped and tested on powerpc64-linux.  Okay
to apply?


Segher


2014-05-20  Segher Boessenkool  

gcc/
* stmt.c (dump_case_nodes): Don't convert values to HOST_WIDE_INT
before printing.

---
 gcc/stmt.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/gcc/stmt.c b/gcc/stmt.c
index 163d495..722d34f 100644
--- a/gcc/stmt.c
+++ b/gcc/stmt.c
@@ -774,24 +774,20 @@ static void
 dump_case_nodes (FILE *f, struct case_node *root,
 int indent_step, int indent_level)
 {
-  HOST_WIDE_INT low, high;
-
   if (root == 0)
 return;
   indent_level++;
 
   dump_case_nodes (f, root->left, indent_step, indent_level);
 
-  low = tree_to_shwi (root->low);
-  high = tree_to_shwi (root->high);
-
   fputs (";; ", f);
-  if (high == low)
-fprintf (f, "%*s" HOST_WIDE_INT_PRINT_DEC,
-indent_step * indent_level, "", low);
-  else
-fprintf (f, "%*s" HOST_WIDE_INT_PRINT_DEC " ... " HOST_WIDE_INT_PRINT_DEC,
-indent_step * indent_level, "", low, high);
+  fprintf (f, "%*s", indent_step * indent_level, "");
+  print_dec (root->low, f, TYPE_SIGN (TREE_TYPE (root->low)));
+  if (!tree_int_cst_equal (root->low, root->high))
+{
+  fprintf (f, " ... ");
+  print_dec (root->high, f, TYPE_SIGN (TREE_TYPE (root->high)));
+}
   fputs ("\n", f);
 
   dump_case_nodes (f, root->right, indent_step, indent_level);
-- 
1.8.1.4

Re: [Google/4_8] LIPO COMDAT profile fixups

2014-05-20 Thread Teresa Johnson

On Tue, May 20, 2014 at 8:39 AM, Xinliang David Li  wrote:
> On Tue, May 20, 2014 at 6:32 AM, Teresa Johnson  wrote:
>> On Mon, May 19, 2014 at 11:51 PM, Xinliang David Li  
>> wrote:
>>> Why duplicating the merger functions in dyn-ipa.c? Should those in
>>> libgcov-merge.c be reused?
>>
>> The merger functions in libgcov-merge.c use a macro to either read the
>> counter to merge from a buffer in memory (when IN_GCOV_TOOL) or from
>> the gcda file otherwise. In this case I need to merge from another
>> array, and it needs to do so both IN_GCOV_TOOL and otherwise. So to
>> reuse the merger functions I would have had to get the counter value
>> from a new argument, guarded by a conditional. I didn't want to change
>> the interface of those existing merge functions, or more importantly,
>> make them less efficient with the added condition since they are
>> invoked frequently during gcda file merging.
>>
>
> ok -- something to think about in the future.
>
> For now, please add a comment in libgcov-utils.c before
> ctr_merge_functions to point to the dup merge functions in dyn-ipa.c.

Done.

>
>
>>>
>>> The refactoring of gcov_exit_write_gcda should probably be done in a
>>> separate patch -- preferably submitted to trunk too.
>>
>> The refactoring is necessary for this patch since I need to invoke the
>> outlined functionality in gcov_dump_module_info as well.
>>
>> I could submit that to trunk, but it has to go in to the google
>> branches either with or preceeding this patch.
>>
>
> ok for google branch after fixing the formatting issues

Fixed the issues below and others uncovered by contrib/check_GNU_style.sh.

Patch committed to google/4_8 as r210652.

Will port to google/4_9.

Teresa

>
> 1) calller --> caller
> 2) many places where there is missing space: e.g,
> gcov_fixup_counters_checksum( ..
>
> David
>
>> Thanks,
>> Teresa
>>
>>>
>>> David
>>>
>>> On Mon, May 19, 2014 at 10:08 PM, Teresa Johnson  
>>> wrote:
 Ping.
 Teresa

 On Wed, May 14, 2014 at 4:39 PM, Teresa Johnson  
 wrote:
> This patch applies profile fixups to COMDATs on the dyn ipa callgraph
> at the end of LIPO module grouping (either in the profile gen run or
> in gcov-tool). This is to address issues with missing profiles in the
> out-of-line COMDAT copies not selected by the linker, and indirect
> call profiles that target a module not included in the module group.
> By default, both fixups are enabled, but can be controlled by a
> profile-gen parameter, an environment variable, and a gcov-tool
> option.
>
> The fixups assume that functions with the same lineno and cfg checksum
> are copies of the same COMDAT. This is made more likely by ensuring
> that in LIPO mode we include the full mangled name in the
> lineno_checksum.
>
> For the counter fixup, we merge all non-zero profiles with matching
> checksums and copy the merged profile into copies with all-zero
> profiles.
>
> For the indirect counter fixup, if an indirect call profile target is
> not in the module group, we look for a matching checksum copy in the
> primary module and if exactly one is found we change the target to
> that.
>
> If any fixups are applied, the gcda files are rewritten after module 
> grouping.
>
> This also required a couple of other changes to the optimizer. During
> cgraph node resolution, we were arbitrarily selecting a copy that had
> non-zero profiles. Now there are many more to choose from, so we will
> prefer the primary module copy if it is non-zero. Also, during cloning
> for inlining, we only want to update the profile on the callee node if
> we are inlining into the resolved node caller node. We were already
> doing this for AutoFDO, and need to do this here now that many node
> copies have the same profile.
>
> Patch attached. Tested with regression tests and internal benchmarks.
> Ok for google branches?
>
> Thanks,
> Teresa
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: Ping2: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration.

2014-05-20 Thread Mark Wielaard

On Tue, May 20, 2014 at 10:43:22AM -0400, Jason Merrill wrote:
> On 05/20/2014 02:55 AM, Mark Wielaard wrote:
> >On Mon, May 19, 2014 at 04:50:35PM -0400, Jason Merrill wrote:
> >>On 05/13/2014 03:21 AM, Mark Wielaard wrote:
> >>>So the debugger doesn't have to guess the properties of the enum's
> >>>underlying base type, like size, encoding and signedness.
> >>
> >>Well, the enum already has DW_AT_byte_size.  It seems to me that it should
> >>also have DW_AT_encoding to provide the other two properties you mention.
> >
> >Right, that is the idea, since an enum doesn't provide those attributes,
> >it should have a reference to the underlying base type that provides them.
> 
> Yes, but why is that better than providing them directly?

Because that would loose the information of which underlying type
is used to implement the enum. And it is probably less efficient to
replicate all the attributes for each enum. It is also simply how the
DWARF standard specifies these properties. They are not attached to the
DW_TAG_enumeration_type directly but specified through the underlying
type. Which is what gdb expects now.

The DWARF part isn't what this patch is blocked on. That has already
been discussed on the DWARF standard list, coordinated with the gdb
hackers and approved some months ago. The part that hasn't been reviewed
and approved yet is the frontend changes that implement the lang-hook.

Thanks,

Mark

Re: RFA: cache enabled attribute by insn code

2014-05-20 Thread Jeff Law


On 05/20/14 02:16, Richard Sandiford wrote:

get_attr_enabled was showing up high in a -O0 compile of fold-const.ii.
At the moment, extract_insn calls this function for every alternative
on each extraction, which can be expensive for instructions like
moves that have many alternatives.

The attribute is only supposed to depend on the insn code and the
current target.  It isn't allowed to depend on the values of operands.
LRA already makes use of this to cache the enabled attributes based on code.

This patch makes the caching compiler-wide.  It also converts the bool
array into a plain int bitfield, since at the moment we don't allow more
than 30 alternatives.  (It's possible we might want to increase it
beyond 32 at some point, but then we can change the type to uint64_t
instead.  Wanting to go beyond 64 would suggest deeper problems
somewhere. :-))

The patch gives a consistent compile-time improvement of about ~3.5%
on the -O0 fold-const.ii testcase.

There were a few instances of the construct:

   cl = NO_REGS;
   for (ignore_p = false, curr_alt = 0;
(c = *constraints);
constraints += CONSTRAINT_LEN (c, constraints))
 if (c == '#' || !recog_data.alternative_enabled_p[curr_alt])
   ignore_p = true;
 else if (c == ',')
   {
curr_alt++;
ignore_p = false;
   }
 else if (! ignore_p)

This had the effect of ignoring all alternatives after the first
attribute-disabled one, since once we found a disabled alternative we'd
always enter the first arm of the "if" and never increment curr_alt.
I don't think that was intentional.

Tested on x86_64-linux-gnu.  OK to install?

I notice ira_setup_alts is using a HARD_REG_SET to store a mask
of alternatives.  If this patch is OK, I'll follow up with a patch
to use alternative_mask there too.

Thanks,
Richard


gcc/
* system.h (TEST_BIT): New macro.
* recog.h (alternative_mask): New type.
(ALL_ALTERNATIVES, ALTERNATIVE_BIT): New macros.
(recog_data_d): Replace alternative_enabled_p array with
enabled_alternatives.
(target_recog): New structure.
(default_target_recog, this_target_recog): Declare.
(get_enabled_alternatives): Likewise.
* recog.c (default_target_recog, this_target_recog): New variables.
(get_enabled_alternatives): New function.
(extract_insn): Use it.
(preprocess_constraints, constrain_operands): Adjust for change to
recog_data.
* postreload.c (reload_cse_simplify_operands): Likewise.
* reload.c (find_reloads): Likewise.
* ira-costs.c (record_reg_classes): Likewise.
* ira-lives.c (single_reg_class): Likewise.  Fix bug in which
all alternatives after a disabled one would be skipped.
(ira_implicitly_set_insn_hard_regs): Likewise.
* ira.c (commutative_constraint_p): Likewise.
(ira_setup_alts): Adjust for change to recog_data.
* lra-int.h (lra_insn_recog_data): Replace alternative_enabled_p
with enabled_alternatives.
* lra.c (free_insn_recog_data): Update accordingly.
(lra_update_insn_recog_data): Likewise.
(lra_set_insn_recog_data): Likewise.  Use get_enabled_alternatives.
* lra-constraints.c (process_alt_operands): Likewise.  Handle
only_alternative as part of the enabled mask.
* target-globals.h (this_target_recog): Declare.
(target_globals): Add a recog field.
(restore_target_globals): Restore this_target_recog.
* target-globals.c: Include recog.h.
(default_target_globals): Initialize recog field.
(save_target_globals): Likewise.
This is OK for the trunk (referring to the follow-up message which fixed 
EWRONGPATCH.



jeff

Re: [C++ Patch/RFC] PR 58753 & 58930

2014-05-20 Thread Jason Merrill


OK.

Jason

Re: [PATCH, AArch64] Fix for PR61202

2014-05-20 Thread Carrot Wei

Hi James

Thank you for pointing this out. In the new patch I removed the
modification of vqdmulh_n_s32 and vqdmulhq_n_s32.

Passed dejagnu testing on aarch64 qemu again. OK for trunk, 4.9 and 4.8?

2014-05-20  Guozhi Wei  

* config/aarch64/arm_neon.h (vqdmulh_n_s16): Change
the last operand's constraint.
(vqdmulhq_n_s16): Likewise.

On Mon, May 19, 2014 at 11:50 PM, James Greenhalgh
 wrote:
> On Tue, May 20, 2014 at 07:18:40AM +0100, Carrot Wei wrote:
>> Hi
>
> Hi,
>
>> The last operand of instruction sqdmulh can only be low fp registers,
>> so we should use constraint "x". But the intrinsic functions use "w".
>> This patch fixed the constrains in these intrinsics.
>
> This restriction is only on the _s16 variants of the intrinsics. From the
> ARMv8 Architecture Reference Manual:
>
>Is the name of the second SIMD&FP source register,
>   [...]
>   Restricted to V0-V15 when element size  is H.
>
> The patch is correct (though I can't approve it) for vqdmulh_n_s16 and
> vqdmulhq_n_s16, but the two hunks for vqdmulh_n_s32 and vqdmulhq_n_s32 should
> be dropped as they are too restrictive.
>
> Thanks,
> James
>
>> 2014-05-19  Guozhi Wei  
>>
>> * config/aarch64/arm_neon.h (vqdmulh_n_s16): Change
>> the last operand's constraint.
>> (vqdmulh_n_s32): Likewise.
>> (vqdmulhq_n_s16): Likewise.
>> (vqdmulhq_n_s32): Likewise.
>


patch
Description: Binary data

Re: [PATCH v2] Replace C/C++ void_zero_node with a VOID_CST tree code

2014-05-20 Thread Jason Merrill


OK.  I've committed my patch.

Jason

Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard

2014-05-20 Thread Jeff Law


On 05/19/14 22:27, Maxim Kuvyrkov wrote:

Hi,

This patch cleans up haifa-sched.c:choose_ready() function while
allow more powerful customization by backends at the same time.

The primary change is that
targetm.sched.first_cycle_multipass_dfa_lookahead_guard hook is
converted from returning a boolean OK/Nada value to returning an
action to be applied to a given instruction in the ready list.

If return value is N == 0, then insn is OK (previous corresponding
return value was 1). If return value is N  < 0, then insn should not
be considered for scheduling for -N cycles (I'm going to use this for
ARM scheduling in an upcoming patch). If return value is N  > 0, then
insn can still be issued on the current cycle, just not immediately
now (previous corresponding return value was 0).

Changes to ia64 and rs6000 are mostly mechanical to update hook
return values, but port maintainers may wish to review those.

Tested on x86_64-linux-gnu, arm-linux-gnueabihf, and by building
ia64-linux-gnu cc1.  Changes to PowerPC were seriously eyeballed.  If
maintainers so to wish, I will carry out any additional testing.

This looks OK to me.

I think it's safe to work from the assumption that the hook will never 
return INT_MIN :-)


jeff

Re: [PATCH] Regression fix for PR target/61223

2014-05-20 Thread Yury Gribov


> For now, please revert your original patch

Alex asked me to revert patch for him. Done in r210650.

-Y

Re: [GCC RFC]A new and simple pass merging paired load store instructions

2014-05-20 Thread Jeff Law


On 05/20/14 11:14, Wei Mi wrote:

On Tue, May 20, 2014 at 12:13 AM, Bin.Cheng  wrote:

On Tue, May 20, 2014 at 1:30 AM, Jeff Law  wrote:

On 05/19/14 00:38, Bin.Cheng wrote:


On Sat, May 17, 2014 at 12:32 AM, Jeff Law  wrote:


On 05/16/14 04:07, Bin.Cheng wrote:



But can't you go through movXX to generate either the simple insn on the
ARM
or the PARALLEL on the thumb?


Yes, I think it's more than upsizing the mode.  There is another
example from one of x86's candidate peephole patch at
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html

The patch wants to do below transformation, which I think is very
target dependent.


Presumably there's no way to go through an expander here?

I don't know very much with respect to this case, maybe the patch
author can help here.



I just checked the expand result for my case. TER could make expand
see the two intrinsics at the same time so in theory it is possible to
make the merge happen in expand. I havn't looked into detail.

I'm not referring to in the gimple->rtl expansion.

I'm referring to using a define_expand to generate the load/store 
multiple instructions.  Given define_expand in the backend, the target 
independent code can use HAVE_XXX and gen_XXX to test for the existence 
of the expander and to call the expander to generate insns.


Jeff

Re: [GCC RFC]A new and simple pass merging paired load store instructions

2014-05-20 Thread Wei Mi

On Tue, May 20, 2014 at 12:13 AM, Bin.Cheng  wrote:
> On Tue, May 20, 2014 at 1:30 AM, Jeff Law  wrote:
>> On 05/19/14 00:38, Bin.Cheng wrote:
>>>
>>> On Sat, May 17, 2014 at 12:32 AM, Jeff Law  wrote:

 On 05/16/14 04:07, Bin.Cheng wrote:

 But can't you go through movXX to generate either the simple insn on the
 ARM
 or the PARALLEL on the thumb?

>>> Yes, I think it's more than upsizing the mode.  There is another
>>> example from one of x86's candidate peephole patch at
>>> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html
>>>
>>> The patch wants to do below transformation, which I think is very
>>> target dependent.
>>
>> Presumably there's no way to go through an expander here?
> I don't know very much with respect to this case, maybe the patch
> author can help here.
>

I just checked the expand result for my case. TER could make expand
see the two intrinsics at the same time so in theory it is possible to
make the merge happen in expand. I havn't looked into detail.

Thanks,
Wei.

Re: [PATCH] Fix ICE in rtl-optimization/PR61220, PR61225

2014-05-20 Thread Jeff Law


On 05/20/14 01:11, Zhenqiang Chen wrote:

Hi,

The patch fix ICE issue triggered by shrink-wrapping enhancement.

Bootstrap and no make check regression on X86-64.

OK for trunk?

Thanks!
-Zhenqiang


2014-05-20  Zhenqiang Chen  

 PR rtl-optimization/61220
 Part of PR rtl-optimization/61225
 * shrink-wrap.c (move_insn_for_shrink_wrap): Skip SP and FP adjustment
 insn; skip split_edge for a block without only on successor.

testsuite/ChangeLog:
2014-05-20  Zhenqiang Chen  

 * gcc.dg/pr61220.c: New test.
 * gcc.dg/shrink-wrap-loop.c: Disable for x86_64 -m32 mode.


[ ... ]




/* Make sure that the source register isn't defined later in BB.  */
@@ -204,6 +209,10 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
/* Create a new basic block on the edge.  */
if (EDGE_COUNT (next_block->preds) == 2)
  {
+  /* split_edge for a block with only one successor is meaningless.  */
+  if (EDGE_COUNT (bb->succs) == 1)
+   return false;
+
next_block = split_edge (live_edge);

So don't you just want to use:

if (EDGE_CRITICAL_P (live_edge))

to replace the two explicit checks of EDGE_COUNT on the preds & succs?

jeff

Commit: MSP430: Fix some gcc testsuite failures

2014-05-20 Thread Nick Clifton

Hi Guys,

  I am checking in the patch below (to both the mainline and the 4.9
  branch) in order to fix some unexpected failures in the gcc testsuite
  with the msp430-elf toolchain.  With this patch in place the following
  tests now pass:

gcc.c-torture/execute/builtins/memcpy-chk.c
gcc.c-torture/execute/builtins/memmove.c
gcc.c-torture/execute/builtins/mempcpy-chk.c
gcc.c-torture/execute/builtins/mempcpy.c
gcc.c-torture/execute/builtins/strcpy-chk.c
gcc.c-torture/execute/builtins/strcpy.c
gcc.c-torture/execute/20020412-1.c
gcc.c-torture/execute/20050826-2.c
gcc.c-torture/execute/20050826-2.c
gcc.c-torture/execute/920908-1.c
gcc.c-torture/execute/pr56205.c
gcc.c-torture/execute/strct-varg-1.c
gcc.c-torture/execute/va-arg-22.c

  No regressions are introduced by the patch.

  The patch fixes two bugs - the first was that the zero_extendpsisi2
  pattern was confusing gcc about the number of hard registers needed to
  hold an SImode value.

  The second fixes a problem with expanding va_arg when pointers are not
  a multiple of the word size.

Cheers
  Nick

gcc/ChangeLog
2014-05-20  Nick Clifton  

* config/msp430/msp430.c (TARGET_GIMPLIFY_VA_ARG_EXPR): Define.
(msp430_gimplify_va_arg_expr): New function.
(msp430_print_operand): Handle (CONST (ZERO_EXTRACT)).

* config/msp430/msp430.md (zero_extendpsisi2): Use + constraint on
operand 0 in order to prevent confusion about the number of
registers involved.



msp430.patch.xz
Description: application/xz

Re: [GCC RFC]A new and simple pass merging paired load store instructions

2014-05-20 Thread Jeff Law


On 05/20/14 01:13, Bin.Cheng wrote:


The idea being that common cases where a pair moves can be turned into a
single wider move without having to write target code to make that happen
much of the time.  ie 2xQI->HI, 2xHI->SI, 2xSI->DI 2xSF->DF.  For things
outside those simple cases, fall back to a target hook or a target expander.

This is practicable, but the question is why we would bother with
general cases if the hook interface is needed anyway.  Is it because
target calls are generally more expensive?
No, it's because most (all?) targets would benefit without having to 
write target dependent code.


jeff

[PATCH v2] Replace C/C++ void_zero_node with a VOID_CST tree code

2014-05-20 Thread Richard Sandiford

This is an updated version of the patch to replace void_zero_node with
void_node.  The main controversial point/hack in the original was that
it handled void_node in the gimplifier as a zero constant.  Jason sent
me a patch that traps the dummy object in the front end, meaning that
it no longer escapes to the gimplifier.

I tested the combination of both patches on x86_64-linux-gnu.  I think
Richard has already approved the middle-end bits, but OK for C++?
Jason, should I commit your patch or will you?

Thanks,
Richard


gcc/
* tree.def (VOID_CST): New.
* tree-core.h (TI_VOID): New.
* tree.h (void_node): New.
* tree.c (tree_node_structure_for_code, tree_code_size)
(iterative_hash_expr): Handle VOID_CST.
(build_common_tree_nodes): Initialize void_node.

gcc/c-family/
* c-common.h (CTI_VOID_ZERO, void_zero_node): Delete.
* c-common.c (c_common_nodes_and_builtins): Don't initialize
void_zero_node.
* c-pretty-print.c (pp_c_void_constant): New function.
(c_pretty_printer::constant, c_pretty_printer::primary_expression)
(c_pretty_printer::expression): Handle VOID_CST.
* cilk.c (extract_free_variables): Likewise.
* c-ubsan.c (ubsan_instrument_division, ubsan_instrument_shift)
(ubsan_instrument_vla): Use void_node instead of void_zero_node.

gcc/c/
* c-array-notation.c (expand_array_notations): Use void_node
instead of void_zero_node.

gcc/cp/
* cvt.c (convert_to_void): Use void_node instead of void_zero_node.
* cp-array-notation.c (replace_invariant_exprs): Likewise.
(expand_array_notation): Handle VOID_CST.
* error.c (dump_expr): Likewise.
* cxx-pretty-print.c (cxx_pretty_printer::primary_expression)
(cxx_pretty_printer::expression): Likewise.
(pp_cxx_new_expression): Use void_node instead of void_zero_node.
* decl.c (register_dtor_fn): Likewise.
* init.c (build_raw_new_expr, build_new_1, build_vec_init)
(build_delete, push_base_cleanups): Likewise.
* mangle.c (write_expression): Likewise.
* semantics.c (finish_break_stmt, empty_expr_stmt_p): Likewise.
* pt.c (tsubst_decl, tsubst_copy_and_build): Likewise.
(tsubst, tsubst_copy, build_non_dependent_expr): Handle VOID_CST.
* tree.c (cp_tree_equal): Likewise.
(build_dummy_object, is_dummy_object, stabilize_expr): Use void_node
instead of void_zero_node.
* typeck.c (check_return_expr): Likewise.
* typeck2.c (build_functional_cast): Likewise.

Index: gcc/tree.def
===
--- gcc/tree.def2014-05-19 11:15:16.580535297 +0100
+++ gcc/tree.def2014-05-20 13:49:09.137734209 +0100
@@ -257,6 +257,8 @@ DEFTREECODE (LANG_TYPE, "lang_type", tcc
 
 /* First, the constants.  */
 
+DEFTREECODE (VOID_CST, "void_cst", tcc_constant, 0)
+
 /* Contents are in an array of HOST_WIDE_INTs.
 
We often access these constants both in their native precision and
Index: gcc/tree-core.h
===
--- gcc/tree-core.h 2014-05-20 13:48:56.299756356 +0100
+++ gcc/tree-core.h 2014-05-20 13:49:09.135734213 +0100
@@ -410,6 +410,8 @@ enum tree_index {
   TI_UINT32_TYPE,
   TI_UINT64_TYPE,
 
+  TI_VOID,
+
   TI_INTEGER_ZERO,
   TI_INTEGER_ONE,
   TI_INTEGER_THREE,
Index: gcc/tree.h
===
--- gcc/tree.h  2014-05-20 13:48:56.299756356 +0100
+++ gcc/tree.h  2014-05-20 13:49:09.138734208 +0100
@@ -3245,6 +3245,8 @@ #define uint16_type_node  global_trees[T
 #define uint32_type_node   global_trees[TI_UINT32_TYPE]
 #define uint64_type_node   global_trees[TI_UINT64_TYPE]
 
+#define void_node  global_trees[TI_VOID]
+
 #define integer_zero_node  global_trees[TI_INTEGER_ZERO]
 #define integer_one_node   global_trees[TI_INTEGER_ONE]
 #define integer_three_node  global_trees[TI_INTEGER_THREE]
Index: gcc/tree.c
===
--- gcc/tree.c  2014-05-20 13:48:56.299756356 +0100
+++ gcc/tree.c  2014-05-20 13:49:09.137734209 +0100
@@ -383,6 +383,7 @@ tree_node_structure_for_code (enum tree_
   switch (code)
 {
   /* tcc_constant cases.  */
+case VOID_CST: return TS_TYPED;
 case INTEGER_CST:  return TS_INT_CST;
 case REAL_CST: return TS_REAL_CST;
 case FIXED_CST:return TS_FIXED_CST;
@@ -652,6 +653,7 @@ tree_code_size (enum tree_code code)
 case tcc_constant:  /* a constant */
   switch (code)
{
+   case VOID_CST:  return sizeof (struct tree_typed);
case INTEGER_CST:   gcc_unreachable ();
case REAL_CST:  return sizeof (struct tree_real_cst);
case FIXED_CST: retu

Re: [PATCH] Fix PR61221

2014-05-20 Thread Jeff Law


On 05/20/14 02:06, Richard Biener wrote:


We're still going to have problems if SSA names are re-used
(as even released SSA names have to remain valid lattice
entries after this).  But currently nothing will create new
SSA names (fingers crossing) during eliminate.  A followup
patch of mine will though, thus I'll do that two-staged
SSA name release/reuse list you thought would maybe fix this
issue.

Sounds good (the two-staged release/reuse).

The other thing I had pondered was two modes.  One with immediate re-use 
and one with the two-staged release/reuse.


But before doing anything my plan was to revisit the need for the 
release/reuse code.  That code was written when we had to go completely 
out of SSA, then back into SSA in the first incarnation of the SSA jump 
threading support.


With the incremental updating of the SSA graph, we may find that we 
really just don't need this code anymore.  I wouldn't lose any sleep if 
after a bit of poking you came to that conclusion and just ripped out 
the release/reuse code completely.


jeff

Re: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect Cilk_sync usage

2014-05-20 Thread Jeff Law


On 05/20/14 08:10, Zamyatin, Igor wrote:

Please look then on the following patch.

Regtested successfully on x86_64.
Is it ok for trunk and 4.9?

gcc/cp/ChangeLog:

2014-05-20  Igor Zamyatin  

PR c/60189
* parser.c (cp_parser_postfix_expression): Move handling of cilk_sync
from here to...
(cp_parser_statement): ...here. Make sure only semicolon can go after
Cilk_sync.

gcc/testsuite/ChangeLog:

2014-05-20  Igor Zamyatin  

PR c++/60189
* c-c++-common/cilk-plus/CK/invalid_sync.cс: New test.

This is fine for both the trunk and the 4.9 branch.

Thanks,
Jeff

Re: -fuse-caller-save - Collect register usage information

2014-05-20 Thread Eric Botcazou

> The test in get_call_reg_set_usage for flag_use_caller_save and the hook is
> strictly speaking not necessary. But it's the interface function to retrieve
> the collected register usage information, so it seems a good location to do
> an early-out. I've left it in for now.

But the test for targetm.call_fusage_contains_non_callee_clobbers seems to be 
useless in practice if -fuse-caller-save isn't the default, so I'd remove it.

OK with that change and a head comment for get_call_cgraph_node, unless Jan 
wants to move get_call_fndecl and get_call_cgraph_node to cgraph.c.

-- 
Eric Botcazou

Re: [PATCH, PR61191, Cilk+] Fix ICE on syntax error

2014-05-20 Thread Jeff Law


On 05/20/14 09:27, H.J. Lu wrote:

On Tue, May 20, 2014 at 7:31 AM, Zamyatin, Igor  wrote:

Hi all!

The following patch fixes the ICE for the cilk code with syntax error.

Regtested on x86_64.
Ok for trunk and 4.9?

Thanks,
Igor


gcc/c/ChangeLog:

2014-05-20  Igor Zamyatin  

* c-array-notation.c (fix_builtin_array_notation_fn): Check invalid
function parameters.


diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
index 0ac6ba8..127f9a1 100644
--- a/gcc/c/c-array-notation.c
+++ b/gcc/c/c-array-notation.c
@@ -229,6 +229,8 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
/* Fully fold any EXCESSIVE_PRECISION EXPR that can occur in the function
   parameter.  */
func_parm = c_fully_fold (func_parm, false, NULL);
+  if (func_parm == error_mark_node)
+return error_mark_node;

location = EXPR_LOCATION (an_builtin_fn);



You should include a testcase.

Agreed.  This definitely should include a testcase.

jeff

Fix libgo build

2014-05-20 Thread Jan Hubicka

Hi,
libgo currently does not build because of duplicated symbols. This is because of
confusion in the output machinery, where comdat local is output as a global
symbol.  This is because of confusing use of DECL_ONE_ONLY. This bug seems to be
there since introduction of comdat locals in gcc 4.9, but it was not harmful, 
because
all symbols C++ produce have unique name.


Instead of one-liner to special case comdat locals in elfos.h, it seems better
to make DECL_ONE_ONLY more meaningful and make it return true only on real 
comdats,
not on comdat locals.
This patch does that and revisits its use in the back-end.  Basically all tests
that care about the fact if function should go into special section needs to
be DECL_COMDAT_GROUP (decl), because we care if function is in comdat group.
All tests that care about special semantic of comdats needs to be DECL_ONE_ONLY.

For those who are interested, the oneliner is:

Index: config/elfos.h
===
--- config/elfos.h  (revision 210623)
+++ config/elfos.h  (working copy)
@@ -303,6 +303,7 @@
 RTLD_LOCAL.  Don't use gnu_unique_object for typeinfo, \
 vtables and other read-only artificial decls.  */  \
   if (USE_GNU_UNIQUE_OBJECT && DECL_ONE_ONLY (DECL)
\
+ && TREE_PUBLIC (DECL) \
  && (!DECL_ARTIFICIAL (DECL) || !TREE_READONLY (DECL)))\
ASM_OUTPUT_TYPE_DIRECTIVE (FILE, NAME, "gnu_unique_object");\
   else \

Bootstrapped/regtested x86_64-linux, I will give it some further testing and 
commit
at afternoon if there are no complains.

Honza

PR go/61232
* tree.h (DECL_ONE_ONLY): Return true only for externally visible
symbols.
* except.c (switch_to_exception_section, resolve_unique_section,
get_named_text_section, default_function_rodata_section, 
align_variable, get_block_for_decl, default_section_type_flags):
Use DECL_COMDAT_GROUP instead of DECL_ONE_ONLY.
* symtab.c (symtab_add_to_same_comdat_group,
symtab_make_decl_local, fixup_same_cpp_alias_visibility,
symtab_nonoverwritable_alias, symtab_get_symbol_partitioning_class):
Likewise.
* cgraphclones.c (cgraph_create_virtual_clone): Likewise.
* bb-reorder.c (pass_partition_blocks::gate): Likewise.
* config/c6x/c6x.c (c6x_elf_unique_section): Likewise.
(c6x_function_in_section_p): Likewise.
* config/darwin.c (machopic_select_section): Likewise.
* config/arm/arm.c (arm_function_in_section_p): Likewise.
* config/mips/mips.c (mips_function_rodata_section): Likewise.
* config/mep/mep.c (mep_select_section): LIkewise.
* config/i386/i386.c (x86_64_elf_unique_section): Likewise.

Index: tree.h
===
--- tree.h  (revision 210521)
+++ tree.h  (working copy)
@@ -2327,7 +2327,8 @@ extern void decl_value_expr_insert (tree
 
 /* Used in TREE_PUBLIC decls to indicate that copies of this DECL in
multiple translation units should be merged.  */
-#define DECL_ONE_ONLY(NODE) (DECL_COMDAT_GROUP (NODE) != NULL_TREE)
+#define DECL_ONE_ONLY(NODE) (DECL_COMDAT_GROUP (NODE) != NULL_TREE \
+&& (TREE_PUBLIC (NODE) || DECL_EXTERNAL (NODE)))
 
 /* The name of the object as the assembler will see it (but before any
translations made by ASM_OUTPUT_LABELREF).  Often this is the same
Index: except.c
===
--- except.c(revision 210521)
+++ except.c(working copy)
@@ -2854,12 +2854,12 @@ switch_to_exception_section (const char
 
 #ifdef HAVE_LD_EH_GC_SECTIONS
  if (flag_function_sections
- || (DECL_ONE_ONLY (current_function_decl) && HAVE_COMDAT_GROUP))
+ || (DECL_COMDAT_GROUP (current_function_decl) && 
HAVE_COMDAT_GROUP))
{
  char *section_name = XNEWVEC (char, strlen (fnname) + 32);
  /* The EH table must match the code section, so only mark
 it linkonce if we have COMDAT groups to tie them together.  */
- if (DECL_ONE_ONLY (current_function_decl) && HAVE_COMDAT_GROUP)
+ if (DECL_COMDAT_GROUP (current_function_decl) && 
HAVE_COMDAT_GROUP)
flags |= SECTION_LINKONCE;
  sprintf (section_name, ".gcc_except_table.%s", fnname);
  s = get_section (section_name, flags, current_function_decl);
Index: varasm.c
===
--- varasm.c(revision 210521)
+++ varasm.c(working copy)
@@ -428,7 +428,7 @@ resolve_unique_section (tree decl, int r
   if (DECL_SECTION_NAME (decl) == NULL_TREE
   && targetm_common.have_named_sections
   && (flag_function_or_data

Re: [PATCH] Regression fix for PR target/61223

2014-05-20 Thread Ramana Radhakrishnan


On 05/20/14 17:05, Alexey Merzlyakov wrote:

Hi all,

This is a fix for thumb1 build fail on trunk appeared since
rev.210515(pr60758).

On thumb1 targets the LR can not be used as argument of POP instruction.
To keep a support of __cxa_end_cleanup backtracing on thumb1
we can make additional register operations before push/pop.
But I guess, this will have a negative effect on __cxa_end_cleanup
performance.
So, I propose to preserve __cxa_end_cleanup backtracing on thumb2
architectures and revert it on thumb1.

Regtest was finished with no regressions on C/C++ arm-linux-gnueabi(sf).


For now, please revert your original patch and one of Richard or me will 
try to look at it in the morning. The thumb1 case is slightly more 
complicated than this.


I don't like this approach as you are introducing the problem again in 
ARM state and leaving the problem as is for Thumb1.


regards
Ramana



Best regards,
Merzlyakov Alexey

Re: [C++ Patch/RFC] PR 58753 & 58930

2014-05-20 Thread Paolo Carlini


Hi,

On 05/20/2014 05:08 PM, Jason Merrill wrote:

On 05/20/2014 10:50 AM, Paolo Carlini wrote:

+  if (TREE_TYPE (init) != type)
+{
+  if (BRACE_ENCLOSED_INITIALIZER_P (init)
+  && CP_AGGREGATE_TYPE_P (type))
+init = reshape_init (type, init, tf_warning_or_error);
+  init = digest_init (type, init, tf_warning_or_error);


Instead of this let's factor the relevant code out of 
cp_parser_late_parse_one_default_arg and use it here as well.

Indeed. Thus I'm finishing testing the below.

Thanks,
Paolo.

///

/cp
2014-05-20  Paolo Carlini  

PR c++/58753
PR c++/58930
PR c++/58704
* typeck2.c (digest_nsdmi_init): New.
* parser.c (cp_parser_late_parse_one_default_arg): Use it.
* init.c (get_nsdmi): Likewise.
* cp-tree.h (digest_nsdmi_init): Declare.

/testsuite
2014-05-20  Paolo Carlini  

PR c++/58753
PR c++/58930
PR c++/58704
* g++.dg/cpp0x/nsdmi-template11.C: New.
* g++.dg/cpp0x/nsdmi-template12.C: Likewise.
* g++.dg/cpp0x/nsdmi-template13.C: Likewise.
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 210645)
+++ cp/cp-tree.h(working copy)
@@ -6172,6 +6172,7 @@ extern tree store_init_value  (tree, 
tree, vec
+struct BaseHandler
+{
+  SampleModule module_ { 0 };
+};
+
+BaseHandler a;
Index: testsuite/g++.dg/cpp0x/nsdmi-template12.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template12.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template12.C   (working copy)
@@ -0,0 +1,17 @@
+// PR c++/58753
+// { dg-do compile { target c++11 } }
+
+#include 
+
+template 
+struct X {X(std::initializer_list) {}};
+
+template  
+class T {
+  X x{1}; 
+}; 
+
+int main()
+{
+  T t;
+}
Index: testsuite/g++.dg/cpp0x/nsdmi-template13.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template13.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template13.C   (working copy)
@@ -0,0 +1,11 @@
+// PR c++/58704
+// { dg-do compile { target c++11 } }
+
+struct A {};
+
+template struct B
+{
+  A a[1] = { };
+};
+
+B b;

Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-20 Thread Alexey Merzlyakov



On 20.05.2014 17:16, Richard Earnshaw wrote:

On 20/05/14 14:12, Ramana Radhakrishnan wrote:

The following PR is opened for this problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61223

Thumb1 failure was also detected and reported in pr60758.
I've proposed a thumb1 bugfix there. Regtest for the fix currently is in
progress.

Patches must be proposed on gcc-patches and / or in this case

Never OR.  Patches must always go to gcc-patches.  They may *also* go to
other lists, if appropriate.

R.


libstdc++-patches mailing list. Attaching patches to bugzilla does not
ensure review.

Thanks,
Ramana


Best regards,
Merzlyakov Alexey


Thanks, created a thread with proposed fix:
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01640.html

Best regards,
Merzlyakov Alexey

[PATCH] Regression fix for PR target/61223

2014-05-20 Thread Alexey Merzlyakov


Hi all,

This is a fix for thumb1 build fail on trunk appeared since 
rev.210515(pr60758).


On thumb1 targets the LR can not be used as argument of POP instruction.
To keep a support of __cxa_end_cleanup backtracing on thumb1
we can make additional register operations before push/pop.
But I guess, this will have a negative effect on __cxa_end_cleanup 
performance.
So, I propose to preserve __cxa_end_cleanup backtracing on thumb2 
architectures and revert it on thumb1.


Regtest was finished with no regressions on C/C++ arm-linux-gnueabi(sf).

Best regards,
Merzlyakov Alexey

2014-05-20  Alexey Merzlyakov  

	PR target/61223
	* libsupc++/eh_arm.cc (__cxa_end_cleanup): Revert backtracing support
	on thumb1.

diff --git a/libstdc++-v3/libsupc++/eh_arm.cc b/libstdc++-v3/libsupc++/eh_arm.cc
index 6a45af5..8ad327d 100644
--- a/libstdc++-v3/libsupc++/eh_arm.cc
+++ b/libstdc++-v3/libsupc++/eh_arm.cc
@@ -200,7 +200,8 @@ asm (".global __cxa_end_cleanup\n"
 #else
 // Assembly wrapper to call __gnu_end_cleanup without clobbering r1-r3.
 // Also push lr to preserve stack alignment and to allow backtracing.
-#ifdef __thumb__
+// On thumb1 architectures push r4 because pop lr is not available.
+#ifdef __thumb2__
 asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.global __cxa_end_cleanup\n"
 "	.type __cxa_end_cleanup, \"function\"\n"
@@ -214,6 +215,17 @@ asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	bl\t_Unwind_Resume @ Never returns\n"
 "	.fnend\n"
 "	.popsection\n");
+#elif defined(__thumb__)
+asm ("  .pushsection .text.__cxa_end_cleanup\n"
+"	.global __cxa_end_cleanup\n"
+"	.type __cxa_end_cleanup, \"function\"\n"
+"	.thumb_func\n"
+"__cxa_end_cleanup:\n"
+"	push\t{r1, r2, r3, r4}\n"
+"	bl\t__gnu_end_cleanup\n"
+"	pop\t{r1, r2, r3, r4}\n"
+"	bl\t_Unwind_Resume @ Never returns\n"
+"	.popsection\n");
 #else
 asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.global __cxa_end_cleanup\n"

Re: [Google/4_8] LIPO COMDAT profile fixups

2014-05-20 Thread Xinliang David Li

On Tue, May 20, 2014 at 6:32 AM, Teresa Johnson  wrote:
> On Mon, May 19, 2014 at 11:51 PM, Xinliang David Li  
> wrote:
>> Why duplicating the merger functions in dyn-ipa.c? Should those in
>> libgcov-merge.c be reused?
>
> The merger functions in libgcov-merge.c use a macro to either read the
> counter to merge from a buffer in memory (when IN_GCOV_TOOL) or from
> the gcda file otherwise. In this case I need to merge from another
> array, and it needs to do so both IN_GCOV_TOOL and otherwise. So to
> reuse the merger functions I would have had to get the counter value
> from a new argument, guarded by a conditional. I didn't want to change
> the interface of those existing merge functions, or more importantly,
> make them less efficient with the added condition since they are
> invoked frequently during gcda file merging.
>

ok -- something to think about in the future.

For now, please add a comment in libgcov-utils.c before
ctr_merge_functions to point to the dup merge functions in dyn-ipa.c.

>>
>> The refactoring of gcov_exit_write_gcda should probably be done in a
>> separate patch -- preferably submitted to trunk too.
>
> The refactoring is necessary for this patch since I need to invoke the
> outlined functionality in gcov_dump_module_info as well.
>
> I could submit that to trunk, but it has to go in to the google
> branches either with or preceeding this patch.
>

ok for google branch after fixing the formatting issues

1) calller --> caller
2) many places where there is missing space: e.g,
gcov_fixup_counters_checksum( ..

David

> Thanks,
> Teresa
>
>>
>> David
>>
>> On Mon, May 19, 2014 at 10:08 PM, Teresa Johnson  
>> wrote:
>>> Ping.
>>> Teresa
>>>
>>> On Wed, May 14, 2014 at 4:39 PM, Teresa Johnson  
>>> wrote:
 This patch applies profile fixups to COMDATs on the dyn ipa callgraph
 at the end of LIPO module grouping (either in the profile gen run or
 in gcov-tool). This is to address issues with missing profiles in the
 out-of-line COMDAT copies not selected by the linker, and indirect
 call profiles that target a module not included in the module group.
 By default, both fixups are enabled, but can be controlled by a
 profile-gen parameter, an environment variable, and a gcov-tool
 option.

 The fixups assume that functions with the same lineno and cfg checksum
 are copies of the same COMDAT. This is made more likely by ensuring
 that in LIPO mode we include the full mangled name in the
 lineno_checksum.

 For the counter fixup, we merge all non-zero profiles with matching
 checksums and copy the merged profile into copies with all-zero
 profiles.

 For the indirect counter fixup, if an indirect call profile target is
 not in the module group, we look for a matching checksum copy in the
 primary module and if exactly one is found we change the target to
 that.

 If any fixups are applied, the gcda files are rewritten after module 
 grouping.

 This also required a couple of other changes to the optimizer. During
 cgraph node resolution, we were arbitrarily selecting a copy that had
 non-zero profiles. Now there are many more to choose from, so we will
 prefer the primary module copy if it is non-zero. Also, during cloning
 for inlining, we only want to update the profile on the callee node if
 we are inlining into the resolved node caller node. We were already
 doing this for AutoFDO, and need to do this here now that many node
 copies have the same profile.

 Patch attached. Tested with regression tests and internal benchmarks.
 Ok for google branches?

 Thanks,
 Teresa

 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[Patch, ARM] Fix pr 61208

2014-05-20 Thread Richard Earnshaw

PR 61208 is a wrong code bug in Thumb2 where we can generate out of
range branches due to incorrect instruction size calculations.  It's
mostly gone latent on 4.9 and trunk (though could still happen at -O0
where splitting is done during final instruction generation).

Complicating things slightly is the fact that prior to 4.9 we had a
single alternative that was emitted statically; we now use an
insn_and_split (hence the reduced impact on 4.9).  That means different
patches are needed: one for 4.9 and trunk; another for 4.8 and earlier.

I've tried to minimize the 4.8 version as much as possible to reduce the
risk.  It seems simplest to just provide a new alternative for use with
Thumb2 that has the correct length.  Consequently there are two versions
of this patch attached.  One for 4.7 and 4.8, the other for 4.9 and trunk.

PR target/61208
* arm.md (arm_cmpdi_unsigned): Fix length calculation for Thumb2.

Testing of the backport version on 4.8 is OK, the trunk version is
ongoing, though I've smoke tested it.

Richi, is the backport version OK to go into 4.8?


R.
Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md   (revision 210617)
+++ gcc/config/arm/arm.md   (working copy)
@@ -8371,8 +8371,8 @@ (define_insn_and_split "*arm_cmpdi_insn"
 
 (define_insn_and_split "*arm_cmpdi_unsigned"
   [(set (reg:CC_CZ CC_REGNUM)
-(compare:CC_CZ (match_operand:DI 0 "s_register_operand" "l,r,r")
-   (match_operand:DI 1 "arm_di_operand" "Py,r,rDi")))]
+(compare:CC_CZ (match_operand:DI 0 "s_register_operand" "l,r,r,r")
+   (match_operand:DI 1 "arm_di_operand" 
"Py,r,Di,rDi")))]
 
   "TARGET_32BIT"
   "#"   ; "cmp\\t%R0, %R1\;it eq\;cmpeq\\t%Q0, %Q1"
@@ -8392,9 +8392,9 @@ (define_insn_and_split "*arm_cmpdi_unsig
 operands[1] = gen_lowpart (SImode, operands[1]);
   }
   [(set_attr "conds" "set")
-   (set_attr "enabled_for_depr_it" "yes,yes,no")
-   (set_attr "arch" "t2,t2,*")
-   (set_attr "length" "6,6,8")
+   (set_attr "enabled_for_depr_it" "yes,yes,no,*")
+   (set_attr "arch" "t2,t2,t2,a")
+   (set_attr "length" "6,6,10,8")
(set_attr "type" "multiple")]
 )
 --- gcc/config/arm/arm.md   (revision 210631)
+++ gcc/config/arm/arm.md   (local)
@@ -7630,12 +7630,13 @@ (define_insn "*arm_cmpdi_insn"
 
 (define_insn "*arm_cmpdi_unsigned"
   [(set (reg:CC_CZ CC_REGNUM)
-   (compare:CC_CZ (match_operand:DI 0 "s_register_operand" "r")
-  (match_operand:DI 1 "arm_di_operand" "rDi")))]
+   (compare:CC_CZ (match_operand:DI 0 "s_register_operand" "r,r")
+  (match_operand:DI 1 "arm_di_operand" "rDi,rDi")))]
   "TARGET_32BIT"
   "cmp\\t%R0, %R1\;it eq\;cmpeq\\t%Q0, %Q1"
   [(set_attr "conds" "set")
-   (set_attr "length" "8")]
+   (set_attr "arch" "a,t2")
+   (set_attr "length" "8,10")]
 )
 
 (define_insn "*arm_cmpdi_zero"

Re: [PATCH, PR61191, Cilk+] Fix ICE on syntax error

2014-05-20 Thread H.J. Lu

On Tue, May 20, 2014 at 7:31 AM, Zamyatin, Igor  wrote:
> Hi all!
>
> The following patch fixes the ICE for the cilk code with syntax error.
>
> Regtested on x86_64.
> Ok for trunk and 4.9?
>
> Thanks,
> Igor
>
>
> gcc/c/ChangeLog:
>
> 2014-05-20  Igor Zamyatin  
>
> * c-array-notation.c (fix_builtin_array_notation_fn): Check invalid
> function parameters.
>
>
> diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
> index 0ac6ba8..127f9a1 100644
> --- a/gcc/c/c-array-notation.c
> +++ b/gcc/c/c-array-notation.c
> @@ -229,6 +229,8 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var)
>/* Fully fold any EXCESSIVE_PRECISION EXPR that can occur in the function
>   parameter.  */
>func_parm = c_fully_fold (func_parm, false, NULL);
> +  if (func_parm == error_mark_node)
> +return error_mark_node;
>
>location = EXPR_LOCATION (an_builtin_fn);
>

You should include a testcase.


-- 
H.J.

Re: [build, doc, testsuite] Centralise clearing hardware capabilities with Sun ld

2014-05-20 Thread Mike Stump

On May 20, 2014, at 6:03 AM, Rainer Orth  wrote:
> The following patch implements what I've outlined there

> Ok for mainline and 4.9 branch if those pass?

For the test suite bits, Ok.

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-20 Thread H.J. Lu

On Tue, May 20, 2014 at 5:00 AM, Kirill Yukhin  wrote:
> Hello,
> On 19 May 09:58, H.J. Lu wrote:
>> On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak  wrote:
>> > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu  wrote:
>> >
>>  Uros,
>>  I am looking into libreoffice size and the data alignment seems to make 
>>  huge
>>  difference. Data section has grown from 5.8MB to 6.3MB in between GCC 
>>  4.8 and 4.9,
>>  while clang produces 5.2MB.
>> 
>>  The two patches I posted to not align vtables and RTTI reduces it to 
>>  5.7MB, but
>>  But perhaps we want to revisit the alignment rules.  The optimization 
>>  manuals
>>  usually care only about performance critical loops.  Perhaps we can 
>>  make the
>>  rules to align only bigger datastructures, or so at least for -O2.
>> >>>
>> >>> Based on the above quote, "Misaligned data access can incur
>> >>> significant performance penalties." and the fact that this particular
>> >>> alignment rule has some compatibility issues with previous versions of
>> >>> gcc (these were later fixed by Jakub), I'd rather leave this rule as
>> >>> is. However, if the access is from the cold section, we can perhaps
>> >>> avoid extra alignment, while avoiding those compatibility issues.
>> >>>
>> >>
>> >> It is excessive to align
>> >>
>> >> struct foo
>> >> {
>> >>   int x1;
>> >>   int x2;
>> >>   char x3;
>> >>   int x4;
>> >>   int x5;
>> >>   char x6;
>> >>   int x7;
>> >>   int x8;
>> >> };
>> >>
>> >> to 32 bytes and align
>> >>
>> >> struct foo
>> >> {
>> >>   int x1;
>> >>   int x2;
>> >>   char x3;
>> >>   int x4;
>> >>   int x5;
>> >>   char x6;
>> >>   int x7[9];
>> >>   int x8;
>> >> };
>> >>
>> >> to 64 bytes.  What performance gain does it provide?
>> >
>> > Avoids "significant performance penalties," perhaps?
>> >
>>
>> Kirill, do we have performance data for excessive alignment
>> vs ABI alignment?
> Nope, we have no actual data showing performance impact on such changes,
> sorry.
>
> We may try such a change on HSW machine (on Spec 2006), will it be useful?
>
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -26576,7 +26576,7 @@ ix86_data_alignment (tree type, int align, bool opt)
>   used to assume.  */
>
>int max_align_compat
> -= optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
> += optimize_size ? BITS_PER_WORD : MIN (128, MAX_OFILE_ALIGNMENT);
>
>/* A data structure, equal or greater than the size of a cache line
>   (64 bytes in the Pentium 4 and other recent Intel processors, including
>
>

ABI alignment should be sufficient for correctness. Bigger alignments
are supposed to give better performance.  Can you try this patch on
HSW and SLM to see if it has any impact on performance?

-- 
H.J.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c0a46ed..4879110 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -26568,39 +26568,6 @@ ix86_constant_alignment (tree exp, int align)
 int
 ix86_data_alignment (tree type, int align, bool opt)
 {
-  /* GCC 4.8 and earlier used to incorrectly assume this alignment even
- for symbols from other compilation units or symbols that don't need
- to bind locally.  In order to preserve some ABI compatibility with
- those compilers, ensure we don't decrease alignment from what we
- used to assume.  */
-
-  int max_align_compat
-= optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
-
-  /* A data structure, equal or greater than the size of a cache line
- (64 bytes in the Pentium 4 and other recent Intel processors, including
- processors based on Intel Core microarchitecture) should be aligned
- so that its base address is a multiple of a cache line size.  */
-
-  int max_align
-= MIN ((unsigned) ix86_tune_cost->prefetch_block * 8, MAX_OFILE_ALIGNMENT);
-
-  if (max_align < BITS_PER_WORD)
-max_align = BITS_PER_WORD;
-
-  if (opt
-  && AGGREGATE_TYPE_P (type)
-  && TYPE_SIZE (type)
-  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
-{
-  if (wi::geu_p (TYPE_SIZE (type), max_align_compat)
-  && align < max_align_compat)
- align = max_align_compat;
-   if (wi::geu_p (TYPE_SIZE (type), max_align)
-   && align < max_align)
- align = max_align;
-}
-
   /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
  to 16byte boundary.  */
   if (TARGET_64BIT)
@@ -26616,6 +26583,9 @@ ix86_data_alignment (tree type, int align, bool opt)
   if (!opt)
 return align;

+  if (align < BITS_PER_WORD)
+align = BITS_PER_WORD;
+
   if (TREE_CODE (type) == ARRAY_TYPE)
 {
   if (TYPE_MODE (TREE_TYPE (type)) == DFmode && align < 64)

Re: [RFC] HOST_WIDE_INT transition steps

2014-05-20 Thread Joseph S. Myers

On Tue, 20 May 2014, Eric Botcazou wrote:

> > Make the code base easier to understand for newcomers.  It's also a
> > documentation improvement (you see what a HOST_WIDE_INT really is),
> > alongside with [u]int64_t being less to type ...
> 
> I personally find the abstraction and the separation with the other, more 
> mundane types useful, but I guess that it's a matter of habit.

Whatever the abstraction is, HOST_WIDE_INT is a pretty poor name for it.  
I think it's something like "integer type that is big enough to store a 
target address / offset in bytes, or a target word", together with a sense 
of "integer type big enough for certain constants and such that twice it 
is big enough for other constants" (the latter being what wide-int should 
enable eliminating eventually).

But I suspect HOST_WIDE_INT is used for lots of other things as well - and 
for things for which 64 bits may not be enough (bit offsets, for example), 
and for that matter int is used for things for which HOST_WIDE_INT would 
be more appropriate (the length of a STRING_CST, for example).  So while 
there may be a meaningful abstraction somewhere in there, I don't think 
the particular places that happen to use HOST_WIDE_INT are a very good 
guide to where such an abstraction should be used - and don't think moving 
to int64_t / uint64_t would actually lose information of use for cleaning 
things up in future.

(HOST_WIDEST_INT should of course become intmax_t.  It should also not be 
used in any way that can affect code generation.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [C++ Patch/RFC] PR 58753 & 58930

2014-05-20 Thread Jason Merrill


On 05/20/2014 10:50 AM, Paolo Carlini wrote:

+  if (TREE_TYPE (init) != type)
+   {
+ if (BRACE_ENCLOSED_INITIALIZER_P (init)
+ && CP_AGGREGATE_TYPE_P (type))
+   init = reshape_init (type, init, tf_warning_or_error);
+ init = digest_init (type, init, tf_warning_or_error);


Instead of this let's factor the relevant code out of 
cp_parser_late_parse_one_default_arg and use it here as well.  That is,



  int flags = LOOKUP_IMPLICIT;
  if (DIRECT_LIST_INIT_P (parsed_arg))
flags = LOOKUP_NORMAL;
  parsed_arg = digest_init_flags (TREE_TYPE (decl), parsed_arg, flags);
  if (TREE_CODE (parsed_arg) == TARGET_EXPR)
/* This represents the whole initialization.  */
TARGET_EXPR_DIRECT_INIT_P (parsed_arg) = true;


Jason

Re: [patch] libstdc++/61011 fix --disable-libstdcxx in top-level configure

2014-05-20 Thread Jonathan Wakely


On 20/05/14 15:53 +0100, Jonathan Wakely wrote:

The first part of this patch fixes the PR, we should be adding
target-libstdc++-v3 to noconfigdirs, not libstdc++-v3.

The second part disables the target libs that can't be bootstrapped
without libstdc++.

libcilkrts and libsanitizer use lots of std:: stuff.
libitm only uses std::atomic, but in any case fails to build when
--disable-libstdcxx is used, with this error:

xg++: error: unrecognized command line option ‘-funconfigured-libstdc++-v3’

Bootstrapped with and without --disable-libstdcxx and tested on
x86_64-linux.

PR libstdc++/61011
* configure.ac (--disable-libstdcxx): Set noconfigdirs correctly.
Disable libcilkrts, libitm, libsanitizer when not building libstdc++.
* configure: Regenerate.

OK for trunk?


With the patch this time ...
commit c384fef85bca43046b65eb02267169a5613af2b8
Author: Jonathan Wakely 
Date:   Tue May 20 12:13:43 2014 +0100

	PR libstdc++/61011
	* configure.ac (--disable-libstdcxx): Set noconfigdirs correctly.
	Disable libcilkrts, libitm, libsanitizer when not building libstdc++.
	* configure: Regenerate.

diff --git a/configure.ac b/configure.ac
index 07c3a66..548525b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -435,7 +435,7 @@ AS_HELP_STRING([--disable-libstdcxx],
 ENABLE_LIBSTDCXX=$enableval,
 ENABLE_LIBSTDCXX=default)
 [if test "${ENABLE_LIBSTDCXX}" = "no" ; then
-  noconfigdirs="$noconfigdirs libstdc++-v3"
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
 fi]
 
 # Save it here so that, even in case of --enable-libgcj, if the Java
@@ -2057,9 +2057,17 @@ case ,${enable_languages},:${enable_objc_gc} in
 ;;
 esac
 
-# Disable libitm, libsanitizer, libvtv if we're not building C++
+# Disable libcilkrts, libitm, libsanitizer, libvtv if we're not building C++
 case ,${enable_languages}, in
-  *,c++,*) ;;
+  *,c++,*)
+# Disable libcilkrts, libitm, libsanitizer if we're not building libstdc++
+case "${noconfigdirs}" in
+  *target-libstdc++-v3*)
+noconfigdirs="$noconfigdirs target-libcilkrts target-libitm target-libsanitizer"
+;;
+  *) ;;
+esac
+;;
   *)
 noconfigdirs="$noconfigdirs target-libcilkrts target-libitm target-libsanitizer target-libvtv"
 ;;

Re: [patch] libstdc++/61011 fix --disable-libstdcxx in top-level configure

2014-05-20 Thread Jakub Jelinek

On Tue, May 20, 2014 at 03:53:14PM +0100, Jonathan Wakely wrote:
> The first part of this patch fixes the PR, we should be adding
> target-libstdc++-v3 to noconfigdirs, not libstdc++-v3.
> 
> The second part disables the target libs that can't be bootstrapped
> without libstdc++.
> 
> libcilkrts and libsanitizer use lots of std:: stuff.
> libitm only uses std::atomic, but in any case fails to build when
> --disable-libstdcxx is used, with this error:
> 
> xg++: error: unrecognized command line option ‘-funconfigured-libstdc++-v3’
> 
> Bootstrapped with and without --disable-libstdcxx and tested on
> x86_64-linux.
> 
>   PR libstdc++/61011
>   * configure.ac (--disable-libstdcxx): Set noconfigdirs correctly.
>   Disable libcilkrts, libitm, libsanitizer when not building libstdc++.
>   * configure: Regenerate.
> 
> OK for trunk?

ENOPATCH.

Jakub

[patch] libstdc++/61011 fix --disable-libstdcxx in top-level configure

2014-05-20 Thread Jonathan Wakely


The first part of this patch fixes the PR, we should be adding
target-libstdc++-v3 to noconfigdirs, not libstdc++-v3.

The second part disables the target libs that can't be bootstrapped
without libstdc++.

libcilkrts and libsanitizer use lots of std:: stuff.
libitm only uses std::atomic, but in any case fails to build when
--disable-libstdcxx is used, with this error:

xg++: error: unrecognized command line option ‘-funconfigured-libstdc++-v3’

Bootstrapped with and without --disable-libstdcxx and tested on
x86_64-linux.

PR libstdc++/61011
* configure.ac (--disable-libstdcxx): Set noconfigdirs correctly.
Disable libcilkrts, libitm, libsanitizer when not building libstdc++.
* configure: Regenerate.

OK for trunk?

Re: [C++ Patch/RFC] PR 58753 & 58930

2014-05-20 Thread Paolo Carlini


Hi,

On 05/19/2014 08:28 PM, Jason Merrill wrote:
How about doing digest_init in get_nsdmi, so that the conversion is 
also exposed to walk_field_subobs?
Thus, good news: something as simple as the below passes testing, works 
for the 2 bugs and for c++/58704 too. Thus, what else? Personally, I'm 
not sure about a few things: whether reshape_init is needed (I tend to 
think so, because later in perform_member_init certainly we use it); 
whether digest_init (vs digest_init_flags, cmp the parser) is fine.


Thanks again!
Paolo.

//
Index: cp/init.c
===
--- cp/init.c   (revision 210641)
+++ cp/init.c   (working copy)
@@ -529,17 +529,28 @@ tree
 get_nsdmi (tree member, bool in_ctor)
 {
   tree init;
+  tree type = TREE_TYPE (member);
   tree save_ccp = current_class_ptr;
   tree save_ccr = current_class_ref;
   if (!in_ctor)
 inject_this_parameter (DECL_CONTEXT (member), TYPE_UNQUALIFIED);
   if (DECL_LANG_SPECIFIC (member) && DECL_TEMPLATE_INFO (member))
-/* Do deferred instantiation of the NSDMI.  */
-init = (tsubst_copy_and_build
-   (DECL_INITIAL (DECL_TI_TEMPLATE (member)),
-DECL_TI_ARGS (member),
-tf_warning_or_error, member, /*function_p=*/false,
-/*integral_constant_expression_p=*/false));
+{
+  /* Do deferred instantiation of the NSDMI.  */
+  init = (tsubst_copy_and_build
+ (DECL_INITIAL (DECL_TI_TEMPLATE (member)),
+  DECL_TI_ARGS (member),
+  tf_warning_or_error, member, /*function_p=*/false,
+  /*integral_constant_expression_p=*/false));
+
+  if (TREE_TYPE (init) != type)
+   {
+ if (BRACE_ENCLOSED_INITIALIZER_P (init)
+ && CP_AGGREGATE_TYPE_P (type))
+   init = reshape_init (type, init, tf_warning_or_error);
+ init = digest_init (type, init, tf_warning_or_error);
+   }
+}
   else
 {
   init = DECL_INITIAL (member);
Index: testsuite/g++.dg/cpp0x/nsdmi-template11.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template11.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template11.C   (working copy)
@@ -0,0 +1,15 @@
+// PR c++/58930
+// { dg-do compile { target c++11 } }
+
+struct SampleModule
+{
+  explicit SampleModule (int);
+};
+
+template < typename >
+struct BaseHandler
+{
+  SampleModule module_ { 0 };
+};
+
+BaseHandler a;
Index: testsuite/g++.dg/cpp0x/nsdmi-template12.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template12.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template12.C   (working copy)
@@ -0,0 +1,17 @@
+// PR c++/58753
+// { dg-do compile { target c++11 } }
+
+#include 
+
+template 
+struct X {X(std::initializer_list) {}};
+
+template  
+class T {
+  X x{1}; 
+}; 
+
+int main()
+{
+  T t;
+}
Index: testsuite/g++.dg/cpp0x/nsdmi-template13.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template13.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template13.C   (working copy)
@@ -0,0 +1,11 @@
+// PR c++/58704
+// { dg-do compile { target c++11 } }
+
+struct A {};
+
+template struct B
+{
+  A a[1] = { };
+};
+
+B b;

Re: Ping2: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration.

2014-05-20 Thread Jason Merrill


On 05/20/2014 02:55 AM, Mark Wielaard wrote:

On Mon, May 19, 2014 at 04:50:35PM -0400, Jason Merrill wrote:

On 05/13/2014 03:21 AM, Mark Wielaard wrote:

So the debugger doesn't have to guess the properties of the enum's
underlying base type, like size, encoding and signedness.


Well, the enum already has DW_AT_byte_size.  It seems to me that it should
also have DW_AT_encoding to provide the other two properties you mention.


Right, that is the idea, since an enum doesn't provide those attributes,
it should have a reference to the underlying base type that provides them.


Yes, but why is that better than providing them directly?  The latter 
would seem to work better with non-C-family languages like Ada.


Jason

Re: RFA: cache enabled attribute by insn code

2014-05-20 Thread Mike Stump

On May 20, 2014, at 1:17 AM, Richard Sandiford  
wrote:
>> The patch gives a consistent compile-time improvement of about ~3.5%
>> on the -O0 fold-const.ii test case.

3.5 alone is really nice and 3.5 on top of 3.5 is amazing.

Re: patch to fix PR60969

2014-05-20 Thread Vladimir Makarov

On 05/19/2014 05:37 PM, James Greenhalgh wrote:
> On Fri, May 16, 2014 at 06:49:45PM +0100, Vladimir Makarov wrote:
>>The following patch fixes
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60969
>>
>> The patch was bootstrapped and tested on x86/x86-64.
>>
>> Committed as rev. 210519 to gcc 4.9 branch and as rev. 210520 to trunk.
>>
>> 2014-05-16  Vladimir Makarov  
>>
>>  PR rtl-optimization/60969
>>  * ira-costs.c (record_reg_classes): Allow only memory for pseudo.
>>  Calculate costs for this case.
>>
>> 2014-05-16  Vladimir Makarov  
>>
>>  PR rtl-optimization/60969
>>  * g++.dg/pr60969.C: New.
> This seems to have cause gcc.target/aarch64/vect-abs-compile.c to begin
> failing on aarch64-none-elf:
>
> FAIL: gcc.target/aarch64/table-intrinsics.c (internal compiler error)
> FAIL: gcc.target/aarch64/table-intrinsics.c (test for excess errors)
> Excess errors:
> /work/gcc-clean/src/gcc/gcc/testsuite/gcc.target/aarch64/table-intrinsics.c:172:1:
>  internal compiler error: Max. number of generated reload insns per insn is 
> achieved (90)
> 0x8923cd lra_constraints(bool)
>   /work/gcc-clean/src/gcc/gcc/lra-constraints.c:4140
> 0x882f62 lra(_IO_FILE*)
>   /work/gcc-clean/src/gcc/gcc/lra.c:2353
> 0x8453f6 do_reload
>   /work/gcc-clean/src/gcc/gcc/ira.c:5457
> 0x8453f6 execute
>   /work/gcc-clean/src/gcc/gcc/ira.c:5618
>
>
Sorry, I have no aarch64 machine.  Could you sent me the pre-processed
file of the test.

[PATCH, PR61191, Cilk+] Fix ICE on syntax error

2014-05-20 Thread Zamyatin, Igor

Hi all!

The following patch fixes the ICE for the cilk code with syntax error.

Regtested on x86_64.
Ok for trunk and 4.9?

Thanks,
Igor


gcc/c/ChangeLog:

2014-05-20  Igor Zamyatin  

* c-array-notation.c (fix_builtin_array_notation_fn): Check invalid  
function parameters.


diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
index 0ac6ba8..127f9a1 100644
--- a/gcc/c/c-array-notation.c
+++ b/gcc/c/c-array-notation.c
@@ -229,6 +229,8 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   /* Fully fold any EXCESSIVE_PRECISION EXPR that can occur in the function
  parameter.  */
   func_parm = c_fully_fold (func_parm, false, NULL);
+  if (func_parm == error_mark_node)
+    return error_mark_node;
   
   location = EXPR_LOCATION (an_builtin_fn);

[PATCH, ira] Fix for PR61241

2014-05-20 Thread 马江

Hi,all

There are some small problems that block the ira to make the best
choice, as PR61241 shows.
This patch fix PR61241. I have done "make bootstrap" and "make check"
on x86, nothing changed after the patch.
Is the patch OK for trunk?

Thanks
Ma Jiang

2014-05-20  Ma Jiang  

PR rtl-opti/61221
* ira-lives.c (process_bb_node_lives): Clear unnecessary
hard register conflicts.


irabug.patch
Description: Binary data

RE: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect Cilk_sync usage

2014-05-20 Thread Zamyatin, Igor

Please look then on the following patch.

Regtested successfully on x86_64. 
Is it ok for trunk and 4.9?

gcc/cp/ChangeLog:

2014-05-20  Igor Zamyatin  

PR c/60189
* parser.c (cp_parser_postfix_expression): Move handling of cilk_sync
from here to...
(cp_parser_statement): ...here. Make sure only semicolon can go after
Cilk_sync.

gcc/testsuite/ChangeLog:

2014-05-20  Igor Zamyatin  

PR c++/60189
* c-c++-common/cilk-plus/CK/invalid_sync.cс: New test.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0c9e113..814f323 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -5845,20 +5845,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
  }
break;
   }
-  
-case RID_CILK_SYNC:
-  if (flag_cilkplus)
-   { 
- tree sync_expr = build_cilk_sync ();
- SET_EXPR_LOCATION (sync_expr, 
-cp_lexer_peek_token (parser->lexer)->location);
- finish_expr_stmt (sync_expr);
-   }
-  else
-   error_at (token->location, "-fcilkplus must be enabled to use" 
- " %<_Cilk_sync%>");
-  cp_lexer_consume_token (parser->lexer);
-  break;
 
 case RID_BUILTIN_SHUFFLE:
   {
@@ -9414,6 +9400,24 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
  statement = cp_parser_jump_statement (parser);
  break;
 
+   case RID_CILK_SYNC:
+ cp_lexer_consume_token (parser->lexer);
+ if (flag_cilkplus)
+   {
+ tree sync_expr = build_cilk_sync ();
+ SET_EXPR_LOCATION (sync_expr,
+token->location);
+ statement = finish_expr_stmt (sync_expr);
+   }
+ else
+   {
+ error_at (token->location, "-fcilkplus must be enabled to use"
+   " %<_Cilk_sync%>");
+ statement = error_mark_node;
+   }
+ cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+ break;
+
  /* Objective-C++ exception-handling constructs.  */
case RID_AT_TRY:
case RID_AT_CATCH:
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/invalid_sync.cc 
b/gcc/testsuite/c-c++-common/cilk-plus/CK/invalid_sync.cc
new file mode 100644
index 000..cf1caf1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/invalid_sync.cc
@@ -0,0 +1,9 @@
+/* PR c/60189 */
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+
+int main (void)
+{
+_Cilk_sync return; /* { dg-error " expected ';' before 'return'" } */
+return 0;
+}

Thanks,
Igor

> -Original Message-
> From: Zamyatin, Igor
> Sent: Tuesday, May 13, 2014 12:28 AM
> To: 'Jason Merrill'; 'Jakub Jelinek'
> Cc: 'GCC Patches (gcc-patches@gcc.gnu.org)'; Iyer, Balaji V
> Subject: RE: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect Cilk_sync 
> usage
> 
> Ping. Should I prepare the patch?
> 
> Thanks,
> Igor
> 
> > > -Original Message
> > > From: Jason Merrill [mailto:ja...@redhat.com]
> > > Sent: Monday, April 14, 2014 9:49 PM
> > > To: Zamyatin, Igor; Jakub Jelinek
> > > Cc: GCC Patches (gcc-patches@gcc.gnu.org); Iyer, Balaji V
> > > Subject: Re: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect
> > > Cilk_sync usage
> > >
> > > Oh, I see where the problem is coming from.  Cilk_sync is a
> > > statement, but it's being parsed as an expression.  Let's move it to
> > cp_parser_statement.
> >
> > Something like this (better to put new code in separate routine?)?
> >
> > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index bb59e3b..3105d6c
> > 100644
> > --- a/gcc/cp/parser.c
> > +++ b/gcc/cp/parser.c
> > @@ -5835,20 +5835,6 @@ cp_parser_postfix_expression (cp_parser
> > *parser, bool address_p, bool cast_p,
> >   }
> > break;
> >}
> > -
> > -case RID_CILK_SYNC:
> > -  if (flag_cilkplus)
> > -   {
> > - tree sync_expr = build_cilk_sync ();
> > - SET_EXPR_LOCATION (sync_expr,
> > -cp_lexer_peek_token (parser->lexer)->location);
> > - finish_expr_stmt (sync_expr);
> > -   }
> > -  else
> > -   error_at (token->location, "-fcilkplus must be enabled to use"
> > - " %<_Cilk_sync%>");
> > -  cp_lexer_consume_token (parser->lexer);
> > -  break;
> >
> >  case RID_BUILTIN_SHUFFLE:
> >{
> > @@ -9404,6 +9390,24 @@ cp_parser_statement (cp_parser* parser, tree
> > in_statement_expr,
> >   statement = cp_parser_jump_statement (parser);
> >   break;
> >
> > +   case RID_CILK_SYNC:
> > + cp_lexer_consume_token (parser->lexer);
> > + if (flag_cilkplus)
> > +   {
> > + tree sync_expr = build_cilk_sync ();
> > + SET_EXPR_LOCATION (sync_expr,
> > +token->location);
> > + statement = finish_expr_stmt (sync_expr);
> > +   }
> > + else
> > +   {
> > + error_at (

Re: [C++ Patch] PR 60373

2014-05-20 Thread Jason Merrill


OK.

Jason

Re: PR 61210: recursive template substitution depending on system compiler

2014-05-20 Thread Jason Merrill


OK.

Jason

[PATCH] Tidy SCCVN dump

2014-05-20 Thread Richard Biener


This makes the dumps easier to follow.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-05-20  Richard Biener  

* tree-ssa-sccvn.c (process_scc): Dump SCC here, when
iterating,
(extract_and_process_scc_for_name): not here.
(cond_dom_walker::before_dom_children): Only process
stmts that end the BB in interesting ways.
(run_scc_vn): Mark param uses as visited.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 210635)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3833,6 +3833,9 @@ process_scc (vec scc)
}
 }
 
+  if (dump_file && (dump_flags & TDF_DETAILS))
+print_scc (dump_file, scc);
+
   /* Iterate over the SCC with the optimistic table until it stops
  changing.  */
   current_info = optimistic_info;
@@ -3858,6 +3861,8 @@ process_scc (vec scc)
changed |= visit_use (var);
 }
 
+  if (dump_file && (dump_flags & TDF_DETAILS))
+fprintf (dump_file, "Processing SCC needed %d iterations\n", iterations);
   statistics_histogram_event (cfun, "SCC iterations", iterations);
 
   /* Finally, copy the contents of the no longer used optimistic
@@ -3909,9 +3914,6 @@ extract_and_process_scc_for_name (tree n
   if (scc.length () > 1)
 sort_scc (scc);
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
-print_scc (dump_file, scc);
-
   process_scc (scc);
 
   return true;
@@ -4221,6 +4223,19 @@ cond_dom_walker::before_dom_children (ba
   if (!stmt)
 return;
 
+  enum gimple_code code = gimple_code (stmt);
+  if (code != GIMPLE_COND
+  && code != GIMPLE_SWITCH
+  && code != GIMPLE_GOTO)
+return;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "Value-numbering operands of stmt ending BB %d: ",
+  bb->index);
+  print_gimple_stmt (dump_file, stmt, 0, 0);
+}
+
   /* Value-number the last stmts SSA uses.  */
   ssa_op_iter i;
   tree op;
@@ -4236,7 +4251,7 @@ cond_dom_walker::before_dom_children (ba
  if value-numbering can prove they are not reachable.  Handling
  computed gotos is also possible.  */
   tree val;
-  switch (gimple_code (stmt))
+  switch (code)
 {
 case GIMPLE_COND:
   {
@@ -4259,8 +4274,7 @@ cond_dom_walker::before_dom_children (ba
   val = gimple_goto_dest (stmt);
   break;
 default:
-  val = NULL_TREE;
-  break;
+  gcc_unreachable ();
 }
   if (!val)
 return;
@@ -4300,7 +4314,10 @@ run_scc_vn (vn_lookup_kind default_vn_wa
 {
   tree def = ssa_default_def (cfun, param);
   if (def)
-   VN_INFO (def)->valnum = def;
+   {
+ VN_INFO (def)->visited = true;
+ VN_INFO (def)->valnum = def;
+   }
 }
 
   /* Mark all edges as possibly executable.  */

Re: [Google/4_8] LIPO COMDAT profile fixups

2014-05-20 Thread Teresa Johnson

On Mon, May 19, 2014 at 11:51 PM, Xinliang David Li  wrote:
> Why duplicating the merger functions in dyn-ipa.c? Should those in
> libgcov-merge.c be reused?

The merger functions in libgcov-merge.c use a macro to either read the
counter to merge from a buffer in memory (when IN_GCOV_TOOL) or from
the gcda file otherwise. In this case I need to merge from another
array, and it needs to do so both IN_GCOV_TOOL and otherwise. So to
reuse the merger functions I would have had to get the counter value
from a new argument, guarded by a conditional. I didn't want to change
the interface of those existing merge functions, or more importantly,
make them less efficient with the added condition since they are
invoked frequently during gcda file merging.

>
> The refactoring of gcov_exit_write_gcda should probably be done in a
> separate patch -- preferably submitted to trunk too.

The refactoring is necessary for this patch since I need to invoke the
outlined functionality in gcov_dump_module_info as well.

I could submit that to trunk, but it has to go in to the google
branches either with or preceeding this patch.

Thanks,
Teresa

>
> David
>
> On Mon, May 19, 2014 at 10:08 PM, Teresa Johnson  wrote:
>> Ping.
>> Teresa
>>
>> On Wed, May 14, 2014 at 4:39 PM, Teresa Johnson  wrote:
>>> This patch applies profile fixups to COMDATs on the dyn ipa callgraph
>>> at the end of LIPO module grouping (either in the profile gen run or
>>> in gcov-tool). This is to address issues with missing profiles in the
>>> out-of-line COMDAT copies not selected by the linker, and indirect
>>> call profiles that target a module not included in the module group.
>>> By default, both fixups are enabled, but can be controlled by a
>>> profile-gen parameter, an environment variable, and a gcov-tool
>>> option.
>>>
>>> The fixups assume that functions with the same lineno and cfg checksum
>>> are copies of the same COMDAT. This is made more likely by ensuring
>>> that in LIPO mode we include the full mangled name in the
>>> lineno_checksum.
>>>
>>> For the counter fixup, we merge all non-zero profiles with matching
>>> checksums and copy the merged profile into copies with all-zero
>>> profiles.
>>>
>>> For the indirect counter fixup, if an indirect call profile target is
>>> not in the module group, we look for a matching checksum copy in the
>>> primary module and if exactly one is found we change the target to
>>> that.
>>>
>>> If any fixups are applied, the gcda files are rewritten after module 
>>> grouping.
>>>
>>> This also required a couple of other changes to the optimizer. During
>>> cgraph node resolution, we were arbitrarily selecting a copy that had
>>> non-zero profiles. Now there are many more to choose from, so we will
>>> prefer the primary module copy if it is non-zero. Also, during cloning
>>> for inlining, we only want to update the profile on the callee node if
>>> we are inlining into the resolved node caller node. We were already
>>> doing this for AutoFDO, and need to do this here now that many node
>>> copies have the same profile.
>>>
>>> Patch attached. Tested with regression tests and internal benchmarks.
>>> Ok for google branches?
>>>
>>> Thanks,
>>> Teresa
>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[Ping] [C++ Patch] PR 60373

2014-05-20 Thread Paolo Carlini


Hi,

On 05/13/2014 02:19 AM, Paolo Carlini wrote:

Hi,

in this issue Marc noticed that the warning for ignored attribute 
visibility is truncated when the previous declaration belongs to a 
system header. Luckily the problem can be neatly fixed because it's 
just a consequence of not using warning_at + inform, as we should 
anyway. The other hunk in the patch tweaks the only other remaining 
warning_at pair I found in the file, should be a rather benign one, 
but better adjusting it too.


Tested x86_64-linux.

Ping rather early because this one should be rather straightforward...

Thanks!
Paolo.

Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-20 Thread Richard Earnshaw

On 20/05/14 14:12, Ramana Radhakrishnan wrote:
> 
>>
>> The following PR is opened for this problem:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61223
>>
>> Thumb1 failure was also detected and reported in pr60758.
>> I've proposed a thumb1 bugfix there. Regtest for the fix currently is in
>> progress.
> 
> Patches must be proposed on gcc-patches and / or in this case 

Never OR.  Patches must always go to gcc-patches.  They may *also* go to
other lists, if appropriate.

R.

> libstdc++-patches mailing list. Attaching patches to bugzilla does not 
> ensure review.
> 
> Thanks,
> Ramana
> 
>>
>> Best regards,
>> Merzlyakov Alexey
>>
>

PR 61210: recursive template substitution depending on system compiler

2014-05-20 Thread Richard Sandiford

PR 61210 is about a case where a templated wi:: function had "x | y"
and where both "x" and "y" needed temporary variables.  The recursive
instantation of x and y were done as two arguments to the same function
call, which meant that the order of instantiation depended on the host
compiler.  The uids of the temporary variables could therefore be in a
different (but still valid) order than when compiled with the stage1
compiler, which caused code differences later when the uids were used
as tie-breakers to get a stable sort.

I've gone through pt.c a couple of times trying to find all cases
of this, but I might have missed some.

Tested on x86_64-linux-gnu.  Also tested by using clang as the stage1
compiler in an i486-pc-linux-gnu bootstrap.  OK to install?

Thanks,
Richard


gcc/cp/
PR bootstrap/61210
* pt.c (tsubst_copy, tsubst_omp_for_iterator, tsubst_expr)
(tsubst_copy_and_build): Perform recursive substitutions in a
deterministic order.

Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c 2014-05-19 18:27:03.510024737 +0100
+++ gcc/cp/pt.c 2014-05-19 22:17:17.854250621 +0100
@@ -12731,9 +12731,11 @@ tsubst_copy (tree t, tree args, tsubst_f
 case IMPLICIT_CONV_EXPR:
 case CONVERT_EXPR:
 case NOP_EXPR:
-  return build1
-   (code, tsubst (TREE_TYPE (t), args, complain, in_decl),
-tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl));
+  {
+   tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   return build1 (code, type, op0);
+  }
 
 case SIZEOF_EXPR:
   if (PACK_EXPANSION_P (TREE_OPERAND (t, 0)))
@@ -12801,9 +12803,11 @@ tsubst_copy (tree t, tree args, tsubst_f
 case REALPART_EXPR:
 case IMAGPART_EXPR:
 case PAREN_EXPR:
-  return build1
-   (code, tsubst (TREE_TYPE (t), args, complain, in_decl),
-tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl));
+  {
+   tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   return build1 (code, type, op0);
+  }
 
 case COMPONENT_REF:
   {
@@ -12877,24 +12881,26 @@ tsubst_copy (tree t, tree args, tsubst_f
 case PREINCREMENT_EXPR:
 case POSTDECREMENT_EXPR:
 case POSTINCREMENT_EXPR:
-  return build_nt
-   (code, tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl),
-tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl));
+  {
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   tree op1 = tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl);
+   return build_nt (code, op0, op1);
+  }
 
 case SCOPE_REF:
-  return build_qualified_name (/*type=*/NULL_TREE,
-  tsubst_copy (TREE_OPERAND (t, 0),
-   args, complain, in_decl),
-  tsubst_copy (TREE_OPERAND (t, 1),
-   args, complain, in_decl),
-  QUALIFIED_NAME_IS_TEMPLATE (t));
+  {
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   tree op1 = tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl);
+   return build_qualified_name (/*type=*/NULL_TREE, op0, op1,
+QUALIFIED_NAME_IS_TEMPLATE (t));
+  }
 
 case ARRAY_REF:
-  return build_nt
-   (ARRAY_REF,
-tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl),
-tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl),
-NULL_TREE, NULL_TREE);
+  {
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   tree op1 = tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl);
+   return build_nt (ARRAY_REF, op0, op1, NULL_TREE, NULL_TREE);
+  }
 
 case CALL_EXPR:
   {
@@ -12912,29 +12918,29 @@ tsubst_copy (tree t, tree args, tsubst_f
 case PSEUDO_DTOR_EXPR:
 case VEC_PERM_EXPR:
   {
-   r = build_nt
- (code, tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl),
-  tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl),
-  tsubst_copy (TREE_OPERAND (t, 2), args, complain, in_decl));
+   tree op0 = tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl);
+   tree op1 = tsubst_copy (TREE_OPERAND (t, 1), args, complain, in_decl);
+   tree op2 = tsubst_copy (TREE_OPERAND (t, 2), args, complain, in_decl);
+   r = build_nt (code, op0, op1, op2);
TREE_NO_WARNING (r) = TREE_NO_WARNING (t);
return r;
   }
 
 case NEW_EXPR:
   {
-   r = build_nt
-   (code, tsubst_copy (TREE_OPERAND (t, 0), args, complain, in_decl),
-tsubst_copy (TREE_OPERAND (t, 1), args,

Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-20 Thread Ramana Radhakrishnan





The following PR is opened for this problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61223

Thumb1 failure was also detected and reported in pr60758.
I've proposed a thumb1 bugfix there. Regtest for the fix currently is in
progress.


Patches must be proposed on gcc-patches and / or in this case 
libstdc++-patches mailing list. Attaching patches to bugzilla does not 
ensure review.


Thanks,
Ramana



Best regards,
Merzlyakov Alexey

Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-20 Thread Alexey Merzlyakov



On 20.05.2014 16:25, Richard Earnshaw wrote:

On 16/05/14 14:56, Alexey Merzlyakov wrote:

On 07.05.2014 13:28, Ramana Radhakrishnan wrote:

On 05/07/14 09:19, Yury Gribov wrote:

 Original Message 
Subject: [PING] [PATCH] Fix for PR libstdc++/60758
Date: Thu, 17 Apr 2014 17:48:12 +0400
From: Alexey Merzlyakov 
To: Ramana Radhakrishnan 
CC: gcc-patches@gcc.gnu.org , Viacheslav
Garbuzov , Yury Gribov 

Hi,

This fixes infinite backtrace in __cxa_end_cleanup().
Regtest was finished with no regressions on arm-linux-gnueabi(sf).

The patch posted at:
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html

This is OK to apply if no regressions.

Thanks,
Ramana


Thanks in advance.

Best regards,
Merzlyakov Alexey



I have re-tested it again on arm-linux-gnueabi(sf) - no regressions.
The change was committed to mainline as revision 210515.

Best regards,
Merzlyakov Alexey



Unfortunately, this breaks thumb1.  Pop instructions there can only
contain low registers and the PC.

R.



The following PR is opened for this problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61223

Thumb1 failure was also detected and reported in pr60758.
I've proposed a thumb1 bugfix there. Regtest for the fix currently is in 
progress.


Best regards,
Merzlyakov Alexey

Re: [PATCH][ARM] Adjust arith_shiftsi for ARMv8-style

2014-05-20 Thread Richard Earnshaw

On 15/05/14 09:47, Kyrill Tkachov wrote:
> Hi all,
> 
> Shifted arithmetic operations can never be encoded in 16-bits in and 
> therefore 
> can not appear in Thumb2 IT blocks under ARMv8-A rules (and the -mrestrict-it 
> rules). This patch adjusts the relevant pattern for that purpose.
> 
> Tested and bootstrapped on arm-none-linux-gnueabihf and made sure no 
> performance 
> regressions on a number of benchmarks.
> 
> This is a bug (not wrong-code though) in -mrestrict-it that affects 4.9 as 
> well 
> as trunk, so is it ok to backport it there?
> 
> 
> Thanks,
> Kyrill
> 
> 2014-05-15  Kyrylo Tkachov  
> 
>  * config/arm/arm.md (arith_shiftsi): Do not predicate for
>  arm_restrict_it.
> 

OK both.

R.

[PATCH][match-and-simplify] Provide parsing error locations

2014-05-20 Thread Richard Biener


This adds $subject.

Committed to branch.

Richard.

2014-05-20  Richard Biener  

* genmatch.c (error_cb, fatal_at): New functions.
(expect, parse_expr, parse_op, parse_match_and_simplify,
main): Provide error locations.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 210638)
+++ gcc/genmatch.c  (working copy)
@@ -546,6 +546,30 @@ write_gimple (FILE *f, vec&
 
 /* libccp helpers.  */
 
+static struct line_maps *line_table;
+
+static bool
+error_cb (cpp_reader *, int, int, source_location location,
+ unsigned int, const char *msg, va_list *ap)
+{
+  const line_map *map;
+  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
+  expanded_location loc = linemap_expand_location (line_table, map, location);
+  fprintf (stderr, "%s:%d:%d error: ", loc.file, loc.line, loc.column);
+  vfprintf (stderr, msg, *ap);
+  fprintf (stderr, "\n");
+  exit (1);
+}
+
+static void
+fatal_at (const cpp_token *tk, const char *msg, ...)
+{
+  va_list ap;
+  va_start (ap, msg);
+  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  va_end (ap);
+}
+
 
 /* Read the next non-whitespace token from R.  */
 
@@ -585,8 +609,8 @@ expect (cpp_reader *r, enum cpp_ttype tk
 {
   const cpp_token *token = next (r);
   if (token->type != tk)
-fatal ("error: expected %s, got %s",
-  cpp_type2name (tk, 0), cpp_type2name (token->type, 0));
+fatal_at (token, "expected %s, got %s",
+ cpp_type2name (tk, 0), cpp_type2name (token->type, 0));
 
   return token;
 }
@@ -660,7 +684,7 @@ parse_expr (cpp_reader *r)
 op = parse_capture (r, e);
   else if (token->type == CPP_COLON
   && !(token->flags & PREV_WHITE))
-fatal ("not implemented: predicates on expressions");
+fatal_at (token, "not implemented: predicates on expressions");
   else
 op = e;
   do
@@ -672,8 +696,8 @@ parse_expr (cpp_reader *r)
{
  operator_id *opr = static_cast  (e->operation->op);
  if (e->ops.length () != opr->get_required_nargs ())
-   fatal ("got %d operands instead of the required %d",
-  e->ops.length (), opr->get_required_nargs ());
+   fatal_at (token, "got %d operands instead of the required %d",
+ e->ops.length (), opr->get_required_nargs ());
}
  return op;
}
@@ -789,10 +813,10 @@ parse_op (cpp_reader *r)
}
   else if (token->type != CPP_COLON
   && token->type != CPP_ATSIGN)
-   fatal ("expected expression or predicate");
+   fatal_at (token, "expected expression or predicate");
   /* optionally followed by a capture and a predicate.  */
   if (token->type == CPP_COLON)
-   fatal ("not implemented: predicate on leaf operand");
+   fatal_at (token, "not implemented: predicate on leaf operand");
   if (token->type == CPP_ATSIGN)
op = parse_capture (r, op);
 }
@@ -818,7 +842,10 @@ parse_match_and_simplify (cpp_reader *r)
   asprintf (&tem, "anon_%d", ++cnt);
   id = tem;
 }
+  const cpp_token *loc = peek (r);
   struct operand *match = parse_op (r);
+  if (match->type != operand::OP_EXPR)
+fatal_at (loc, "expected uncaptured expression");
   token = peek (r);
   /* Conditional if ()  */
   struct operand *ifexpr = NULL;
@@ -826,11 +853,9 @@ parse_match_and_simplify (cpp_reader *r)
 {
   const char *tem = get_ident (r);
   if (strcmp (tem, "if") != 0)
-   fatal ("expected 'if' or expression");
+   fatal_at (token, "expected 'if' or expression");
   ifexpr = parse_c_expr (r, CPP_OPEN_PAREN);
 }
-  if (match->type != operand::OP_EXPR)
-fatal ("expected uncaptured expression");
   return new simplify (id, match, ifexpr, parse_op (r));
 }
 
@@ -846,7 +871,6 @@ main(int argc, char **argv)
 {
   cpp_reader *r;
   const cpp_token *token;
-  struct line_maps *line_table;
 
   progname = "genmatch";
 
@@ -859,6 +883,8 @@ main(int argc, char **argv)
   line_table->round_alloc_size = round_alloc_size;
 
   r = cpp_create_reader (CLK_GNUC99, NULL, line_table);
+  cpp_callbacks *cb = cpp_get_callbacks (r);
+  cb->error = error_cb;
 
   if (!cpp_read_main_file (r, argv[1]))
 return 1;
@@ -895,7 +921,7 @@ main(int argc, char **argv)
   if (strcmp (id, "match_and_simplify") == 0)
simplifiers.safe_push (parse_match_and_simplify (r));
   else
-   fatal ("expected 'match_and_simplify'");
+   fatal_at (token, "expected 'match_and_simplify'");
 
   eat_token (r, CPP_CLOSE_PAREN);
 }

[build, doc, testsuite] Centralise clearing hardware capabilities with Sun ld

2014-05-20 Thread Rainer Orth

Rainer Orth  writes:

> Prompted by the recent failures of c-c++-common/gomp/pr60823-2.c on
> Solaris/x86 with Sun as
>
>   http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00943.html
>
> I've reworked the clearing of hardware capabilities via linker maps for
> Sun ld, which is currently replicated in several places all over the
> testsuite.  I've chosen to use TEST_ALWAYS_FLAGS or ALWAYS_CXXFLAGS to
> pass on the necessary linker flags.

It turned out that more is needed to deal with the new gcc.dg/gomp and
g++.dg/gomp failures:

https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01185.html

The following patch implements what I've outlined there: it introduces a
Solaris-specific new -mclear-hwcap option, checking which version of
the mapfile syntax works, if any.  This way, all the duplication in the
testsuite and libitm can go.

Do deal with the OpenMP declare simd failures, this option/mapfile is
automatically activated with -fopenmp*.

Initial testing has concluded successfully, also on an Solaris 10/x86
system where ld only understands the old mapfile syntax.  Mainline
bootstraps on i386-pc-solaris2.11 (as/ld, gas/ld, gas/gld),
i386-pc-solaris2.10 (as/ld, gas/ld), and sparc-sun-solaris2.11 (as/ld)
are still running.  x86_64-unknown-linux-gnu didn't finish due to
unrelated libgo breakage.  Also, i386-pc-solaris2.9 bootstrap running on
the 4.9 branch which is equally affected by the testsuite failures.

Ok for mainline and 4.9 branch if those pass?

Thanks.
Rainer

2014-05-13  Rainer Orth  

gcc:
* configure.ac ($gcc_cv_ld_clearcap): New test.
* configure: Regenerate.
* config.in: Regenerate.
* config/sol2.opt (mclear-hwcap): New option.
* config/sol2.h (LINK_CLEARCAP_SPEC): Define.
* config/sol2-clearcap.map: Moved here from
testsuite/gcc.target/i386/clearcap.map.
* config/sol2-clearcapv2.map: Move here from
gcc.target/i386/clearcapv2.map.
* config/t-sol2 (install): Depend on install-clearcap-map.
(install-clearcap-map): New target.
* doc/invoke.texi (Option Summary, Solaris 2 Options): Document
-mclear-hwcap.

gcc/testsuite:
* lib/clearcap.exp: New file.
* gcc.dg/vect/vect.exp: Load clearcap.exp.
Remove clearcap_ldflags handling.
Call clearcap-init, clearcap-finish.
* gcc.target/i386/i386.exp: Likewise.
* gcc.target/i386/clearcap.map: Move to ../config/sol2-clearcap.map.
* gcc.target/i386/clearcapv2.map: Move to
../config/sol2-clearcapv2.map.
* gcc.target/x86_64/abi/avx/abi-avx.exp: Likewise.
* gcc.target/x86_64/abi/avx512f/abi-avx512f.exp: Likewise.

libitm:
* acinclude.m4 (LIBITM_CHECK_LINKER_HWCAP): Check for
-mclear-hwcap instead.
* configure: Regenerate.
* clearcap.map: Remove.

# HG changeset patch
# Parent 8bc292d22fc4feb31bf4225a4f6d3ee7f39c9a68
Centralise clearing hardware capabilities with Sun ld

diff --git a/gcc/testsuite/gcc.target/i386/clearcap.map b/gcc/config/sol2-clearcap.map
rename from gcc/testsuite/gcc.target/i386/clearcap.map
rename to gcc/config/sol2-clearcap.map
--- a/gcc/testsuite/gcc.target/i386/clearcap.map
+++ b/gcc/config/sol2-clearcap.map
@@ -1,3 +1,2 @@
-# clear all hardware capabilities emitted by Sun as: the tests here
-# guard against execution at runtime
+# Clear all hardware capabilities emitted by Sun as.
 hwcap_1 = V0x0 OVERRIDE;
diff --git a/gcc/testsuite/gcc.target/i386/clearcapv2.map b/gcc/config/sol2-clearcapv2.map
rename from gcc/testsuite/gcc.target/i386/clearcapv2.map
rename to gcc/config/sol2-clearcapv2.map
--- a/gcc/testsuite/gcc.target/i386/clearcapv2.map
+++ b/gcc/config/sol2-clearcapv2.map
@@ -1,6 +1,6 @@
-# clear all hardware capabilities emitted by Sun as: the tests here
-# guard against execution at runtime
-# uses mapfile v2 syntax which is the only way to clear AT_SUN_CAP_HW2 flags
+# Clear all hardware capabilities emitted by Sun as.
+#
+# Uses mapfile v2 syntax which is the only way to clear AT_SUN_CAP_HW2 flags.
 $mapfile_version 2
 CAPABILITY {
   HW = ;
diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -268,12 +268,21 @@ along with GCC; see the file COPYING3.  
 #define LINK_LIBGCC_MAPFILE_SPEC ""
 #endif

+/* Clear hardware capabilities, either explicitly or with OpenMP:
+   #pragma openmp declare simd creates clones for SSE2, AVX, and AVX2.  */
+#ifdef HAVE_LD_CLEARCAP
+#define LINK_CLEARCAP_SPEC " %{mclear-hwcap|fopenmp*:-M %sclearcap.map}"
+#else
+#define LINK_CLEARCAP_SPEC ""
+#endif
+
 #undef  LINK_SPEC
 #define LINK_SPEC \
   "%{h*} %{v:-V} \
%{!shared:%{!static:%{rdynamic: " RDYNAMIC_SPEC "}}} \
%{static:-dn -Bstatic} \
-   %{shared:-G -dy %{!mimpure-text:-z text}} " LINK_LIBGCC_MAPFILE_SPEC " \
+   %{shared:-G -dy %{!mimpure-text:-z text}} " \
+   LINK_LIBGCC_MAPFILE_SPEC LINK_CLEARCAP_SPEC " \
%{symbolic:-Bsymbolic -G -dy

Re: [C++ Patch] PR 58664

2014-05-20 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Fix ARM NAN fraction bits

2014-05-20 Thread Richard Earnshaw

On 20/05/14 08:57, Richard Biener wrote:
> On Tue, 20 May 2014, Maciej W. Rozycki wrote:
> 
>> Ian,
>>
>> On Sat, 17 May 2014, Richard Biener wrote:
>>
>>> On May 17, 2014 12:22:23 AM CEST, "Maciej W. Rozycki" 
>>>  wrote:
 On Fri, 16 May 2014, Joseph S. Myers wrote:

>> 2014-05-16  Maciej W. Rozycki  
>>
>>  PR libgcc/60166
>>  * sysdeps/arm/soft-fp/sfp-machine.h (_FP_NANFRAC_S, _FP_NANFRAC_D)
>>  (_FP_NANSIGN_Q): Set the quiet bit.
>
> OK for glibc.

 Joseph, thanks for your review, this is now in.

 Richard, you wrote yesterday that pushing changes to 4.8 would require 
 explicit approval from release managers, however it is not clear to me
 who 
 they are for that branch.  This fix corrects a regression introduced
 after 
 4.8.2.  Can you approve it?  If not, then who can?
>>>
>>> If it's not broken in 4.8.2 but broken on the branch head then it's OK 
>>> for the branch.
>>
>>  I thought I'd double-check with you that it is fine to push this change 
>> to trunk first.  OK to apply?
> 
> Of course it should go to trunk (and 4.9 branch) first, but I'm not
> the one to approve that.
> 

I can...  Based on Joseph's review, OK.

R.

> Richard.
> 
>> 2014-05-20  Maciej W. Rozycki  
>>
>>  PR libgcc/60166
>>  libgcc/
>>  * config/arm/sfp-machine.h (_FP_NANFRAC_S, _FP_NANFRAC_D)
>>  (_FP_NANSIGN_Q): Set the quiet bit.
>>
>>   Maciej
>>
>> gcc-soft-fp-arm-nanfrac.diff
>> Index: gcc-fsf-trunk-quilt/libgcc/config/arm/sfp-machine.h
>> ===
>> --- gcc-fsf-trunk-quilt.orig/libgcc/config/arm/sfp-machine.h 2014-05-16 
>> 15:59:06.0 +0100
>> +++ gcc-fsf-trunk-quilt/libgcc/config/arm/sfp-machine.h  2014-05-20 
>> 01:23:36.618434199 +0100
>> @@ -21,10 +21,10 @@ typedef int __gcc_CMPtype __attribute__ 
>>  
>>  /* According to RTABI, QNAN is only with the most significant bit of the
>> significand set, and all other significand bits zero.  */
>> -#define _FP_NANFRAC_H   0
>> -#define _FP_NANFRAC_S   0
>> -#define _FP_NANFRAC_D   0, 0
>> -#define _FP_NANFRAC_Q   0, 0, 0, 0
>> +#define _FP_NANFRAC_H   _FP_QNANBIT_H
>> +#define _FP_NANFRAC_S   _FP_QNANBIT_S
>> +#define _FP_NANFRAC_D   _FP_QNANBIT_D, 0
>> +#define _FP_NANFRAC_Q   _FP_QNANBIT_Q, 0, 0, 0
>>  #define _FP_NANSIGN_H   0
>>  #define _FP_NANSIGN_S   0
>>  #define _FP_NANSIGN_D   0
>>
>>
>

Re: [PATCH] [PING^2] Fix for PR libstdc++/60758

2014-05-20 Thread Richard Earnshaw

On 16/05/14 14:56, Alexey Merzlyakov wrote:
> On 07.05.2014 13:28, Ramana Radhakrishnan wrote:
>> On 05/07/14 09:19, Yury Gribov wrote:
>>>  Original Message 
>>> Subject: [PING] [PATCH] Fix for PR libstdc++/60758
>>> Date: Thu, 17 Apr 2014 17:48:12 +0400
>>> From: Alexey Merzlyakov 
>>> To: Ramana Radhakrishnan 
>>> CC: gcc-patches@gcc.gnu.org , Viacheslav
>>> Garbuzov , Yury Gribov 
>>>
>>> Hi,
>>>
>>> This fixes infinite backtrace in __cxa_end_cleanup().
>>> Regtest was finished with no regressions on arm-linux-gnueabi(sf).
>>>
>>> The patch posted at:
>>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html
>>
>> This is OK to apply if no regressions.
>>
>> Thanks,
>> Ramana
>>
>>>
>>> Thanks in advance.
>>>
>>> Best regards,
>>> Merzlyakov Alexey
>>>
>>>
>>
> I have re-tested it again on arm-linux-gnueabi(sf) - no regressions.
> The change was committed to mainline as revision 210515.
> 
> Best regards,
> Merzlyakov Alexey
> 
> 

Unfortunately, this breaks thumb1.  Pop instructions there can only
contain low registers and the PC.

R.

[PATCH][match-and-simplify] Reject outermost captures

2014-05-20 Thread Richard Biener


We can't really code-generate those (and thus such patterns are
simply skipped during code-gen ...).  Reject them early.

Committed to the branch.

Richard.

2014-05-20  Richard Biener  

* genmatch.c (parse_match_and_simplify): Reject outermost
expressions that are captured or not expressions.
* match.pd: Rewrite negate contracting patterns to avoid
capturing the outermost expression.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 210637)
+++ gcc/match.pd(working copy)
@@ -26,22 +26,18 @@ along with GCC; see the file COPYING3.
 /* ???  Have match_and_simplify groups guarded with common
predicates on the outermost type?  */
 
-/* Contract negates.
-   ???  !TYPE_SATURATING condition missing.  */
+/* Contract negates.  */
 (match_and_simplify
-  (PLUS_EXPR @0 (NEGATE_EXPR @1))
-  /* Ugh, free form mixed in ... ;)  Want (if ...) instead?  */
-  if (!TYPE_SATURATING (TREE_TYPE (@0)))
-  (MINUS_EXPR @0 @1))
+  (plus @0 (negate @1))
+  if (!TYPE_SATURATING (type))
+  (minus @0 @1))
 (match_and_simplify
-  /* ???  Allow (MINUS:!TYPE_SATURATING @0 (NEGATE @1)), thus
- type predicates on operands?  Thus also :INTEGRAL_TYPE_P
- or @0:INTEGRAL_TYPE_P?  */
-  (MINUS @0 (NEGATE @1))
-  (PLUS @0 @1))
+  (minus @0 (negate @1))
+  if (!TYPE_SATURATING (type))
+  (plus @0 @1))
 (match_and_simplify
-  (plus@2 (negate @0) @1)
-  if (!TYPE_SATURATING (TREE_TYPE (@2)))
+  (plus (negate @0) @1)
+  if (!TYPE_SATURATING (type))
   (minus @1 @0))
 
 /* Change to even more free-form like
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 210637)
+++ gcc/genmatch.c  (working copy)
@@ -829,6 +829,8 @@ parse_match_and_simplify (cpp_reader *r)
fatal ("expected 'if' or expression");
   ifexpr = parse_c_expr (r, CPP_OPEN_PAREN);
 }
+  if (match->type != operand::OP_EXPR)
+fatal ("expected uncaptured expression");
   return new simplify (id, match, ifexpr, parse_op (r));
 }

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-20 Thread Kirill Yukhin

Hello,
On 19 May 09:58, H.J. Lu wrote:
> On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak  wrote:
> > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu  wrote:
> >
>  Uros,
>  I am looking into libreoffice size and the data alignment seems to make 
>  huge
>  difference. Data section has grown from 5.8MB to 6.3MB in between GCC 
>  4.8 and 4.9,
>  while clang produces 5.2MB.
> 
>  The two patches I posted to not align vtables and RTTI reduces it to 
>  5.7MB, but
>  But perhaps we want to revisit the alignment rules.  The optimization 
>  manuals
>  usually care only about performance critical loops.  Perhaps we can make 
>  the
>  rules to align only bigger datastructures, or so at least for -O2.
> >>>
> >>> Based on the above quote, "Misaligned data access can incur
> >>> significant performance penalties." and the fact that this particular
> >>> alignment rule has some compatibility issues with previous versions of
> >>> gcc (these were later fixed by Jakub), I'd rather leave this rule as
> >>> is. However, if the access is from the cold section, we can perhaps
> >>> avoid extra alignment, while avoiding those compatibility issues.
> >>>
> >>
> >> It is excessive to align
> >>
> >> struct foo
> >> {
> >>   int x1;
> >>   int x2;
> >>   char x3;
> >>   int x4;
> >>   int x5;
> >>   char x6;
> >>   int x7;
> >>   int x8;
> >> };
> >>
> >> to 32 bytes and align
> >>
> >> struct foo
> >> {
> >>   int x1;
> >>   int x2;
> >>   char x3;
> >>   int x4;
> >>   int x5;
> >>   char x6;
> >>   int x7[9];
> >>   int x8;
> >> };
> >>
> >> to 64 bytes.  What performance gain does it provide?
> >
> > Avoids "significant performance penalties," perhaps?
> >
> 
> Kirill, do we have performance data for excessive alignment
> vs ABI alignment?
Nope, we have no actual data showing performance impact on such changes,
sorry.

We may try such a change on HSW machine (on Spec 2006), will it be useful?

--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -26576,7 +26576,7 @@ ix86_data_alignment (tree type, int align, bool opt)
  used to assume.  */

   int max_align_compat
-= optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
+= optimize_size ? BITS_PER_WORD : MIN (128, MAX_OFILE_ALIGNMENT);

   /* A data structure, equal or greater than the size of a cache line
  (64 bytes in the Pentium 4 and other recent Intel processors, including


> -- 
> H.J.

--
Thanks, K

Re: add dbgcnt and opt-info support for devirtualization

2014-05-20 Thread Richard Biener

On Mon, May 19, 2014 at 5:24 PM, Xinliang David Li  wrote:
> Sorry about it. Here is the patch.  There is one remaining case where
> cgraph_dump_file and dump_enable_p are checked separately --
> cgraph_dump_file is set up differently from 'dump_file'.

But there you check with an else if, so if you do -fdump-ipa-cgraph
then suddenly -fopt-info will stop reporting?  At least in the cgraphunit.c
part of the patch.

I'm ok with the rest of the patch.

Thanks,
Richard.

> David
>
>
>
> On Mon, May 19, 2014 at 2:21 AM, Richard Biener
>  wrote:
>> On Fri, May 16, 2014 at 11:19 PM, Xinliang David Li  
>> wrote:
>>> Modified the patch according to yours and Richard's feedback. PTAL.
>>
>> ENOPATCH.
>>
>> Btw, I don't see any issue with leaking node order to opt-report.
>>
>> Richard.
>>
>>> thanks,
>>>
>>> David
>>>
>>> On Fri, May 16, 2014 at 9:03 AM, Jan Hubicka  wrote:
> Hi, debugging runtime bugs due to devirtualization can be hard for
> very large C++ programs with complicated class hierarchy. This patch
> adds the support to report this high level transformation via
> -fopt-info (not hidden inside dump file) and the ability the do binary
> search with cutoff.
>
> Ok for trunk after build and test?

 Seems resonable to me.
>
> thanks,
>
> David

> Index: ChangeLog
> ===
> --- ChangeLog (revision 210479)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,18 @@
> +2014-05-15  Xinliang David Li  
> +
> + * cgraphunit.c (walk_polymorphic_call_targets): Add
> + dbgcnt and fopt-info support.
> + 2014-05-15  Xinliang David Li  
> +
> + * cgraphunit.c (walk_polymorphic_call_targets): Add
> + dbgcnt and fopt-info support.
> + * ipa-prop.c (ipa_make_edge_direct_to_target): Ditto.
> + * ipa-devirt.c (ipa_devirt): Ditto.
> + * ipa.c (walk_polymorphic_call_targets): Ditto.
> + * gimple-fold.c (fold_gimple_assign): Ditto.
> + (gimple_fold_call): Ditto.
> + * dbgcnt.def: New counter.
> +
>  2014-05-15  Martin Jambor  
>
>   PR ipa/61085
> Index: ipa-prop.c
> ===
> --- ipa-prop.c(revision 210479)
> +++ ipa-prop.c(working copy)
> @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.
>  #include "ipa-utils.h"
>  #include "stringpool.h"
>  #include "tree-ssanames.h"
> +#include "dbgcnt.h"
>
>  /* Intermediate information about a parameter that is only useful during 
> the
> run of ipa_analyze_node and is not kept afterwards.  */
> @@ -2494,6 +2495,13 @@ ipa_make_edge_direct_to_target (struct c
>   fprintf (dump_file, "ipa-prop: Discovered direct call to 
> non-function"
>   " in %s/%i, making it unreachable.\n",
>ie->caller->name (), ie->caller->order);
> +  else if (dump_enabled_p ())
> + {
> +   location_t loc = gimple_location (ie->call_stmt);
> +   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
> +"Discovered direct call to non-function in 
> %s, "
> +"making it unreachable\n", ie->caller->name 
> ());

 Perhaps "turning it to __builtin_unreachable call" and similarly in the 
 other cases
 we introduce __builtin_unreachable? I think that could be easier for user 
 to work
 out.

 What king of problems in devirtualizatoin you are seeing?


 Honza

Re: [RFC] HOST_WIDE_INT transition steps

2014-05-20 Thread Richard Biener

On Tue, 20 May 2014, Eric Botcazou wrote:

> > Same as for going C++.
> 
> Not to the same extent, this will be worse because done en masse throughout 
> the code instead of gradually.

Like the gimple -> gimple * change pending or the various
gimple -> gswitch,glabel,etc. stuff?  It's on a similar scale at lest.

> > Make the code base easier to understand for newcomers.  It's also a
> > documentation improvement (you see what a HOST_WIDE_INT really is),
> > alongside with [u]int64_t being less to type ...
> 
> I personally find the abstraction and the separation with the other, more 
> mundane types useful, but I guess that it's a matter of habit.

Yeah.  I'm happy of converting only "obvious" cases (or converting
them separately at least).  Of course there may be not that many
decoupled pieces ...

Richard.

Re: [RFC] HOST_WIDE_INT transition steps

2014-05-20 Thread Eric Botcazou

> Same as for going C++.

Not to the same extent, this will be worse because done en masse throughout 
the code instead of gradually.

> Make the code base easier to understand for newcomers.  It's also a
> documentation improvement (you see what a HOST_WIDE_INT really is),
> alongside with [u]int64_t being less to type ...

I personally find the abstraction and the separation with the other, more 
mundane types useful, but I guess that it's a matter of habit.

-- 
Eric Botcazou

Re: [RFC] HOST_WIDE_INT transition steps

2014-05-20 Thread Richard Biener

On Tue, 20 May 2014, Eric Botcazou wrote:

> > The following is my current idea on progressing on the HOST_WIDE_INT
> > removal
> > 
> > 1) https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00381.html (ping)
> > 
> > 2) make sure [u]int64_t is available and use that to define HOST_WIDE_INT
> > 
> > 3) s/HOST_WIDE_INT/int64_t/ (same for unsigned HOST_WIDE_INT)
> 
> Does 3) really buy us something?  That would make backports painful I think.

Same as for going C++.  Make the code base easier to understand for
newcomers.  It's also a documentation improvement (you see what
a HOST_WIDE_INT really is), alongside with [u]int64_t being less
to type ...

Btw, all of the current pending rewrite patches will make backports
painful.  Oh well.

Richard.

1 2 >

1 - 100 of 123 matches

Mail list logo