update address taken: don't drop clobbers

2014-06-28 Thread Marc Glisse

Hello,

we currently drop clobbers on variables whose address is not taken 
anymore. However, rewrite_stmt has code to replace them with an SSA_NAME 
with a default definition (an uninitialized variable), and I believe 
rewrite_update_stmt should do the same. This allows us to warn sometimes 
(see testcase), but during the debugging I also noticed several places 
where it allowed CCP to simplify further PHIs, so this is also an 
optimization.


In an earlier version of the patch, I was using
get_or_create_ssa_default_def (cfun, sym);
(I was reusing the same variable). This passed bootstrap+testsuite on all 
languages except for ada. Indeed, the compiler wanted to coalesce several 
SSA_NAMEs, including those new ones, in out-of-ssa, but couldn't. There 
are abnormal PHIs involved. Maybe it shouldn't have insisted on coalescing 
an undefined ssa_name, maybe something should have prevented us from 
reaching such a situation, but creating a new variable was the simplest 
workaround.


Some things could be done to improve the error message in uninit:
- getting the location of the variable,
- differenciating uninitialized from clobbered,
but that can come later.

Bootstrap+testsuite (all,obj-c++,ada,go) on x86_64-unknown-linux-gnu.

2014-06-30  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/60770
gcc/
* tree-ssa.c (execute_update_addresses_taken): Don't drop clobbers.
* tree-into-ssa.c (maybe_register_def): Replace clobbers with a
default definition.
gcc/testsuite/
* gcc.dg/tree-ssa/pr60770-1.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c   (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options -O -Wall } */
+
+int f(int n){
+  int*p;
+  {
+int yyy=n;
+p=yyy;
+  }
+  return *p; /* { dg-warning yyy } */
+}
Index: gcc/tree-into-ssa.c
===
--- gcc/tree-into-ssa.c (revision 212109)
+++ gcc/tree-into-ssa.c (working copy)
@@ -1831,26 +1831,38 @@ maybe_register_def (def_operand_p def_p,
 {
   tree def = DEF_FROM_PTR (def_p);
   tree sym = DECL_P (def) ? def : SSA_NAME_VAR (def);
 
   /* If DEF is a naked symbol that needs renaming, create a new
  name for it.  */
   if (marked_for_renaming (sym))
 {
   if (DECL_P (def))
{
- tree tracked_var;
-
- def = make_ssa_name (def, stmt);
+ if (gimple_clobber_p (stmt)  is_gimple_reg (sym))
+   {
+ /* Replace clobber stmts with a default def.  Create a new
+variable so we don't later think we must coalesce, which would
+fail with some ada abnormal PHIs.  Still, we try to keep a
+similar name so error messages make sense.  */
+ unlink_stmt_vdef (stmt);
+ gsi_replace (gsi, gimple_build_nop (), true);
+ tree id = DECL_NAME (sym);
+ const char* name = id ? IDENTIFIER_POINTER (id) : 0;
+ tree newvar = create_tmp_var (TREE_TYPE (sym), name);
+ def = get_or_create_ssa_default_def (cfun, newvar);
+   }
+ else
+   def = make_ssa_name (def, stmt);
  SET_DEF (def_p, def);
 
- tracked_var = target_for_debug_bind (sym);
+ tree tracked_var = target_for_debug_bind (sym);
  if (tracked_var)
{
  gimple note = gimple_build_debug_bind (tracked_var, def, stmt);
  /* If stmt ends the bb, insert the debug stmt on the single
 non-EH edge from the stmt.  */
  if (gsi_one_before_end_p (gsi)  stmt_ends_bb_p (stmt))
{
  basic_block bb = gsi_bb (gsi);
  edge_iterator ei;
  edge e, ef = NULL;
Index: gcc/tree-ssa.c
===
--- gcc/tree-ssa.c  (revision 212109)
+++ gcc/tree-ssa.c  (working copy)
@@ -1607,32 +1607,20 @@ execute_update_addresses_taken (void)
rhs = gimple_assign_rhs1 (stmt);
if (gimple_assign_lhs (stmt) != lhs
 !useless_type_conversion_p (TREE_TYPE (lhs),
   TREE_TYPE (rhs)))
  rhs = fold_build1 (VIEW_CONVERT_EXPR,
 TREE_TYPE (lhs), rhs);
 
if (gimple_assign_lhs (stmt) != lhs)
  gimple_assign_set_lhs (stmt, lhs);
 
-   /* For var ={v} {CLOBBER}; where var lost
-  TREE_ADDRESSABLE just remove the stmt.  */
-   if (DECL_P (lhs)
-TREE_CLOBBER_P (rhs)
-bitmap_bit_p (suitable_for_renaming, DECL_UID (lhs)))
- {
-   unlink_stmt_vdef (stmt

Re: [i386] Replace builtins with vector extensions

2014-06-28 Thread Marc Glisse

On Sat, 28 Jun 2014, Ulrich Drepper wrote:


On Sat, Jun 28, 2014 at 6:42 AM, Marc Glisse marc.gli...@inria.fr wrote:

Ping,

nobody has an opinion on this? Or some explanation why I am mistaken to
believe that #pragma target makes it safer now?

It would enable a number of optimizations, like constant propagation, FMA
contraction, etc. It would also allow us to remove several builtins.


I see no problem with using the array-type access to the registers.

As for replacing the builtins with arithmetic operators: I appreciate
the possibility for optimization.  But is there any chance the calls
could not end up being implemented with a vector instruction?  I think
that would be bad.  The intrinsics should be a way to guarantee that
the programmer can create vector instructions.  Otherwise we might
just not support them.


There is always a risk, but then even with builtins I think there was a 
small risk that an RTL optimization would mess things up. It is indeed 
higher if we expose the operation to the optimizers earlier, but it would 
be a bug if an optimization replaced a vector operation by something 
worse. Also, I am only proposing to handle the most trivial operations 
this way, not more complicated ones (like v[0]+=s) where we would be 
likely to fail generating the right instruction. And the pragma should 
ensure that the function will always be compiled in a mode where the 
vector instruction is available.


ARM did the same and I don't think I have seen a bug reporting a 
regression about it (I haven't really looked though).


Thanks,

--
Marc Glisse


Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)

2014-06-26 Thread Marc Glisse

On Thu, 26 Jun 2014, Jakub Jelinek wrote:


So like this?  I've also changed get_compute_type so that it will DTRT
even for -mavx and V4DImode vectors, so e.g. f5/f6/f8 routines in
avx-pr57233.c improve.  Also, even for shifts by scalar, if e.g. target
doesn't have shifts by scalar at all, and only has narrower vector by vector
shifts, it should handle this case too.


All that? Cool!


@@ -1455,11 +1507,83 @@ expand_vector_operations_1 (gimple_stmt_
{
  op = optab_for_tree_code (code, type, optab_scalar);

+ compute_type = get_compute_type (code, op, type);
+ if (compute_type == type)
+   return;
  /* The rtl expander will expand vector/scalar as vector/vector
-if necessary.  Don't bother converting the stmt here.  */
- if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing
-  optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing)
+if necessary.  Pick one with wider vector type.  */
+ tree compute_vtype = get_compute_type (code, opv, type);
+ if (count_type_subparts (compute_vtype)
+  count_type_subparts (compute_type))
+   {
+ compute_type = compute_vtype;
+ op = opv;
+   }
+   }
+
+  if (code == LROTATE_EXPR || code == RROTATE_EXPR)
+   {
+ if (compute_type == NULL_TREE)
+   compute_type = get_compute_type (code, op, type);
+ if (compute_type == type)
return;
+ /* Before splitting vector rotates into scalar rotates,
+see if we can't use vector shifts and BIT_IOR_EXPR
+instead.  For vector by vector rotates we'd also
+need to check BIT_AND_EXPR and NEGATE_EXPR, punt there
+for now, fold doesn't seem to create such rotates anyway.  */
+ if (compute_type == TREE_TYPE (type)
+  !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2)))
+   {
+ optab oplv, opl, oprv, opr, opo;
+ oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector);
+ /* Right shift always has to be logical, no matter what
+signedness type has.  */
+ oprv = vlshr_optab;
+ opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default);
+ opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar);
+ oprv = lshr_optab;
+ opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);


Looks like there are some typos in there, you are assigning to oprv twice.

--
Marc Glisse


Re: [PATCH] Change default for --param allow-...-data-races to off

2014-06-25 Thread Marc Glisse

On Wed, 25 Jun 2014, Richard Biener wrote:


On Wed, Jun 25, 2014 at 10:54 AM, Jakub Jelinek ja...@redhat.com wrote:

On Wed, Jun 25, 2014 at 10:14:17AM +0200, Richard Biener wrote:

Perhaps not unsurprisingly, the patch is very similar.  Bootstrapped
and tested on x86_64-linux.  OK for trunk?


Ok - please give the C++/atomics folks a chance to comment.

This change of default behavior should also be documented in
gcc-4.10/changes.html.


Do we want to allow the store data races by default with -Ofast even in strict
conformance modes (-std=c++11, -std=c++14, -std=c11)?


I think so.  -Ofast means -Ofast (same issue with fp contraction and other
stuff -ffast-math enables that is not valid in strict conformance mode).


One thing I am missing is a -single-thread option that would allow races, 
remove atomics and thread locals, etc, without breaking conformance as 
long as no second thread is created. I hope that not too many libraries 
use threads internally in a way that would break this.


--
Marc Glisse


strlen: update datastructure when replacing malloc with calloc

2014-06-25 Thread Marc Glisse

Hello,

in my calloc patch, I forgot to update the datastructure when replacing 
malloc with calloc.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-25  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (handle_builtin_memset): Update strinfo
after replacing the statement.
gcc/testsuite/
* gcc.dg/tree-ssa/calloc-3.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+void*f(){
+char*p=__builtin_malloc(42);
+__builtin_memset(p,0,42);
+__builtin_memset(p,0,42);
+return p;
+};
+
+/* { dg-final { scan-tree-dump-not malloc optimized } } */
+/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */
+/* { dg-final { scan-tree-dump-not memset optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   (revision 211968)
+++ gcc/tree-ssa-strlen.c   (working copy)
@@ -1646,20 +1646,22 @@ handle_builtin_memset (gimple_stmt_itera
   enum built_in_function code1 = DECL_FUNCTION_CODE (callee1);
   tree size = gimple_call_arg (stmt2, 2);
   if (code1 == BUILT_IN_CALLOC)
 /* Not touching stmt1 */ ;
   else if (code1 == BUILT_IN_MALLOC
operand_equal_p (gimple_call_arg (stmt1, 0), size, 0))
 {
   gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1);
   update_gimple_call (gsi1, builtin_decl_implicit (BUILT_IN_CALLOC), 2,
  size, build_one_cst (size_type_node));
+  si1-length = build_int_cst (size_type_node, 0);
+  si1-stmt = gsi_stmt (gsi1);
 }
   else
 return true;
   tree lhs = gimple_call_lhs (stmt2);
   unlink_stmt_vdef (stmt2);
   if (lhs)
 {
   gimple assign = gimple_build_assign (lhs, ptr);
   gsi_replace (gsi, assign, false);
 }


Re: [GSoC][match-and-simplify] mark some more operators as commutative

2014-06-23 Thread Marc Glisse

On Mon, 23 Jun 2014, Richard Biener wrote:


On Mon, Jun 23, 2014 at 3:32 PM, Prathamesh Kulkarni
bilbotheelffri...@gmail.com wrote:

* match.pd: Mark operators in some bitwise and plus-minus
patterns to be commutative.


/* A - (A +- B) - -+ B */
(match_and_simplify
-  (minus @0 (plus @0 @1))
+  (minus @0 (plus:c @0 @1))
  (negate @0))

seems pointless


Why? a-(a+b) and a-(b+a) are both wanted and don't appear elsewhere in the 
file, no? Should simplify to (negate @1) though.


--
Marc Glisse


Re: calloc = malloc + memset

2014-06-23 Thread Marc Glisse

On Mon, 23 Jun 2014, Jakub Jelinek wrote:


Ok for trunk, sorry for the delay.


Thanks. Richard has moved the passes a bit since then, but I still have 
exactly one spot where the testsuite is ok :-) I need strlen to be after 
dom (for calloc.C) and before vrp (for several strlenopt-*.c). I'll commit 
it tomorrow if there aren't any comments on the pass placement.


2014-06-24  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (get_string_length): Ignore malloc.
(handle_builtin_malloc, handle_builtin_memset): New functions.
(strlen_optimize_stmt): Call them.
* passes.def: Move strlen after loop+dom but before vrp.
gcc/testsuite/
* g++.dg/tree-ssa/calloc.C: New testcase.
* gcc.dg/tree-ssa/calloc-1.c: Likewise.
* gcc.dg/tree-ssa/calloc-2.c: Likewise.
* gcc.dg/strlenopt-9.c: Adapt.

--
Marc GlisseIndex: gcc/passes.def
===
--- gcc/passes.def  (revision 211886)
+++ gcc/passes.def  (working copy)
@@ -179,21 +179,20 @@ along with GCC; see the file COPYING3.
 DOM and erroneous path isolation should be due to degenerate PHI nodes.
 So rather than run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
-  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_ccp);
   /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_cse_sincos);
   NEXT_PASS (pass_optimize_bswap);
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
   NEXT_PASS (pass_asan);
@@ -232,20 +231,21 @@ along with GCC; see the file COPYING3.
  NEXT_PASS (pass_loop_prefetch);
  NEXT_PASS (pass_iv_optimize);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_tree_loop_done);
   POP_INSERT_PASSES ()
   NEXT_PASS (pass_lower_vector_ssa);
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_dominator);
+  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_vrp);
   /* The only const/copy propagation opportunities left after
 DOM and VRP should be due to degenerate PHI nodes.  So rather than
 run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_cd_dce);
   NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C
===
--- gcc/testsuite/g++.dg/tree-ssa/calloc.C  (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/calloc.C  (working copy)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -fdump-tree-optimized } */
+
+typedef __SIZE_TYPE__ size_t;
+inline void* operator new(size_t, void* p) throw() { return p; }
+
+typedef void (*handler_t)(void);
+extern handler_t get_handle();
+
+inline void* operator new(size_t sz)
+{
+  void *p;
+
+  if (sz == 0)
+sz = 1;
+
+  while ((p = __builtin_malloc (sz)) == 0)
+{
+  handler_t handler = get_handle ();
+  if (! handler)
+throw 42;
+  handler ();
+}
+  return p;
+}
+
+struct vect {
+  int *start, *end;
+  vect(size_t n) {
+start = end = 0;
+if (n  (size_t)-1 / sizeof(int))
+  throw 33;
+if (n != 0)
+  start = static_castint* (operator new (n * sizeof(int)));
+end = start + n;
+int *p = start;
+for (size_t l = n; l  0; --l, ++p)
+  *p = 0;
+  }
+};
+
+void f (void *p, int n)
+{
+  new (p) vect(n);
+}
+
+/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */
+/* { dg-final { scan-tree-dump-not malloc optimized } } */
+/* { dg-final { scan-tree-dump-not memset optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/strlenopt-9.c
===
--- gcc/testsuite/gcc.dg/strlenopt-9.c  (revision 211886)
+++ gcc/testsuite/gcc.dg/strlenopt-9.c  (working copy)
@@ -11,21 +11,21 @@ fn1 (int r)
  optimized away.  */
   return strchr (p, '\0');
 }
 
 __attribute__((noinline, noclone)) size_t
 fn2 (int r)
 {
   char *p, q[10];
   strcpy (q, abc);
   p = r ? a : q;
-  /* String length for p varies, therefore strlen below isn't
+  /* String length is constant for both alternatives, and strlen is
  optimized away.  */
   return strlen (p);
 }
 
 __attribute__((noinline, noclone)) size_t
 fn3 (char *p, int n)
 {
   int

Re: calloc = malloc + memset

2014-06-23 Thread Marc Glisse

On Mon, 23 Jun 2014, Richard Biener wrote:


On June 23, 2014 5:51:30 PM CEST, Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 23 Jun 2014, Jakub Jelinek wrote:


Ok for trunk, sorry for the delay.


Thanks. Richard has moved the passes a bit since then, but I still have

exactly one spot where the testsuite is ok :-) I need strlen to be
after
dom (for calloc.C) and before vrp (for several strlenopt-*.c). I'll
commit
it tomorrow if there aren't any comments on the pass placement.


But vrp does not run at -O1 - does strlenopt?


{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },

So that's just a missed optimization at -Os, I guess.

--
Marc Glisse


Re: calloc = malloc + memset

2014-06-23 Thread Marc Glisse

On Mon, 23 Jun 2014, Andi Kleen wrote:


FWIW i believe the transformation will break a large variety of micro 
benchmarks.

calloc internally knows that memory fresh from the OS is zeroed.
But the memory may not be faulted in yet.

memset always faults in the memory.

So if you have some test like

  buf = malloc(...)
  memset(buf, ...)
  start = get_time();
  ... do something with buf
  end = get_time()

Now the times will be completely off because the measured times includes
the page faults.


Good point. I guess working around compiler optimizations is part of the 
game for micro benchmarks, and their authors would be disappointed if the 
compiler didn't mess it up regularly in new and entertaining ways ;-)


--
Marc Glisse


Re: calloc = malloc + memset

2014-06-23 Thread Marc Glisse

On Mon, 23 Jun 2014, Andi Kleen wrote:


I would prefer to not do it.


For the sake of micro benchmarks?


I'm not sure it has a lot of benefit.


It has a non-zero benefit.


If you want to keep it please make sure there is an easy way to turn
it off.


Any of these flags works:
-fdisable-tree-strlen
-fno-builtin-malloc
-fno-builtin-memset (assuming you wrote 'memset' explicitly in your code)
-fno-builtin
-ffreestanding
-O1
-Os

In the code, you can hide that the pointer passed to memset is the one 
returned by malloc by storing it in a volatile variable, or any other 
trick to hide from the compiler that we are doing memset(malloc(n),0,n).


--
Marc Glisse


[doc] Remove duplicate -Wmaybe-uninitialized

2014-06-22 Thread Marc Glisse

Hello,

a trivial patch to remove a duplicated option, you can see the second one 
4 lines below in the patch. (the mixed use of single or double spaces in 
this list is strange)


This was included in the bootstrap of another patch.

2014-06-23  Marc Glisse  marc.gli...@inria.fr

* doc/invoke.texi (Warning Options): Remove duplicated
-Wmaybe-uninitialized.

--
Marc GlisseIndex: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 211874)
+++ gcc/doc/invoke.texi (working copy)
@@ -246,21 +246,21 @@ Objective-C and Objective-C++ Dialects}.
 -Wno-deprecated -Wno-deprecated-declarations -Wdisabled-optimization  @gol
 -Wno-discarded-qualifiers @gol
 -Wno-div-by-zero -Wdouble-promotion -Wempty-body  -Wenum-compare @gol
 -Wno-endif-labels -Werror  -Werror=* @gol
 -Wfatal-errors  -Wfloat-equal  -Wformat  -Wformat=2 @gol
 -Wno-format-contains-nul -Wno-format-extra-args -Wformat-nonliteral @gol
 -Wformat-security  -Wformat-signedness  -Wformat-y2k @gol
 -Wframe-larger-than=@var{len} -Wno-free-nonheap-object -Wjump-misses-init @gol
 -Wignored-qualifiers @gol
 -Wimplicit  -Wimplicit-function-declaration  -Wimplicit-int @gol
--Winit-self  -Winline -Wmaybe-uninitialized @gol
+-Winit-self  -Winline @gol
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len}  -Wunsafe-loop-optimizations @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces  -Wmissing-field-initializers 
@gol
 -Wmissing-include-dirs @gol
 -Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
 -Wredundant-decls  -Wno-return-local-addr @gol


Warn when returning the address of a temporary (middle-end) v2

2014-06-22 Thread Marc Glisse

Hello,

I followed the advice in this discussion:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00269.html

and here is a new patch. I made an effort to isolate a path in at least 
one subcase so it doesn't look too strange that the warning is in this 
file. Computing the dominance info just to tweak the warning message may 
be a bit excessive. I kept the same option as the front-ends, I don't know 
if we want a different one, or maybe a Wmaybe-... version. There will be 
cases where we get a duplicate warning from -Wtarget-lifetime in fortran, 
but none in the testsuite, and I would rather have 2 warnings than miss 
such broken code. The uninit-G testcase is about initialization, not 
returning, so I am changing that, even if it is unnecessary with the 
current version of the patch (only activated at -O2).


Bootstrap+testsuite (--enable-languages=all,obj-c++,ada,go) on 
x86_64-unknown-linux-gnu.


(by the way, contrib/compare_tests is confused when I use all languages, 
it prints comm: file 1 is not in sorted order and tons of spurious 
differences)


2014-06-23  Marc Glisse  marc.gli...@inria.fr

PR c++/60517
gcc/c/
* c-typeck.c (c_finish_return): Return 0 instead of the address of
a local variable.
gcc/cp/
* typeck.c (maybe_warn_about_returning_address_of_local): Return
whether it is returning the address of a local variable.
(check_return_expr): Return 0 instead of the address of a local
variable.
gcc/c-family/
* c.opt (-Wreturn-local-addr): Move to common.opt.
gcc/
* common.opt (-Wreturn-local-addr): Moved from c.opt.
* gimple-ssa-isolate-paths.c: Include diagnostic-core.h.
(isolate_path): New argument to avoid inserting a trap.
(find_implicit_erroneous_behaviour): Handle returning the address
of a local variable.
(find_explicit_erroneous_behaviour): Likewise.
(gimple_ssa_isolate_erroneous_paths): Calculate dominance info.
gcc/testsuite/
* c-c++-common/addrtmp.c: New file.
* c-c++-common/uninit-G.c: Adapt.

--
Marc GlisseIndex: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c(revision 211874)
+++ gcc/c/c-typeck.c(working copy)
@@ -9315,22 +9315,26 @@ c_finish_return (location_t loc, tree re
 
  if (DECL_P (inner)
   !DECL_EXTERNAL (inner)
   !TREE_STATIC (inner)
   DECL_CONTEXT (inner) == current_function_decl)
{
  if (TREE_CODE (inner) == LABEL_DECL)
warning_at (loc, OPT_Wreturn_local_addr,
function returns address of label);
  else
-   warning_at (loc, OPT_Wreturn_local_addr,
-   function returns address of local variable);
+   {
+ warning_at (loc, OPT_Wreturn_local_addr,
+ function returns address of local variable);
+ tree zero = build_zero_cst (TREE_TYPE (res));
+ t = build_compound_expr (loc, t, zero);
+   }
}
  break;
 
default:
  break;
}
 
  break;
}
 
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 211874)
+++ gcc/c-family/c.opt  (working copy)
@@ -682,24 +682,20 @@ ObjC ObjC++ Var(warn_protocol) Init(1) W
 Warn if inherited methods are unimplemented
 
 Wredundant-decls
 C ObjC C++ ObjC++ Var(warn_redundant_decls) Warning
 Warn about multiple declarations of the same object
 
 Wreorder
 C++ ObjC++ Var(warn_reorder) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn when the compiler reorders code
 
-Wreturn-local-addr
-C ObjC C++ ObjC++ Var(warn_return_local_addr) Init(1) Warning
-Warn about returning a pointer/reference to a local or temporary variable.
-
 Wreturn-type
 C ObjC C++ ObjC++ Var(warn_return_type) Warning LangEnabledBy(C ObjC C++ 
ObjC++,Wall)
 Warn whenever a function's return type defaults to \int\ (C), or about 
inconsistent return types (C++)
 
 Wselector
 ObjC ObjC++ Var(warn_selector) Warning
 Warn if a selector has multiple methods
 
 Wshadow-ivar
 ObjC ObjC++ Var(warn_shadow_ivar) EnabledBy(Wshadow) Init(1) Warning
Index: gcc/common.opt
===
--- gcc/common.opt  (revision 211874)
+++ gcc/common.opt  (working copy)
@@ -596,20 +596,24 @@ Common Var(warn_packed) Warning
 Warn when the packed attribute has no effect on struct layout
 
 Wpadded
 Common Var(warn_padded) Warning
 Warn when padding is required to align structure members
 
 Wpedantic
 Common Var(pedantic) Warning
 Issue warnings needed for strict compliance to the standard
 
+Wreturn-local-addr
+Common Var(warn_return_local_addr) Init(1) Warning
+Warn about returning

Re: [i386] logical shift right in shrd

2014-06-21 Thread Marc Glisse

On Sat, 21 Jun 2014, Uros Bizjak wrote:


On Fri, Jun 20, 2014 at 10:42 PM, Marc Glisse marc.gli...@inria.fr wrote:


as reported in PR 61503, there seems to be a typo in the shrd pattern. I
think it is quite unlikely to cause any problem, because the pattern is 1
instruction too long for combine to recognize it (by the way, if someone has
suggestions for PR 55583...). But it is still better to fix it.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt
with lshiftrt.


OK for mainline and 4.9.


Thanks.

Er, I am sorry, I don't know what happened, but when testing the backport 
to 4.9 I got an obvious failure in the testsuite, which I am sure should 
also happen on trunk, but somehow I didn't see it (I am almost sure I 
tested the right branch though). Anyway, here is an updated patch, that 
did pass bootstrap+testsuite both on trunk and 4.9. I haven't committed 
anything yet, is the new patch ok?


2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd,
ix86_rotrdwi3_doubleword): Replace ashiftrt with lshiftrt.

--
Marc GlisseIndex: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md (revision 211865)
+++ gcc/config/i386/i386.md (working copy)
@@ -9601,37 +9601,37 @@
 (match_operand:DWI 1 register_operand)
 (match_operand:QI 2 nonmemory_operand)))
  (clobber (reg:CC FLAGS_REG))])
(match_dup 3)]
   TARGET_CMOVE
   [(const_int 0)]
   ix86_split_shift_insn (operands, operands[3], DWImode); DONE;)
 
 (define_insn x86_64_shrd
   [(set (match_operand:DI 0 nonimmediate_operand +r*m)
-(ior:DI (ashiftrt:DI (match_dup 0)
+(ior:DI (lshiftrt:DI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Jc))
(ashift:DI (match_operand:DI 1 register_operand r)
  (minus:QI (const_int 64) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   TARGET_64BIT
   shrd{q}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode DI)
(set_attr athlon_decode vector)
(set_attr amdfam10_decode vector)
(set_attr bdver1_decode vector)])
 
 (define_insn x86_shrd
   [(set (match_operand:SI 0 nonimmediate_operand +r*m)
-(ior:SI (ashiftrt:SI (match_dup 0)
+(ior:SI (lshiftrt:SI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Ic))
(ashift:SI (match_operand:SI 1 register_operand r)
  (minus:QI (const_int 32) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   
   shrd{l}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode SI)
(set_attr pent_pair np)
@@ -10069,27 +10069,27 @@
(rotatert:DWI (match_operand:DWI 1 register_operand 0)
   (match_operand:QI 2 shift_immediate_operand S)))
   (clobber (reg:CC FLAGS_REG))
   (clobber (match_scratch:DWIH 3 =r))]
  
  #
  reload_completed
  [(set (match_dup 3) (match_dup 4))
   (parallel
[(set (match_dup 4)
-(ior:DWIH (ashiftrt:DWIH (match_dup 4) (match_dup 2))
+(ior:DWIH (lshiftrt:DWIH (match_dup 4) (match_dup 2))
   (ashift:DWIH (match_dup 5)
(minus:QI (match_dup 6) (match_dup 2)
 (clobber (reg:CC FLAGS_REG))])
   (parallel
[(set (match_dup 5)
-(ior:DWIH (ashiftrt:DWIH (match_dup 5) (match_dup 2))
+(ior:DWIH (lshiftrt:DWIH (match_dup 5) (match_dup 2))
   (ashift:DWIH (match_dup 3)
(minus:QI (match_dup 6) (match_dup 2)
 (clobber (reg:CC FLAGS_REG))])]
 {
   operands[6] = GEN_INT (GET_MODE_BITSIZE (MODEmode));
 
   split_double_mode (DWImode, operands[0], 1, operands[4], operands[5]);
 })
 
 (define_insn *bmi2_rorxmode3_1


Re: [patch] change specific int128 - generic intN

2014-06-21 Thread Marc Glisse

(Adding libstdc++@ in Cc: so they see the patch at
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01655.html )

On Sat, 21 Jun 2014, DJ Delorie wrote:


New version of https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00723.html

This is a big patch, but it includes all the features/changes/support
requested since the initial patch.  Tested with identical results
before/after on x86-64 EXCEPT for g++.dg/ext/int128-1.C, which assumes
__int128 is not supported with -std= but as per previous discussion,
it now is (because __intN might be used for size_t, which is
required).  Tested on msp430 with significant improvements but a few
regressions (expected).

This patch replaces the current int128 support with a generic intN
support, which happens to provide int128 as well as up to three
additional intN types per target.  I will post a separate patch for
the msp430 backend demonstrating the use of this new feature for
__int20, as well as a third to fix some problems with PSImode in
general.

The general idea is that genmodes has a new macro INT_N() for
tm-modes.def, which gives the mode and precision to use for
target-specific intN types.  These types are always created, but the
parser will error with unsupported if the target does not support
that mode for that compile (i.e. because of command line switches,
etc).  There's an INT_N(TI,128) in the common sources.

If the target defines any types larger than long long, those types
(like int128 before) will be used for type promotion, but types
smaller than long long will not, to avoid confusing the C type
promotion rules.  Otherwise, it's up to the source being compiled to
specific __intN types as desired.


Nice. A couple quick comments on the parts I understand while
maintainers are resting for the week-end.


Index: libstdc++-v3/src/c++11/limits.cc
===

[...]

+#if !defined(__STRICT_ANSI__)


Since the test on __STRICT_ANSI__ is removed for all other uses, it would 
seem consistent to me to remove this one as well. Besides, you are already 
testing __GLIBCXX_USE_INT_N_0, which as far as I understand is protected 
by !flag_iso (with the exception of size_t).



-#if !defined(__STRICT_ANSI__)  defined(_GLIBCXX_USE_INT128)
+  // Conditionalizing on __STRICT_ANSI__ here will break any port that
+  // uses one of these types for size_t.
+#if defined(__GLIBCXX_USE_INT_N_0)
  template
-struct __is_integral_helper__int128
+struct __is_integral_helper__GLIBCXX_TYPE_INT_N_0


Since the check for __STRICT_ANSI__ is removed, do we need to add
__extension__ in front of __GLIBCXX_TYPE_INT_N_0 to avoid warning with
-Wsystem-headers?


--- gcc/cp/rtti.c   (revision 211858)
+++ gcc/cp/rtti.c   (working copy)
@@ -1506,31 +1506,44 @@ emit_support_tinfo_1 (tree bltn)

void
emit_support_tinfos (void)
{
  /* Dummy static variable so we can put nullptr in the array; it will be
 set before we actually start to walk the array.  */
-  static tree *const fundamentals[] =
+  static tree * fundamentals[] =
  {
void_type_node,
boolean_type_node,
wchar_type_node, char16_type_node, char32_type_node,
char_type_node, signed_char_type_node, unsigned_char_type_node,
short_integer_type_node, short_unsigned_type_node,
integer_type_node, unsigned_type_node,
long_integer_type_node, long_unsigned_type_node,
long_long_integer_type_node, long_long_unsigned_type_node,
-int128_integer_type_node, int128_unsigned_type_node,
float_type_node, double_type_node, long_double_type_node,
dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
+#define FUND_INT_N_IDX 22
+// These eight are for intN_t nodes
+nullptr_type_node, nullptr_type_node, nullptr_type_node, 
nullptr_type_node,
+nullptr_type_node, nullptr_type_node, nullptr_type_node, 
nullptr_type_node,
nullptr_type_node,
0
  };
-  int ix;
+  int ix, i;
  tree bltn_type, dtor;

+  ix = FUND_INT_N_IDX;
+  for (i = 0; i  NUM_INT_N_ENTS; i ++)
+if (int_n_enabled_p[i])
+  {
+   fundamentals [ix++] = int_n_trees[i].signed_type;
+   fundamentals [ix++] = int_n_trees[i].unsigned_type;
+  }
+  fundamentals [ix++] = nullptr_type_node;
+  fundamentals [ix++] = 0;
+
  push_abi_namespace ();
  bltn_type = xref_tag (class_type,
get_identifier (__fundamental_type_info),
/*tag_scope=*/ts_current, false);
  pop_abi_namespace ();
  if (!COMPLETE_TYPE_P (bltn_type))


That seems complicated. You just need to call emit_support_tinfo_1 on
each of the types (see how fundamentals is used at the end of the
function), no need to put everything in the array.

--
Marc Glisse


[RTL] (vec_select (vec_concat a b) c) may be just a or b

2014-06-21 Thread Marc Glisse

Hello,

this is another small simplification of RTL for vectors. Note that it 
doesn't really solve the problem, because these simplifications are only 
performed for single-use objects. If I start from vectors [a,b] and [c,d] 
and concatenate them into [a,b,c,d], then extract both halves, as in the 
original testcase in the PR, we won't notice that those are the original 
vectors. Still, better than nothing...


(we output a vzeroupper for the testcase, that seems unnecessary)

Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-22  Marc Glisse  marc.gli...@inria.fr

PR target/44551
gcc/
* simplify-rtx.c (simplify_binary_operation_1) VEC_SELECT:
Optimize inverse of a VEC_CONCAT.
gcc/testsuite/
* gcc.target/i386/pr44551-1.c: New file.

--
Marc GlisseIndex: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 211867)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3359,20 +3359,64 @@ simplify_binary_operation_1 (enum rtx_co
  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
  rtx subop0, subop1;
 
  gcc_assert (i0  2  i1  2);
  subop0 = XEXP (trueop0, i0);
  subop1 = XEXP (trueop0, i1);
 
  return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1);
}
+
+ /* If we select one half of a vec_concat, return that.  */
+ if (GET_CODE (trueop0) == VEC_CONCAT
+  CONST_INT_P (XVECEXP (trueop1, 0, 0)))
+   {
+ rtx subop0 = XEXP (trueop0, 0);
+ rtx subop1 = XEXP (trueop0, 1);
+ enum machine_mode mode0 = GET_MODE (subop0);
+ enum machine_mode mode1 = GET_MODE (subop1);
+ int li = GET_MODE_SIZE (GET_MODE_INNER (mode0));
+ int l0 = GET_MODE_SIZE (mode0) / li;
+ int l1 = GET_MODE_SIZE (mode1) / li;
+ int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
+ if (i0 == 0  !side_effects_p (op1)  mode == mode0)
+   {
+ bool success = true;
+ for (int i = 1; i  l0; ++i)
+   {
+ rtx j = XVECEXP (trueop1, 0, i);
+ if (!CONST_INT_P (j) || INTVAL (j) != i)
+   {
+ success = false;
+ break;
+   }
+   }
+ if (success)
+   return subop0;
+   }
+ if (i0 == l0  !side_effects_p (op0)  mode == mode1)
+   {
+ bool success = true;
+ for (int i = 1; i  l1; ++i)
+   {
+ rtx j = XVECEXP (trueop1, 0, i);
+ if (!CONST_INT_P (j) || INTVAL (j) != i0 + i)
+   {
+ success = false;
+ break;
+   }
+   }
+ if (success)
+   return subop1;
+   }
+   }
}
 
   if (XVECLEN (trueop1, 0) == 1
   CONST_INT_P (XVECEXP (trueop1, 0, 0))
   GET_CODE (trueop0) == VEC_CONCAT)
{
  rtx vec = trueop0;
  int offset = INTVAL (XVECEXP (trueop1, 0, 0)) * GET_MODE_SIZE (mode);
 
  /* Try to find the element in the VEC_CONCAT.  */
Index: gcc/testsuite/gcc.target/i386/pr44551-1.c
===
--- gcc/testsuite/gcc.target/i386/pr44551-1.c   (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr44551-1.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -mavx } */
+
+#include immintrin.h
+
+__m128i
+foo (__m256i x, __m128i y)
+{
+  __m256i r = _mm256_insertf128_si256(x, y, 1);
+  __m128i a = _mm256_extractf128_si256(r, 1);
+  return a;
+}
+
+/* { dg-final { scan-assembler-not vinsertf } } */
+/* { dg-final { scan-assembler-not vextractf } } */


[i386] logical shift right in shrd

2014-06-20 Thread Marc Glisse

Hello,

as reported in PR 61503, there seems to be a typo in the shrd pattern. I 
think it is quite unlikely to cause any problem, because the pattern is 1 
instruction too long for combine to recognize it (by the way, if someone 
has suggestions for PR 55583...). But it is still better to fix it.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt
with lshiftrt.

--
Marc GlisseIndex: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md (revision 211856)
+++ gcc/config/i386/i386.md (working copy)
@@ -9601,37 +9601,37 @@
 (match_operand:DWI 1 register_operand)
 (match_operand:QI 2 nonmemory_operand)))
  (clobber (reg:CC FLAGS_REG))])
(match_dup 3)]
   TARGET_CMOVE
   [(const_int 0)]
   ix86_split_shift_insn (operands, operands[3], DWImode); DONE;)
 
 (define_insn x86_64_shrd
   [(set (match_operand:DI 0 nonimmediate_operand +r*m)
-(ior:DI (ashiftrt:DI (match_dup 0)
+(ior:DI (lshiftrt:DI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Jc))
(ashift:DI (match_operand:DI 1 register_operand r)
  (minus:QI (const_int 64) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   TARGET_64BIT
   shrd{q}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode DI)
(set_attr athlon_decode vector)
(set_attr amdfam10_decode vector)
(set_attr bdver1_decode vector)])
 
 (define_insn x86_shrd
   [(set (match_operand:SI 0 nonimmediate_operand +r*m)
-(ior:SI (ashiftrt:SI (match_dup 0)
+(ior:SI (lshiftrt:SI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Ic))
(ashift:SI (match_operand:SI 1 register_operand r)
  (minus:QI (const_int 32) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   
   shrd{l}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode SI)
(set_attr pent_pair np)


Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)

2014-06-19 Thread Marc Glisse

On Thu, 19 Jun 2014, Jason Merrill wrote:

We were treating a const  member function like a normal const reference, and 
binding an rvalue object argument to it.  But it doesn't work that way.


That looks weird to me. The const version is a better match than the 
const, so we should pick that one in overload resolution, but if we 
remove the const version, the other one seems valid to me, we shouldn't 
reject this program:


struct Type
{
  void get() const { }
};

int main()
{
  Type{}.get();
}

At least both clang and intel accept it, and I hope the standard didn't 
make it behave differently from regular function calls.


--
Marc Glisse


Re: [RTL, i386] Use subreg instead of UNSPEC_CAST

2014-06-10 Thread Marc Glisse

On Tue, 19 Mar 2013, Richard Henderson wrote:


I'm not fond of this, primarily because I believe the pattern should
not exist at all.


One year later, new try. Tweaking the pattern, I ended up with a copy of 
the mov pattern (the subreg is generated automatically when the modes 
don't match), so I just removed it. I know the comment in emit-rtl.c says 
splitters are a better way forward than subregs, but I haven't managed 
with splitters while the subreg patch is very simple :-) I added a -O0 
testcase because when I was experimenting I had many versions that worked 
for -O2 but ICEd at -O0 (and vice versa), but it might be redundant with 
some other tests.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-10  Marc Glisse  marc.gli...@inria.fr

PR target/50829
gcc/
* config/i386/sse.md (enum unspec): Remove UNSPEC_CAST.
(avx_castmodeavxsizesuffix_castmode): Remove.
* config/i386/i386.c (builtin_description) [__builtin_ia32_si256_si,
__builtin_ia32_ps256_ps, __builtin_ia32_pd256_pd]: Replace the
removed insn with mov.
* emit-rtl.c (validate_subreg): Allow vector-vector subregs.

gcc/testsuite/
* gcc.target/i386/pr50829-1.c: New file.
* gcc.target/i386/pr50829-2.c: New file.

--
Marc GlisseIndex: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 211397)
+++ gcc/config/i386/i386.c  (working copy)
@@ -29793,23 +29793,23 @@ static const struct builtin_description
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_roundps_sfix256, 
__builtin_ia32_ceilps_sfix256, IX86_BUILTIN_CEILPS_SFIX256, (enum rtx_code) 
ROUND_CEIL, (int) V8SI_FTYPE_V8SF_ROUND },
 
   { OPTION_MASK_ISA_AVX, CODE_FOR_roundv8sf2, __builtin_ia32_roundps_az256, 
IX86_BUILTIN_ROUNDPS_AZ256, UNKNOWN, (int) V8SF_FTYPE_V8SF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_roundv8sf2_sfix, 
__builtin_ia32_roundps_az_sfix256, IX86_BUILTIN_ROUNDPS_AZ_SFIX256, UNKNOWN, 
(int) V8SI_FTYPE_V8SF },
 
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpckhpd256,  
__builtin_ia32_unpckhpd256, IX86_BUILTIN_UNPCKHPD256, UNKNOWN, (int) 
V4DF_FTYPE_V4DF_V4DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpcklpd256,  
__builtin_ia32_unpcklpd256, IX86_BUILTIN_UNPCKLPD256, UNKNOWN, (int) 
V4DF_FTYPE_V4DF_V4DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpckhps256,  
__builtin_ia32_unpckhps256, IX86_BUILTIN_UNPCKHPS256, UNKNOWN, (int) 
V8SF_FTYPE_V8SF_V8SF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpcklps256,  
__builtin_ia32_unpcklps256, IX86_BUILTIN_UNPCKLPS256, UNKNOWN, (int) 
V8SF_FTYPE_V8SF_V8SF },
 
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_si256_si, __builtin_ia32_si256_si, 
IX86_BUILTIN_SI256_SI, UNKNOWN, (int) V8SI_FTYPE_V4SI },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_ps256_ps, __builtin_ia32_ps256_ps, 
IX86_BUILTIN_PS256_PS, UNKNOWN, (int) V8SF_FTYPE_V4SF },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_pd256_pd, __builtin_ia32_pd256_pd, 
IX86_BUILTIN_PD256_PD, UNKNOWN, (int) V4DF_FTYPE_V2DF },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_movv8si, __builtin_ia32_si256_si, 
IX86_BUILTIN_SI256_SI, UNKNOWN, (int) V8SI_FTYPE_V4SI },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_movv8sf, __builtin_ia32_ps256_ps, 
IX86_BUILTIN_PS256_PS, UNKNOWN, (int) V8SF_FTYPE_V4SF },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_movv4df, __builtin_ia32_pd256_pd, 
IX86_BUILTIN_PD256_PD, UNKNOWN, (int) V4DF_FTYPE_V2DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8si, 
__builtin_ia32_si_si256, IX86_BUILTIN_SI_SI256, UNKNOWN, (int) 
V4SI_FTYPE_V8SI },
   { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8sf, 
__builtin_ia32_ps_ps256, IX86_BUILTIN_PS_PS256, UNKNOWN, (int) 
V4SF_FTYPE_V8SF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v4df, 
__builtin_ia32_pd_pd256, IX86_BUILTIN_PD_PD256, UNKNOWN, (int) 
V2DF_FTYPE_V4DF },
 
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestzpd, 
IX86_BUILTIN_VTESTZPD, EQ, (int) INT_FTYPE_V2DF_V2DF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestcpd, 
IX86_BUILTIN_VTESTCPD, LTU, (int) INT_FTYPE_V2DF_V2DF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestnzcpd, 
IX86_BUILTIN_VTESTNZCPD, GTU, (int) INT_FTYPE_V2DF_V2DF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestzps, 
IX86_BUILTIN_VTESTZPS, EQ, (int) INT_FTYPE_V4SF_V4SF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestcps, 
IX86_BUILTIN_VTESTCPS, LTU, (int) INT_FTYPE_V4SF_V4SF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestnzcps, 
IX86_BUILTIN_VTESTNZCPS, GTU, (int) INT_FTYPE_V4SF_V4SF_PTEST },
Index: gcc/config/i386/sse.md
===
--- gcc/config/i386/sse.md  (revision 211397)
+++ gcc/config/i386/sse.md  (working copy)
@@ -66,21 +66,20 @@
   UNSPEC_AESKEYGENASSIST
 
   ;; For PCLMUL support
   UNSPEC_PCLMUL
 
   ;; For AVX support
   UNSPEC_PCMP
   UNSPEC_VPERMIL

PR54442 build_qualified_type produces a non-canonical type

2014-06-09 Thread Marc Glisse

Hello,

in this PR, we end up with 3 types A, B and C such that 
TYPE_CANONICAL(A)=B and TYPE_CANONICAL(B)=C. I don't think that is 
supposed to happen. Here, build_qualified_type fills in TYPE_CANONICAL(A) 
by calling build_qualified_type (so get_qualified_type), which afaics 
isn't guaranteed to return a canonical type.


I doubt the patch can be wrong, but it may be that this is a situation 
that is not supposed to happen and should be fixed elsewhere?


Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-09  Marc Glisse  marc.gli...@inria.fr

PR c++/54442
gcc/
* tree.c (build_qualified_type): Use a canonical type for
TYPE_CANONICAL.
gcc/testsuite/
* g++.dg/pr54442.C: New file.

--
Marc GlisseIndex: gcc/testsuite/g++.dg/pr54442.C
===
--- gcc/testsuite/g++.dg/pr54442.C  (revision 0)
+++ gcc/testsuite/g++.dg/pr54442.C  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+struct S
+{
+  void s (int) const throw ();
+  void s (int) throw ();
+};
+
+typedef int index_t;
+
+void (S::*f) (index_t)   = S::s;
+void (S::*g) (index_t) const = S::s;
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 211374)
+++ gcc/tree.c  (working copy)
@@ -6319,22 +6319,25 @@ build_qualified_type (tree type, int typ
TYPE_ALIGN (t) = TYPE_ALIGN (atomic_type);
}
}
 
   if (TYPE_STRUCTURAL_EQUALITY_P (type))
/* Propagate structural equality. */
SET_TYPE_STRUCTURAL_EQUALITY (t);
   else if (TYPE_CANONICAL (type) != type)
/* Build the underlying canonical type, since it is different
   from TYPE. */
-   TYPE_CANONICAL (t) = build_qualified_type (TYPE_CANONICAL (type),
-  type_quals);
+   {
+ tree c = build_qualified_type (TYPE_CANONICAL (type),
+type_quals);
+ TYPE_CANONICAL (t) = TYPE_CANONICAL (c);
+   }
   else
/* T is its own canonical type. */
TYPE_CANONICAL (t) = t;
 
 }
 
   return t;
 }
 
 /* Create a variant of type T with alignment ALIGN.  */


Re: PR54442 build_qualified_type produces a non-canonical type

2014-06-09 Thread Marc Glisse

On Mon, 9 Jun 2014, Jason Merrill wrote:


On 06/09/2014 10:18 AM, Marc Glisse wrote:

I doubt the patch can be wrong, but it may be that this is a situation
that is not supposed to happen and should be fixed elsewhere?


Seems likely.  What is the difference between the type returned from 
build_qualified_type (TYPE_CANONICAL and it's TYPE_CANONICAL?  I would expect 
them to be the same.


throws tree_list 0x7660e5c8
purpose integer_cst 0x764d6ba0 constant 1

(in what build_qualified_type returns)

--
Marc Glisse


Re: __float128 typeinfo

2014-06-07 Thread Marc Glisse

On Fri, 6 Jun 2014, Paolo Carlini wrote:


On 06/06/2014 04:16 PM, Marc Glisse wrote:

abi_check is broken before my patch (134 incompatible symbols).
Isn't broken for me, though. Likewise, AFAICS, on gcc-testresults. I would 
recommend investigating in some detail what's going on at your end...


Ah, no, abi_check actually passed in the regular bootstrap, it is in the 
debug builds (-O0) that it fails because more than a hundred non-inlined 
inline functions find their way into the library. False alarm.


--
Marc Glisse


__float128 typeinfo

2014-06-06 Thread Marc Glisse

Hello,

here is a new try on adding __float128 typeinfo to libsupc++. The 
front-end part is based on the discussion with Jason yesterday. The 
libstdc++ part is copied from:

https://gcc.gnu.org/ml/libstdc++/2014-04/msg00077.html
(which wasn't reviewed), but I changed the testsuite.

Michael will likely need to make some adjustments to his __float128 patch, 
but I believe it will only be a small configure tweak.


Ramana, does it look safe for ARM? By the way, do you want to remove the 
XFmode defined in arm-modes.def and apparently unused?


Bootstrap+testsuite on x86_64-linux-gnu. I manually checked that 
libstdc++.{a,so} gained exactly the symbols related to typeinfo for 
__float128.


abi_check is broken before my patch (134 incompatible symbols). After it, 
the number of added symbols increases by 7, but the number of incompatible 
symbols remains the same, so I take it as a pass.


2014-06-06  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/cp/
* rtti.c (emit_support_tinfos): Handle __float128.
libstdc++-v3/
* config/abi/pre/float128.ver: New file.
* configure.ac: Use float128.ver when relevant.
* configure: Regenerate.
* testsuite/util/testsuite_abi.cc (check_version): Accept new
CXXABI_FLOAT128_1.3.9 version.

--
Marc GlisseIndex: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 211311)
+++ gcc/cp/rtti.c   (working copy)
@@ -1540,20 +1540,31 @@ emit_support_tinfos (void)
/*tag_scope=*/ts_current, false);
   pop_abi_namespace ();
   if (!COMPLETE_TYPE_P (bltn_type))
 return;
   dtor = CLASSTYPE_DESTRUCTORS (bltn_type);
   if (!dtor || DECL_EXTERNAL (dtor))
 return;
   doing_runtime = 1;
   for (ix = 0; fundamentals[ix]; ix++)
 emit_support_tinfo_1 (*fundamentals[ix]);
+
+  /* Search for an extra floating point type like __float128.  */
+  for (enum machine_mode mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT);
+   mode != VOIDmode;
+   mode = GET_MODE_WIDER_MODE (mode))
+{
+  tree type = c_common_type_for_mode (mode, false);
+  if (type  type != float_type_node  type != double_type_node
+  type != long_double_type_node)
+   emit_support_tinfo_1 (type);
+}
 }
 
 /* Finish a type info decl. DECL_PTR is a pointer to an unemitted
tinfo decl.  Determine whether it needs emitting, and if so
generate the initializer.  */
 
 bool
 emit_tinfo_decl (tree decl)
 {
   tree type = TREE_TYPE (DECL_NAME (decl));
Index: libstdc++-v3/config/abi/pre/float128.ver
===
--- libstdc++-v3/config/abi/pre/float128.ver(revision 0)
+++ libstdc++-v3/config/abi/pre/float128.ver(working copy)
@@ -0,0 +1,10 @@
+# Appended to version file.
+
+CXXABI_FLOAT128_1.3.9 {
+
+# typeinfo and typeinfo name for __float128
+_ZT[IS]g;
+_ZT[IS]Pg;
+_ZT[IS]PKg;
+
+};
Index: libstdc++-v3/configure
===
--- libstdc++-v3/configure  (revision 211311)
+++ libstdc++-v3/configure  (working copy)
@@ -15698,20 +15698,23 @@ $as_echo #define _GLIBCXX_USE_FLOAT128
 $as_echo $enable_float128 6; }
 rm -f conftest*
 
   ac_ext=c
 ac_cpp='$CPP $CPPFLAGS'
 ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext 5'
 ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext 
$LIBS 5'
 ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
+if test $enable_float128 = yes; then
+  port_specific_symbol_files=$port_specific_symbol_files 
\$(top_srcdir)/config/abi/pre/float128.ver
+fi
 
 # Checks for compiler support that doesn't require linking.
 
   # All these tests are for C++; save the language and the compiler flags.
   # The CXXFLAGS thing is suspicious, but based on similar bits previously
   # found in GLIBCXX_CONFIGURE.
 
   ac_ext=cpp
 ac_cpp='$CXXCPP $CPPFLAGS'
 ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext 5'
Index: libstdc++-v3/configure.ac
===
--- libstdc++-v3/configure.ac   (revision 211311)
+++ libstdc++-v3/configure.ac   (working copy)
@@ -146,20 +146,23 @@ GLIBCXX_ENABLE_HOSTED
 # Enable descriptive messages to standard output on termination.
 GLIBCXX_ENABLE_VERBOSE
 
 # Enable compiler support that doesn't require linking.
 GLIBCXX_ENABLE_SJLJ_EXCEPTIONS
 GLIBCXX_ENABLE_PCH($is_hosted)
 GLIBCXX_ENABLE_THREADS
 GLIBCXX_ENABLE_ATOMIC_BUILTINS
 GLIBCXX_ENABLE_DECIMAL_FLOAT
 GLIBCXX_ENABLE_INT128_FLOAT128
+if test $enable_float128 = yes; then
+  port_specific_symbol_files=$port_specific_symbol_files 
\$(top_srcdir)/config/abi/pre/float128.ver
+fi
 
 # Checks for compiler support that doesn't require linking.
 GLIBCXX_CHECK_COMPILER_FEATURES
 
 # Enable all the variable C++ runtime options that don't require linking.
 GLIBCXX_ENABLE_CSTDIO
 GLIBCXX_ENABLE_CLOCALE
 GLIBCXX_ENABLE_ALLOCATOR

Re: __float128 typeinfo

2014-06-06 Thread Marc Glisse

On Fri, 6 Jun 2014, Ramana Radhakrishnan wrote:


On Fri, Jun 6, 2014 at 3:16 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

here is a new try on adding __float128 typeinfo to libsupc++. The front-end
part is based on the discussion with Jason yesterday. The libstdc++ part is
copied from:
https://gcc.gnu.org/ml/libstdc++/2014-04/msg00077.html
(which wasn't reviewed), but I changed the testsuite.

Michael will likely need to make some adjustments to his __float128 patch,
but I believe it will only be a small configure tweak.

Ramana, does it look safe for ARM? By the way, do you want to remove the
XFmode defined in arm-modes.def and apparently unused?


Thanks for this though I know Tejas is actually trying to fix AArch64
first and then we'll move on to ARM for cleaning up all these types -
as you realize it's quite a chunky project.


Nice to know. I am a bit afraid it might be hard without breaking ABI 
compatibility at least a little. Good luck anyway.



A patch to remove XFmode is pre-approved.


Even for such a trivial patch, it would be good to do at least a build to 
check nothing breaks, but I don't have an easy way to do that (debian 
multiarch cross-builds don't work out of the box), it would be better if 
one of the people working on ARM could handle it.



I'll try and have a look at this patch later today or over the weekend
to see if it doesn't affect ARM / AArch64.


Thanks!

--
Marc Glisse


Re: __float128 typeinfo

2014-06-06 Thread Marc Glisse

On Fri, 6 Jun 2014, Jason Merrill wrote:


On 06/06/2014 10:16 AM, Marc Glisse wrote:

 * config/abi/pre/float128.ver: New file.
+CXXABI_FLOAT128_1.3.9 {


What's your rationale for keeping this in a separate version block rather 
than in 1.3.9 (like __int128)?


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43622#c6

Powerpc already has those symbols in CXXABI_LDBL_1.3 (for a type that 
isn't __float128 but (de)mangles that way). That doesn't prevent from 
using CXXABI_1.3.9 in the x86/ia64-specific file though, so I can do that 
if you prefer. If a new target implements __float128 in 4.11 or 4.12, they 
will have to refine the configure test and provide a different .ver file 
to avoid adding symbols to an old version. Or I could try to share the 
CXXABI_LDBL_1.3 name although it doesn't correspond to long double in this 
case.


I have no preference, I'll go with whatever people like.

--
Marc Glisse


Re: __float128 typeinfo

2014-06-06 Thread Marc Glisse

On Fri, 6 Jun 2014, Jason Merrill wrote:


On 06/06/2014 11:58 AM, Marc Glisse wrote:

On Fri, 6 Jun 2014, Jason Merrill wrote:


What's your rationale for keeping this in a separate version block
rather than in 1.3.9 (like __int128)?


Powerpc already has those symbols in CXXABI_LDBL_1.3 (for a type that
isn't __float128 but (de)mangles that way). That doesn't prevent from
using CXXABI_1.3.9 in the x86/ia64-specific file though, so I can do
that if you prefer. If a new target implements __float128 in 4.11 or
4.12, they will have to refine the configure test and provide a
different .ver file to avoid adding symbols to an old version.


Fair enough.  Let's stick with your approach then, but drop the 1.3.9 from 
the name.


Ok. To help with the review, I am appending the updated patch (untested). 
(if we add numeric_limits__float128 later, I think it should go into 
GLIBCXX_FLOAT128_1)


Why is __float128 handled as a target type, anyway?  I'd think it ought to be 
a standard type that just isn't supported on all targets, like __int128.


I don't know. Since it is implemented in software on all platforms as far 
as I know, it shouldn't be that specific to a target. But it isn't just 
__float128, on some targets my patch may start generating extra symbols 
for half-floats (ARM) or __float80.


DJ Delorie's work on __intN may be a good direction for __floatN as well, 
IIUC we will have a global list of 4 __intN types, and the target decides 
the value of N for each of them, so we can refer to those type nodes in 
common code but they are still target-specific. Or maybe there isn't 
enough variety in float types to deserve this.


--
Marc GlisseIndex: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 211311)
+++ gcc/cp/rtti.c   (working copy)
@@ -1540,20 +1540,31 @@ emit_support_tinfos (void)
/*tag_scope=*/ts_current, false);
   pop_abi_namespace ();
   if (!COMPLETE_TYPE_P (bltn_type))
 return;
   dtor = CLASSTYPE_DESTRUCTORS (bltn_type);
   if (!dtor || DECL_EXTERNAL (dtor))
 return;
   doing_runtime = 1;
   for (ix = 0; fundamentals[ix]; ix++)
 emit_support_tinfo_1 (*fundamentals[ix]);
+
+  /* Search for an extra floating point type like __float128.  */
+  for (enum machine_mode mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT);
+   mode != VOIDmode;
+   mode = GET_MODE_WIDER_MODE (mode))
+{
+  tree type = c_common_type_for_mode (mode, false);
+  if (type  type != float_type_node  type != double_type_node
+  type != long_double_type_node)
+   emit_support_tinfo_1 (type);
+}
 }
 
 /* Finish a type info decl. DECL_PTR is a pointer to an unemitted
tinfo decl.  Determine whether it needs emitting, and if so
generate the initializer.  */
 
 bool
 emit_tinfo_decl (tree decl)
 {
   tree type = TREE_TYPE (DECL_NAME (decl));
Index: libstdc++-v3/config/abi/pre/float128.ver
===
--- libstdc++-v3/config/abi/pre/float128.ver(revision 0)
+++ libstdc++-v3/config/abi/pre/float128.ver(working copy)
@@ -0,0 +1,10 @@
+# Appended to version file.
+
+CXXABI_FLOAT128 {
+
+# typeinfo and typeinfo name for __float128
+_ZT[IS]g;
+_ZT[IS]Pg;
+_ZT[IS]PKg;
+
+};
Index: libstdc++-v3/configure
===
--- libstdc++-v3/configure  (revision 211311)
+++ libstdc++-v3/configure  (working copy)
@@ -15698,20 +15698,23 @@ $as_echo #define _GLIBCXX_USE_FLOAT128
 $as_echo $enable_float128 6; }
 rm -f conftest*
 
   ac_ext=c
 ac_cpp='$CPP $CPPFLAGS'
 ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext 5'
 ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext 
$LIBS 5'
 ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
+if test $enable_float128 = yes; then
+  port_specific_symbol_files=$port_specific_symbol_files 
\$(top_srcdir)/config/abi/pre/float128.ver
+fi
 
 # Checks for compiler support that doesn't require linking.
 
   # All these tests are for C++; save the language and the compiler flags.
   # The CXXFLAGS thing is suspicious, but based on similar bits previously
   # found in GLIBCXX_CONFIGURE.
 
   ac_ext=cpp
 ac_cpp='$CXXCPP $CPPFLAGS'
 ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext 5'
Index: libstdc++-v3/configure.ac
===
--- libstdc++-v3/configure.ac   (revision 211311)
+++ libstdc++-v3/configure.ac   (working copy)
@@ -146,20 +146,23 @@ GLIBCXX_ENABLE_HOSTED
 # Enable descriptive messages to standard output on termination.
 GLIBCXX_ENABLE_VERBOSE
 
 # Enable compiler support that doesn't require linking.
 GLIBCXX_ENABLE_SJLJ_EXCEPTIONS
 GLIBCXX_ENABLE_PCH($is_hosted)
 GLIBCXX_ENABLE_THREADS
 GLIBCXX_ENABLE_ATOMIC_BUILTINS
 GLIBCXX_ENABLE_DECIMAL_FLOAT
 GLIBCXX_ENABLE_INT128_FLOAT128
+if test $enable_float128 = yes

Re: PR61385: phiopt drops some PHIs

2014-06-04 Thread Marc Glisse

On Tue, 3 Jun 2014, Jeff Law wrote:


On 06/03/14 08:08, Richard Biener wrote:

All arguments get the same value (and the PHI in middle-bb is surely
a singleton?), so it's way better to re-materialize the PHI as a
gimple assignment at the start of the basic block.  If they are
singletons (which I expect), the easiest way is to propagate their
single argument into all uses and simply remove them.


I was all for propagating, but replace_uses_by calls fold_stmt on the 
using statements (that's normal). If I first move the statement and then 
replace, I may call fold_stmt on a stmt that isn't dominated by the defs 
of all its arguments. If I first replace, the statement I am supposed to 
move may have been modified so I need to find out what happened to it. In 
the end, the assignment seems easier and safer (and not too bad since this 
case almost never happens in practice).



We certainly want to get rid of them :-)

I'd start by finding out which pass left the degenerate, ensure it's not 
marked as SSA_NAME_OCCURS_IN_ABNORMAL_PHI, then propagate it away.


If it's a systemic problem that a particular pass can leave degenerate PHIs, 
then you might want to schedule the phi-only propagator to run after that 
pass.  It does const/copy propagation seeding from degenerate PHIs, so it 
ought to be very fast.


This one comes from VRP2 apparently. But it isn't the only pass that
produces singleton phis. I didn't look at them closely, but I assume LIM 
is normal, DOM maybe less so, etc.


bootstrap+testsuite on x86_64-linux-gnu as usual.

2014-06-04  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/61385
gcc/
* tree-ssa-phiopt.c (value_replacement): Replace singleton PHIs
with assignments.
gcc/testsuite/
* gcc.dg/tree-ssa/pr61385.c: New file.


--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+int a, b, c, d, e, f, g;
+
+int
+fn1 ()
+{
+  int *h = c;
+  for (; c  1; c++)
+{
+  int *i = a, *k = a;
+  f = 0;
+  if (b)
+   return 0;
+  if (*h)
+   {
+ int **j = i;
+ *j = 0;
+ d = 0;
+   }
+  else
+   g = e = 0;
+  if (*h)
+   {
+ int **l = k;
+ *l = g;
+   }
+  d = *h;
+  assert (k == a || k);
+  assert (i);
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 (); 
+  return 0;
+}
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 211178)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -877,21 +877,35 @@ value_replacement (basic_block cond_bb,
operand_equal_for_phi_arg_p (rhs2, cond_lhs)
neutral_element_p (code_def, cond_rhs, true))
  || (arg1 == rhs2
   operand_equal_for_phi_arg_p (rhs1, cond_lhs)
   neutral_element_p (code_def, cond_rhs, false))
  || (operand_equal_for_phi_arg_p (arg1, cond_rhs)
   (operand_equal_for_phi_arg_p (rhs2, cond_lhs)
  || operand_equal_for_phi_arg_p (rhs1, cond_lhs))
   absorbing_element_p (code_def, cond_rhs
 {
+  /* Any PHI node is a singleton and we can replace it with an assignment.
+We don't just call replace_uses_by because it calls fold_stmt and it
+becomes hard to make all the right adjustments in the right order.  */
   gsi = gsi_for_stmt (cond);
+  gimple_stmt_iterator psi = gsi_start_phis (middle_bb);
+  while (!gsi_end_p (psi))
+   {
+ gimple phi_moving = gsi_stmt (psi);
+ tree phi_res = gimple_phi_result (phi_moving);
+ tree phi_arg = gimple_phi_arg_def (phi_moving, 0);
+ gsi_remove (psi, false);
+ gsi_insert_before (gsi, gimple_build_assign (phi_res, phi_arg),
+GSI_SAME_STMT);
+   }
+
   gimple_stmt_iterator gsi_from = gsi_for_stmt (assign);
   gsi_move_before (gsi_from, gsi);
   replace_phi_edge_with_variable (cond_bb, e1, phi, lhs);
   return 2;
 }
 
   return 0;
 }
 
 /*  The function minmax_replacement does the main work of doing the minmax


Re: PR61385: phiopt drops some PHIs

2014-06-04 Thread Marc Glisse

On Wed, 4 Jun 2014, Richard Biener wrote:


So I'd say we should instead simply bail out if the middle-bb has a PHI node.


Sounds good to me, so I am testing the mini-patch I had originally posted 
to bugzilla:


2014-06-04  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/61385
gcc/
* tree-ssa-phiopt.c (value_replacement): Punt if there are PHI nodes.
gcc/testsuite/
* gcc.dg/tree-ssa/pr61385.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+int a, b, c, d, e, f, g;
+
+int
+fn1 ()
+{
+  int *h = c;
+  for (; c  1; c++)
+{
+  int *i = a, *k = a;
+  f = 0;
+  if (b)
+   return 0;
+  if (*h)
+   {
+ int **j = i;
+ *j = 0;
+ d = 0;
+   }
+  else
+   g = e = 0;
+  if (*h)
+   {
+ int **l = k;
+ *l = g;
+   }
+  d = *h;
+  assert (k == a || k);
+  assert (i);
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 (); 
+  return 0;
+}
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 211221)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -842,20 +842,24 @@ value_replacement (basic_block cond_bb,
   /* Now optimize (x != 0) ? x + y : y to just y.
  The following condition is too restrictive, there can easily be another
  stmt in middle_bb, for instance a CONVERT_EXPR for the second argument.  
*/
   gimple assign = last_and_only_stmt (middle_bb);
   if (!assign || gimple_code (assign) != GIMPLE_ASSIGN
   || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
   || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
   !POINTER_TYPE_P (TREE_TYPE (arg0
 return 0;
 
+  /* Punt if there are (degenerate) PHIs in middle_bb, there should not be.  */
+  if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
+return 0;
+
   /* Only transform if it removes the condition.  */
   if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, 
e1))
 return 0;
 
   /* Size-wise, this is always profitable.  */
   if (optimize_bb_for_speed_p (cond_bb)
   /* The special case is useless if it has a low probability.  */
profile_status_for_fn (cfun) != PROFILE_ABSENT
EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN
   /* If assign is cheap, there is no point avoiding it.  */


Re: emit __float128 typeinfo

2014-06-04 Thread Marc Glisse

On Wed, 4 Jun 2014, Jason Merrill wrote:

How about, in emit_support_tinfos, using type_for_mode to check for a TF-mode 
floating point type different from long_double_type_node?


What should I pass as the mode argument? I can't just write TFmode, that 
will fail to compile on platforms that don't define it, and powerpc seems 
likely to call it JFmode instead quite soon. MAX_MODE_FLOAT maybe? But 
then if we configure with long double = __float128, we will miss 
__float80.


Ah, we walk from GET_CLASS_NARROWEST_MODE (MODE_FLOAT) with 
GET_MODE_WIDER_MODE steps and test if the associated type is not in the 
list 0/float/double/long double. I think it should be ok with arm (it 
would be good if they removed their unused XFmode, but I don't even think 
it is necessary).


Is that what you were suggesting? I'll try to write a patch, thanks.

--
Marc Glisse


Re: [PATCH][match-and-simplify]

2014-06-03 Thread Marc Glisse

On Tue, 3 Jun 2014, Richard Biener wrote:


On Mon, 2 Jun 2014, Marc Glisse wrote:




   (plus (bit_not @0) @0)
   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
   { build_int_cst (TREE_TYPE (@0), -1); })
+(match_and_simplify
+  (plus @0 (bit_not @0))
+  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+  { build_int_cst (TREE_TYPE (@0), -1); })


Why not just:

(match_and_simplify
  (plus @0 (bit_not @0))
  { build_all_ones_cst (TREE_TYPE (@0)); })

? Works for vector/complex, I don't know what type a bit_not_expr can have
where the simplification wouldn't be true.


Sure.


Thanks. Sorry for not being clear enough, the same remark applies to 
basically all the bitwise patterns in the file. I could have saved the 
remark for the pre-merge RFC period. I mostly posted it now so the next 
batch of patterns can be directly written in a general way so you 
(Prathamesh or Richard) have fewer to fix later, but it can certainly 
wait.


/* ~~x - x */
(match_and_simplify
  (bit_not (bit_not @0))
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  @0)

We can drop the if line.

/* (x | CST1)  CST2 - (x  CST2) | (CST1  CST2) */
(match_and_simplify
  (bit_and (bit_ior @0 INTEGER_CST_P@1) INTEGER_CST_P@2)
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  (bit_ior (bit_and @0 @2) (bit_and @1 @2)))

Drop the if line and replace INTEGER_CST_P with CONSTANT_CLASS_P.

etc.

--
Marc Glisse


PR61385: phiopt drops some PHIs

2014-06-03 Thread Marc Glisse

Hello,

apparently it is possible to have a PHI in the middle basic block of 
value_replacement, so I need to move it as well when I move the statement 
and remove the block.


Bootstrap and testsuite on x86_64-linux-gnu (re-running for various 
reasons but they had completed successfully yesterday).


2014-06-03  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/61385
gcc/
* tree-ssa-phiopt.c (value_replacement): Copy PHI nodes before
removing the basic block.
gcc/testsuite/
* gcc.dg/tree-ssa/pr61385.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+int a, b, c, d, e, f, g;
+
+int
+fn1 ()
+{
+  int *h = c;
+  for (; c  1; c++)
+{
+  int *i = a, *k = a;
+  f = 0;
+  if (b)
+   return 0;
+  if (*h)
+   {
+ int **j = i;
+ *j = 0;
+ d = 0;
+   }
+  else
+   g = e = 0;
+  if (*h)
+   {
+ int **l = k;
+ *l = g;
+   }
+  d = *h;
+  assert (k == a || k);
+  assert (i);
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 (); 
+  return 0;
+}
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 211178)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -877,20 +877,39 @@ value_replacement (basic_block cond_bb,
operand_equal_for_phi_arg_p (rhs2, cond_lhs)
neutral_element_p (code_def, cond_rhs, true))
  || (arg1 == rhs2
   operand_equal_for_phi_arg_p (rhs1, cond_lhs)
   neutral_element_p (code_def, cond_rhs, false))
  || (operand_equal_for_phi_arg_p (arg1, cond_rhs)
   (operand_equal_for_phi_arg_p (rhs2, cond_lhs)
  || operand_equal_for_phi_arg_p (rhs1, cond_lhs))
   absorbing_element_p (code_def, cond_rhs
 {
+  /* Move potential PHI nodes.  */
+  gimple_stmt_iterator psi = gsi_start_phis (middle_bb);
+  while (!gsi_end_p (psi))
+   {
+ gimple phi_moving = gsi_stmt (psi);
+ gimple newphi = create_phi_node (gimple_phi_result (phi_moving),
+  cond_bb);
+ int nargs = cond_bb-preds-length();
+ location_t loc = gimple_phi_arg_location (phi_moving, 0);
+ tree phi_arg = gimple_phi_arg_def (phi_moving, 0);
+ for (int i = 0; i  nargs; ++i)
+   {
+ edge e = (*cond_bb-preds)[i];
+ add_phi_arg (newphi, phi_arg, e, loc);
+   }
+ update_stmt (newphi);
+ gsi_remove (psi, false);
+   }
+
   gsi = gsi_for_stmt (cond);
   gimple_stmt_iterator gsi_from = gsi_for_stmt (assign);
   gsi_move_before (gsi_from, gsi);
   replace_phi_edge_with_variable (cond_bb, e1, phi, lhs);
   return 2;
 }
 
   return 0;
 }
 


Re: calloc = malloc + memset

2014-06-03 Thread Marc Glisse

Ping?

On Sat, 17 May 2014, Marc Glisse wrote:


Ping Jakub?
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01104.html

On Wed, 23 Apr 2014, Richard Biener wrote:


On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote:

Thanks for the comments!


On Fri, 18 Apr 2014, Jakub Jelinek wrote:


The passes.def change makes me a little bit nervous, but if it works,
perhaps.



Would you prefer running the pass twice? I thought there would be less
resistance to moving the pass than duplicating it.


Indeed.  I think placing it after loops and CSE (thus what you have done)
makes sense.  strlenopt itself shouldn't enable much additional
optimizations.  But well, pass ordering is always tricky.

Didn't look at the rest of the changes, but Jakub is certainly able to
approve the patch so I leave it to him.


--
Marc Glisse


Re: PR61385: phiopt drops some PHIs

2014-06-03 Thread Marc Glisse

On Tue, 3 Jun 2014, Richard Biener wrote:


On Tue, Jun 3, 2014 at 3:48 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

apparently it is possible to have a PHI in the middle basic block of
value_replacement, so I need to move it as well when I move the statement
and remove the block.

Bootstrap and testsuite on x86_64-linux-gnu (re-running for various reasons
but they had completed successfully yesterday).

2014-06-03  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/61385
gcc/
* tree-ssa-phiopt.c (value_replacement): Copy PHI nodes before
removing the basic block.
gcc/testsuite/
* gcc.dg/tree-ssa/pr61385.c: New file.

--
Marc Glisse
Index: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+int a, b, c, d, e, f, g;
+
+int
+fn1 ()
+{
+  int *h = c;
+  for (; c  1; c++)
+{
+  int *i = a, *k = a;
+  f = 0;
+  if (b)
+   return 0;
+  if (*h)
+   {
+ int **j = i;
+ *j = 0;
+ d = 0;
+   }
+  else
+   g = e = 0;
+  if (*h)
+   {
+ int **l = k;
+ *l = g;
+   }
+  d = *h;
+  assert (k == a || k);
+  assert (i);
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 ();
+  return 0;
+}
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 211178)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -877,20 +877,39 @@ value_replacement (basic_block cond_bb,
operand_equal_for_phi_arg_p (rhs2, cond_lhs)
neutral_element_p (code_def, cond_rhs, true))
  || (arg1 == rhs2
   operand_equal_for_phi_arg_p (rhs1, cond_lhs)
   neutral_element_p (code_def, cond_rhs, false))
  || (operand_equal_for_phi_arg_p (arg1, cond_rhs)
   (operand_equal_for_phi_arg_p (rhs2, cond_lhs)
  || operand_equal_for_phi_arg_p (rhs1, cond_lhs))
   absorbing_element_p (code_def, cond_rhs
 {
+  /* Move potential PHI nodes.  */
+  gimple_stmt_iterator psi = gsi_start_phis (middle_bb);
+  while (!gsi_end_p (psi))
+   {
+ gimple phi_moving = gsi_stmt (psi);
+ gimple newphi = create_phi_node (gimple_phi_result (phi_moving),
+  cond_bb);
+ int nargs = cond_bb-preds-length();
+ location_t loc = gimple_phi_arg_location (phi_moving, 0);
+ tree phi_arg = gimple_phi_arg_def (phi_moving, 0);
+ for (int i = 0; i  nargs; ++i)
+   {
+ edge e = (*cond_bb-preds)[i];
+ add_phi_arg (newphi, phi_arg, e, loc);


All arguments get the same value (and the PHI in middle-bb is surely
a singleton?),


Yes, there is a single incoming edge to middle-bb.


so it's way better to re-materialize the PHI as a
gimple assignment at the start of the basic block.


I thought there might be a reason why it was a PHI and not an assignment 
so I wasn't sure doing that would be ok. It is indeed much easier, so I'll 
do that...



If they are
singletons (which I expect), the easiest way is to propagate their
single argument into all uses and simply remove them.


... or indeed that, now that I have found a function called 
replace_uses_by which should do exactly what I need.


Thanks,

--
Marc Glisse


Re: [PATCH][match-and-simplify]

2014-06-02 Thread Marc Glisse



   (plus (bit_not @0) @0)
   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
   { build_int_cst (TREE_TYPE (@0), -1); })
+(match_and_simplify
+  (plus @0 (bit_not @0))
+  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+  { build_int_cst (TREE_TYPE (@0), -1); })


Why not just:

(match_and_simplify
  (plus @0 (bit_not @0))
  { build_all_ones_cst (TREE_TYPE (@0)); })

? Works for vector/complex, I don't know what type a bit_not_expr can have 
where the simplification wouldn't be true.


--
Marc Glisse


Re: [patch] libstdc++/61374 fix string_view conversion and update to latest draft

2014-06-01 Thread Marc Glisse

On Sun, 1 Jun 2014, Daniel Krügler wrote:


2014-06-01 19:24 GMT+02:00 Jonathan Wakely jwak...@redhat.com:

Tested x86_64-linux, committed to trunk.

This should probably go on the 4.9 branch too, although we could leave
the old default cosntructor semantics and just fix the conversion
operator.


Looking at the comparison functions of basic_string_view I noticed
that these are not constexpr (nor seem to be other functions such as
find()).


We may want to wait until the compiler supports C++14 constexpr, trying to 
implement it with C++11 would be doable but not so useful in my opinion.


--
Marc Glisse


Re: [C++11, C++14 PATCH 2/3] Support for SD-6: SG10 Feature Test Recommendations - c-family and testsuite

2014-05-31 Thread Marc Glisse

On Fri, 30 May 2014, Ed Smith-Rowland wrote:


+ cpp_undef (pfile, __cpp_constexpr);
+ cpp_define (pfile, __cpp_constexpr=201304);


Could you set the other value in an else branch to avoid a def undef redef 
game?


Also, I am pretty sure that gcc doesn't support the latest constexpr, we 
shouldn't define those macros lightly.


--
Marc Glisse


Re: [C++11, C++14 PATCH 3/3] Support for SD-6: SG10 Feature Test Recommendations - libstdc++

2014-05-31 Thread Marc Glisse

On Fri, 30 May 2014, Ed Smith-Rowland wrote:


+#define __cpp_lib_constexpr_functions 201210


appears quite a few times, I believe it will warn with -Wsystem-headers. I 
think you should check if it is already defined (or undefine it first).


(you forgot to Cc: libstdc++)

--
Marc Glisse


Re: [PATCH] Warn on and fold NULL checks against inline functions

2014-05-26 Thread Marc Glisse

On Mon, 26 May 2014, Patrick Palka wrote:


This patch teaches the C++ frontend to warn on NULL checks against
inline functions and it teaches the middle-end to fold NULL checks
against inline functions.  These two things are currently done for
non-inline C++ functions, but inline functions are exceptional in that
the C++ frontend marks them as weak, and NULL checks against weak
symbols cannot be folded in general because the symbol may be mapped to
NULL at link-time.

But in the case of an inline function, the fact that it's a weak symbol
is an implementation detail.  And because it is not permitted to
explicitly give an inline function the weak attribute (see
handle_weak_attribute), in order to acheive $SUBJECT it suffices to
assume that all inline functions are non-null, which is what this patch
does.


Thanks for working on this. You may want to mention PR 59948 in the 
ChangeLog. In the comments of that PR, Honza seems to have a different 
idea of what the fold-const test should look like.


--
Marc Glisse


Re: [PATCH] Warn on and fold NULL checks against inline functions

2014-05-26 Thread Marc Glisse

On Mon, 26 May 2014, Patrick Palka wrote:


On Mon, May 26, 2014 at 4:26 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 26 May 2014, Patrick Palka wrote:


This patch teaches the C++ frontend to warn on NULL checks against
inline functions and it teaches the middle-end to fold NULL checks
against inline functions.  These two things are currently done for
non-inline C++ functions, but inline functions are exceptional in that
the C++ frontend marks them as weak, and NULL checks against weak
symbols cannot be folded in general because the symbol may be mapped to
NULL at link-time.

But in the case of an inline function, the fact that it's a weak symbol
is an implementation detail.  And because it is not permitted to
explicitly give an inline function the weak attribute (see
handle_weak_attribute), in order to acheive $SUBJECT it suffices to
assume that all inline functions are non-null, which is what this patch
does.



Thanks for working on this. You may want to mention PR 59948 in the
ChangeLog. In the comments of that PR, Honza seems to have a different idea
of what the fold-const test should look like.


I wonder, then, if the test in c-common.c is also broken...  Thanks
for the heads up about the PR.


I think I should also mention PR 61144 which says the test can be wrong in 
the other direction (I didn't look at it closely).


But even if the tests are broken, your patch may still be an improvement 
while waiting for the rewrite...


--
Marc Glisse


Re: emit __float128 typeinfo

2014-05-23 Thread Marc Glisse

On Wed, 21 May 2014, Jason Merrill wrote:


On 04/25/2014 05:04 AM, Marc Glisse wrote:

Does this approach seem ok, or do we need to try harder to find a way to
get this typeinfo into libsupc++?


The latter, I think; these are base types, so they should go in the library.


Hmm, ok. Because of the arm target, we can't just use the 
register_builtin_type hook as it is. The things I can think of right now 
are:


1) change the prototype of register_builtin_type so it takes an extra bool 
parameter that says if we want to generate runtime stuff like typeinfo (in 
addition to what register_builtin_type is already doing). I would then 
update all target calls with , false and let target maintainers switch 
to true when they are ready.


or

2) create a new register_builtin_type_runtime lang hook that would be 
defined only in (obj-)c++ and would be used only to generate typeinfo.


Does one of those seem acceptable?

Another alternative might be to wait for the intN_t work to land and do 
the same for floatN_t, but that's too big for me.


--
Marc Glisse


Re: [i386] Replace builtins with vector extensions

2014-05-17 Thread Marc Glisse

Ping

On Mon, 28 Apr 2014, Marc Glisse wrote:


Ping
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00590.html

(note that ARM seems to be doing the same thing for their neon intrinsics, 
see Ramana's patch series posted today)


On Fri, 11 Apr 2014, Marc Glisse wrote:


Hello,

the previous discussion on the topic was before we added all those #pragma 
target in *mmintrin.h:


http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html

I believe that removes a large part of the arguments against it. Note that 
I only did a few of the more obvious intrinsics, I am waiting to see if 
this patch is accepted before doing more.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-11  Marc Glisse  marc.gli...@inria.fr

* config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps,
_mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions
instead of builtins.
	* config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, 
_mm_storeh_pd,

_mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd,
_mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64,
_mm_loadh_pd, _mm_loadl_pd): Likewise.
(_mm_sqrt_sd): Fix comment.


--
Marc Glisse


Re: calloc = malloc + memset

2014-05-17 Thread Marc Glisse

Ping Jakub?
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01104.html

On Wed, 23 Apr 2014, Richard Biener wrote:


On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote:

Thanks for the comments!


On Fri, 18 Apr 2014, Jakub Jelinek wrote:


The passes.def change makes me a little bit nervous, but if it works,
perhaps.



Would you prefer running the pass twice? I thought there would be less
resistance to moving the pass than duplicating it.


Indeed.  I think placing it after loops and CSE (thus what you have done)
makes sense.  strlenopt itself shouldn't enable much additional
optimizations.  But well, pass ordering is always tricky.

Didn't look at the rest of the changes, but Jakub is certainly able to
approve the patch so I leave it to him.

Thanks,
Richard.


By the way, I think even
passes we run only once should have the required functions implemented so
they can be run several times (at least most of them), in case users want to
do that in plugins. I was surprised when I tried adding a second strlen pass
and the compiler refused.



--- gcc/testsuite/g++.dg/tree-ssa/calloc.C  (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/calloc.C  (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile { target c++11 } } */
+/* { dg-options -O3 -fdump-tree-optimized } */
+
+#include new
+#include vector
+#include cstdlib
+
+void g(void*);
+inline void* operator new(std::size_t sz)
+{
+  void *p;
+
+  if (sz == 0)
+sz = 1;
+
+  // Slightly modified from the libsupc++ version, that one has 2 calls
+  // to malloc which makes it too hard to optimize.
+  while ((p = std::malloc (sz)) == 0)
+{
+  std::new_handler handler = std::get_new_handler ();
+  if (! handler)
+throw std::bad_alloc();
+  handler ();
+}
+  return p;
+}
+
+void f(void*p,int n){
+  new(p)std::vectorint(n);
+}
+
+/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */
+/* { dg-final { scan-tree-dump-not malloc optimized } } */
+/* { dg-final { scan-tree-dump-not memset optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */



This looks to me way too much fragile, any time the libstdc++
or glibc headers change a little bit, you might need to adjust the
dg-final directives.  Much better would be if you just provided
the prototypes yourself and subset of the std::vector you really need for
the testcase.  You can throw some class or int, it doesn't have to be
std::bad_alloc, etc.



I don't understand what seems so fragile to you. There is a single function
in the .optimized dump, which just calls calloc in a loop. It doesn't seem
that likely that a change in glibc/libstdc++ would make an extra memset pop
up. A change in libstdc++ could easily prevent the optimization completely
(I'd like to hope we can avoid that, half of the purpose of the testcase was
making sure libstdc++ didn't change in a bad way), but I don't really see
how it could keep it in a way that requires tweaking dg-final.

While trying to write a standalone version, I hit again many missed
optimizations, getting such nice things in the .optimized dump as:

  _12 = p_13 + sz_7;
  if (_12 != p_13)

or:

  _12 = p_13 + sz_7;
  _30 = (unsigned long) _12;
  _9 = p_13 + 4;
  _10 = (unsigned long) _9;
  _11 = _30 - _10;
  _22 = _11 /[ex] 4;
  _21 = _22;
  _40 = _21 + 1;
  _34 = _40 * 4;

It is embarrassing... I hope the combiner GSoC will work well and we can
just add a dozen patterns to handle this before 4.10.



--- gcc/testsuite/gcc.dg/strlenopt-9.c  (revision 208772)
+++ gcc/testsuite/gcc.dg/strlenopt-9.c  (working copy)
@@ -11,21 +11,21 @@ fn1 (int r)
  optimized away.  */
   return strchr (p, '\0');
 }

 __attribute__((noinline, noclone)) size_t
 fn2 (int r)
 {
   char *p, q[10];
   strcpy (q, abc);
   p = r ? a : q;
-  /* String length for p varies, therefore strlen below isn't
+  /* String length is constant for both alternatives, and strlen is
  optimized away.  */
   return strlen (p);



Is this because of jump threading?



It is PRE that turns:

  if (r_4(D) == 0)
goto bb 5;
  else
goto bb 3;

  bb 5:
  goto bb 4;

  bb 3:

  bb 4:
  # p_1 = PHI q(5), a(3)
  _5 = __builtin_strlen (p_1);

into:

  if (r_4(D) == 0)
goto bb 5;
  else
goto bb 3;

  bb 5:
  _7 = __builtin_strlen (q);
  pretmp_8 = _7;
  goto bb 4;

  bb 3:

  bb 4:
  # p_1 = PHI q(5), a(3)
  # prephitmp_9 = PHI pretmp_8(5), 1(3)
  _5 = prephitmp_9;

It says:

Found partial redundancy for expression
{call_expr__builtin_strlen,p_1}@.MEM_3 (0005)



--- gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+#include stdlib.h
+#include string.h



Even this I find unsafe.  The strlenopt*.c tests use it's custom
strlenopt.h header for a reason, you might just add a calloc
prototype in there and use that header.



Might as well use __builtin_* then.



+/* Handle a call

Re: [v3] Slightly improve operator new

2014-05-17 Thread Marc Glisse

Ping.

On Tue, 15 Apr 2014, Marc Glisse wrote:


Ping
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00051.html


On Sun, 2 Mar 2014, Marc Glisse wrote:


Hello,

inlining operator new (with LTO or otherwise), I noticed that it has a 
complicated implementation, which makes it hard to use this inlined code 
for optimizations. This patch does two things:


1) there are 2 calls to malloc, I am turning them into just one. At -Os, it 
does not change the generated code (RTL optimizers manage to merge the 
calls to malloc). At other levels (-O2, -O3, and especially with -g) it 
gives a smaller object file. And with just one malloc, some optimizations 
become much easier (see my recent calloc patch for instance).


2) malloc is predicted to return null 19 times out of 20 because of the 
loop (that didn't change with the patch), so I am adding __builtin_expect 
to let gcc optimize the fast path.


Further discussion:

a) I didn't add __builtin_expect for the test (sz == 0), it didn't change 
the generated code in my limited test. I was wondering if this test is 
necessary (new doesn't seem to ever call operator new(0)) or could be moved 
to operator new[] (new type[0] does call operator new[](0)), but since one 
can call operator new directly, it has to be protected indeed, so let's 
forget this point ;-)
(too bad malloc is replacable, so we can't use the fact that glibc already 
does the right thing)


b) I have a bit of trouble parsing the standard. Is the nothrow operator 
new supposed to call the regular operator new? In particular, if a user 
replaces only the throwing operator new, should the nothrow operator new 
automatically call that function? That's not what we are currently doing 
(and it would be a perf regression).


Required behavior: Return a non-null pointer to suitably aligned storage 
(3.7.4), or else return a null pointer. This nothrow version of operator 
new returns a pointer obtained as if acquired from the (possibly replaced) 
ordinary version. This requirement is binding on a replacement version of 
this function.


Default behavior: Calls operator new(size). If the call returns normally, 
returns the result of that call. Otherwise, returns a null pointer.




Passes bootstrap+testsuite on x86_64-linux-gnu. Stage 1?

2014-03-03  Marc Glisse  marc.gli...@inria.fr

	* libsupc++/new_op.cc: Factor the calls to malloc, use 
__builtin_expect.

* libsupc++/new_opnt.cc: Likewise.


--
Marc Glisse


Re: PR61140: check the phi is unique in value_replacement

2014-05-11 Thread Marc Glisse

On Sat, 10 May 2014, Andrew Pinski wrote:


On Sat, May 10, 2014 at 3:53 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

in my recent phiopt patch enhancing value_replacement to optimize
x!=0?x+y:y, I forgot to check that there is no other PHI (not sure how I
managed to miss that since I copy-pasted the line just below the test).

If there are other phi nodes (with different arguments for those 2
branches), it would be possible to replace the phi argument and stop there
(as value_replacement does for its other transformation). However, I am
chosing to punt. The cost analysis would be different, and I wrote the
transformation assuming that this single-phi test was already done higher in
the function.


I think we should have some good cost analysis because for this
testcase, we should be able to get only one conditional move but right
now with punting we don't.


That's true. But note that the transformation is already very limited 
(gives up if there is a second statement in the middle bb, even a simple 
cast), so I would like to first quickly get the wrong-code regression out 
of the way, and we can make improvements afterwards (though we can of 
course start discussing them now).


It seems like if there is only 1 extra non-singleton phi (in addition to 
the one we are transforming) and the target supports conditional move for 
this type and the direct branch has proba  50%, with the other 
restrictions already in place, we could go ahead. How does that sound? Not 
too specialized? If there are many phis, conditional moves are out, the 
branch will stay, and unless the edge to the operation has a very high 
proba, it doesn't seem like a good idea to pull the operation out of the 
branch.


--
Marc Glisse


PR61140: check the phi is unique in value_replacement

2014-05-10 Thread Marc Glisse

Hello,

in my recent phiopt patch enhancing value_replacement to optimize 
x!=0?x+y:y, I forgot to check that there is no other PHI (not sure how I 
managed to miss that since I copy-pasted the line just below the test).


If there are other phi nodes (with different arguments for those 2 
branches), it would be possible to replace the phi argument and stop there 
(as value_replacement does for its other transformation). However, I am 
chosing to punt. The cost analysis would be different, and I wrote the 
transformation assuming that this single-phi test was already done higher 
in the function.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-05-12  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/61140
gcc/
* tree-ssa-phiopt.c (value_replacement): Punt on multiple phis.
gcc/testsuite/
* gcc.dg/tree-ssa/pr61140.c: New file.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61140.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61140.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61140.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options -O2 } */
+
+int a[1] = { 1 }, b = 1, c; 
+
+int
+main ()
+{
+  for (; c  1; c++)
+if (a[0])
+{
+  a[0] = 1;
+  b = 0;
+}
+  if (b)
+__builtin_abort ();
+  return 0;
+}
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 210301)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -842,20 +842,24 @@ value_replacement (basic_block cond_bb,
   /* Now optimize (x != 0) ? x + y : y to just y.
  The following condition is too restrictive, there can easily be another
  stmt in middle_bb, for instance a CONVERT_EXPR for the second argument.  
*/
   gimple assign = last_and_only_stmt (middle_bb);
   if (!assign || gimple_code (assign) != GIMPLE_ASSIGN
   || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
   || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
   !POINTER_TYPE_P (TREE_TYPE (arg0
 return 0;
 
+  /* Only transform if it removes the condition.  */
+  if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, 
e1))
+return 0;
+
   /* Size-wise, this is always profitable.  */
   if (optimize_bb_for_speed_p (cond_bb)
   /* The special case is useless if it has a low probability.  */
profile_status_for_fn (cfun) != PROFILE_ABSENT
EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN
   /* If assign is cheap, there is no point avoiding it.  */
estimate_num_insns (assign, eni_time_weights)
 = 3 * estimate_num_insns (cond, eni_time_weights))
 return 0;
 


Re: [patch] change specific int128 - generic intN

2014-05-09 Thread Marc Glisse

On Thu, 8 May 2014, DJ Delorie wrote:


Assuming that the formula sizeof(type)*char_bit==precision works for all


It doesn't.  THe MSP430 has __int20 for example.


Well, it wasn't a hard requirement, it is just that the library has to 
use a more complicated way to get the precision (use (unsigned TYPE)(-1) 
to get the unsigned max and compute the precision from that, probably).



Would it be acceptable for the compiler to always define a set of
macros for each of the intN types?


What set of macros do you have in mind?


I would have thought that would be discouraged,


If we can't think of another way...

--
Marc Glisse


Re: emit __float128 typeinfo

2014-05-08 Thread Marc Glisse

Ping
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01651.html


On Fri, 25 Apr 2014, Marc Glisse wrote:


On Fri, 25 Apr 2014, Marc Glisse wrote:

the previous patch had to be reverted as it broke the strange handling of 
vectors in the ARM target. This new patch should be much more conservative 
I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE 
know that it isn't available in libsupc++. There are 2 versions, both 
regtested fine.


Does this approach seem ok, or do we need to try harder to find a way to 
get this typeinfo into libsupc++?


2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (emit_support_tinfos): Move the array...
(fundamentals): ... and make it global.
(typeinfo_in_lib_p): Use it.

2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a
hardcoded list of available types.


It seems better with a TYPE_CANONICAL in there. It passed bootstrap and the 
testsuite is running.


--
Marc Glisse


Re: [patch] change specific int128 - generic intN

2014-05-08 Thread Marc Glisse

On Thu, 8 May 2014, DJ Delorie wrote:


The libstdc++v3 headers have __int128 hard-coded all over the place.
Any suggestions on parameterizing those for the __intN types that are
actually supported by the target?


(adding libstdc++@ in Cc:)

The first idea that comes to mind (so possibly not such a good one) is to 
provide predefined macros:

#define __EXTENDED_INTEGER_TYPE_1__ __int24
#define __EXTENDED_INTEGER_TYPE_2__ __int128
#undef __EXTENDED_INTEGER_TYPE_3__

Assuming that the formula sizeof(type)*char_bit==precision works for all 
types, it should be sufficient for the library (abs, type_traits and 
numeric_limits).


--
Marc Glisse


Re: Optimize n?rotate(x,n):x

2014-05-01 Thread Marc Glisse

Hello,

here is the latest version. Reviewers seemed happy with different 
versions, so I went for the simplest one. We only give up on the 
transformation if we are optimizing for speed, the short-cut has 
probability 50% and the operation the branch is avoiding is expensive 
(i.e. only divisions, with the current estimate_num_insns). The optab 
checks are gone, they should eventually be added to estimate_num_insns 
instead.


I believe this is covered by the previous ok, but I won't commit 
anything before Tuesday.


It would have been more general (and shorter) to call fold after 
substituting the tested value and see if the result matches the other phi 
argument (it might handle (a!=b)?a|b:a for instance), instead of the 
*_element_p tests. I may try that later, but I guess the current patch is 
good enough for now.



2014-05-06  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/59100
gcc/
* tree-ssa-phiopt.c: Include tree-inline.h.
(neutral_element_p, absorbing_element_p): New functions.
(value_replacement): Handle conditional binary operations with a
neutral or absorbing element.
gcc/testsuite/
* gcc.dg/tree-ssa/phi-opt-12.c: New file.
* gcc.dg/tree-ssa/phi-opt-13.c: Likewise.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options -O -fdump-tree-phiopt1 } */
+
+int f(int a, int b, int c) {
+  if (c  5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x  n) | (x  (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+return 0;
+  else
+return a  b;
+}
+
+/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */
+/* { dg-final { cleanup-tree-dump phiopt1 } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+// Division is expensive
+long f(long a, long b) {
+  if (__builtin_expect(b == 1, 1)) return a;
+  return a / b;
+}
+
+/* { dg-final { scan-tree-dump-times goto  2 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 209979)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -47,20 +47,21 @@ along with GCC; see the file COPYING3.
 #include tree-pass.h
 #include langhooks.h
 #include domwalk.h
 #include cfgloop.h
 #include tree-data-ref.h
 #include gimple-pretty-print.h
 #include insn-config.h
 #include expr.h
 #include optabs.h
 #include tree-scalar-evolution.h
+#include tree-inline.h
 
 #ifndef HAVE_conditional_move
 #define HAVE_conditional_move (0)
 #endif
 
 static unsigned int tree_ssa_phiopt_worker (bool, bool);
 static bool conditional_replacement (basic_block, basic_block,
 edge, edge, gimple, tree, tree);
 static int value_replacement (basic_block, basic_block,
  edge, edge, gimple, tree, tree);
@@ -652,20 +653,78 @@ operand_equal_for_value_replacement (con
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;
 
   tmp = gimple_assign_rhs2 (def);
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;
 
   return false;
 }
 
+/* Returns true if ARG is a neutral element for operation CODE
+   on the RIGHT side.  */
+
+static bool
+neutral_element_p (tree_code code, tree arg, bool right)
+{
+  switch (code)
+{
+case PLUS_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  return integer_zerop (arg);
+
+case LROTATE_EXPR:
+case RROTATE_EXPR:
+case LSHIFT_EXPR:
+case RSHIFT_EXPR:
+case MINUS_EXPR:
+case POINTER_PLUS_EXPR:
+  return right  integer_zerop (arg);
+
+case MULT_EXPR:
+  return integer_onep (arg);
+
+case TRUNC_DIV_EXPR:
+case CEIL_DIV_EXPR:
+case FLOOR_DIV_EXPR:
+case ROUND_DIV_EXPR:
+case EXACT_DIV_EXPR:
+  return right  integer_onep (arg);
+
+case BIT_AND_EXPR:
+  return integer_all_onesp (arg);
+
+default:
+  return false;
+}
+}
+
+/* Returns true if ARG is an absorbing element for operation CODE.  */
+
+static bool
+absorbing_element_p (tree_code code, tree arg)
+{
+  switch (code)
+{
+case BIT_IOR_EXPR:
+  return integer_all_onesp (arg);
+
+case MULT_EXPR:
+case BIT_AND_EXPR:
+  return integer_zerop (arg);
+
+default:
+  return false

Re: [PATCH], RFC, add support for __float128/__ibm128 types on PowerPC

2014-04-29 Thread Marc Glisse

Minor detail:

+  if ((flags  OPTION_MASK_FLOAT128) != 0)
+rs6000_define_or_undefine_macro (define_p, __FLOAT128__);

I recently added __SIZEOF_FLOAT128__ to the x86 target to advertise the 
availability of __float128, it would be good if we had a common macro (it 
can be in addition to __FLOAT128__). If you really don't like it, we can 
ask x86 maintainers if they like __FLOAT128__.


--
Marc Glisse


Re: [i386] Replace builtins with vector extensions

2014-04-28 Thread Marc Glisse

Ping
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00590.html

(note that ARM seems to be doing the same thing for their neon 
intrinsics, see Ramana's patch series posted today)


On Fri, 11 Apr 2014, Marc Glisse wrote:


Hello,

the previous discussion on the topic was before we added all those #pragma 
target in *mmintrin.h:


http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html

I believe that removes a large part of the arguments against it. Note that I 
only did a few of the more obvious intrinsics, I am waiting to see if this 
patch is accepted before doing more.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-11  Marc Glisse  marc.gli...@inria.fr

* config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps,
_mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions
instead of builtins.
	* config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, 
_mm_storeh_pd,

_mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd,
_mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64,
_mm_loadh_pd, _mm_loadl_pd): Likewise.
(_mm_sqrt_sd): Fix comment.


--
Marc Glisse


Re: [PATCH] Implement -fsanitize=float-divide-by-zero

2014-04-28 Thread Marc Glisse

On Mon, 28 Apr 2014, Marek Polacek wrote:


This patch implements -fsanitize=float-divide-by-zero option that can
be used to detect division by zero even when using floating types.
Most of the code in ubsan_instrument_division was ready for this
so this was mainly about handling REAL_TYPE there.


Ideally this would all be unneeded, you would compile your program with 
pragma stdc fenv_access on, on glibc you would call 
feenableexcept(FE_DIVBYZERO) at the beginning of the program, done (I may 
have listed the wrong function, I am always confused by those names).


But I guess we won't be there anytime soon, so in the mean time...


Since division by a floating point zero can be a valid way of
obtaining infinities and NaNs, I'm not 100% sure this ought to be
enabled by default (that is, enabled when -fsanitize=undefined is
specified).


Please don't enable it with undefined. As you say, it is well defined 
(except when finite-math-only is in effect). If you want a meta-category 
for this kind of valid thing, maybe -fsanitize=unusual or 
-fsanitize=suspicious.


--
Marc Glisse


Re: version typeinfo for 128bit types

2014-04-25 Thread Marc Glisse

On Fri, 25 Apr 2014, Jonathan Wakely wrote:


On 24/04/14 20:03 +0200, Marc Glisse wrote:
Grep seems to indicate that the manual is the only other place that needs 
updating, but that can wait.


Is this patch ok, assuming the tests pass?


OK, and sorry for forgetting about that file in the testsuite!


Done.

You might also want to update the latestp check to GLIBCXX_3.4.21 and 
CXXABI_TM_2 since I think adding symbols to GLIBCXX_3.4.20 or CXXABI_TM_1 
at this point should make a noisy error.


--
Marc Glisse


emit __float128 typeinfo

2014-04-25 Thread Marc Glisse

Hello,

the previous patch had to be reverted as it broke the strange handling of 
vectors in the ARM target. This new patch should be much more conservative 
I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE 
know that it isn't available in libsupc++. There are 2 versions, both 
regtested fine.


Does this approach seem ok, or do we need to try harder to find a way to 
get this typeinfo into libsupc++?


2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (emit_support_tinfos): Move the array...
(fundamentals): ... and make it global.
(typeinfo_in_lib_p): Use it.

2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a
hardcoded list of available types.

--
Marc GlisseIndex: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 209787)
+++ gcc/cp/rtti.c   (working copy)
@@ -1042,42 +1042,68 @@ class_initializer (tinfo_s *ti, tree tar
   for (i = 0; i  n; i++)
 CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, va_arg (extra_inits, tree));
   va_end (extra_inits);
 
   init = build_constructor (init_list_type_node, v);
   TREE_CONSTANT (init) = 1;
   TREE_STATIC (init) = 1;
   return init;
 }
 
+/* List of types for which the typeinfo should be placed in the
+   runtime library.
+   Dummy static variable so we can put nullptr in the array; it will be
+   set before we actually start to walk the array.  */
+static tree *const fundamentals[] =
+{
+  void_type_node,
+  boolean_type_node,
+  wchar_type_node, char16_type_node, char32_type_node,
+  char_type_node, signed_char_type_node, unsigned_char_type_node,
+  short_integer_type_node, short_unsigned_type_node,
+  integer_type_node, unsigned_type_node,
+  long_integer_type_node, long_unsigned_type_node,
+  long_long_integer_type_node, long_long_unsigned_type_node,
+  int128_integer_type_node, int128_unsigned_type_node,
+  float_type_node, double_type_node, long_double_type_node,
+  dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
+  nullptr_type_node,
+  0
+};
+
 /* Returns true if the typeinfo for type should be placed in
the runtime library.  */
 
 static bool
 typeinfo_in_lib_p (tree type)
 {
   /* The typeinfo objects for `T*' and `const T*' are in the runtime
  library for simple types T.  */
   if (TYPE_PTR_P (type)
(cp_type_quals (TREE_TYPE (type)) == TYPE_QUAL_CONST
  || cp_type_quals (TREE_TYPE (type)) == TYPE_UNQUALIFIED))
 type = TREE_TYPE (type);
 
   switch (TREE_CODE (type))
 {
-case INTEGER_TYPE:
 case BOOLEAN_TYPE:
-case REAL_TYPE:
 case VOID_TYPE:
 case NULLPTR_TYPE:
   return true;
 
+case INTEGER_TYPE:
+case REAL_TYPE:
+  for (int ix = 0; fundamentals[ix]; ix++)
+   if (*fundamentals[ix] == type)
+ return true;
+  return false;
+
 case LANG_TYPE:
   /* fall through.  */
 
 default:
   return false;
 }
 }
 
 /* Generate the initializer for the type info describing TYPE.  TK_INDEX is
the index of the descriptor in the tinfo_desc vector. */
@@ -1505,38 +1531,20 @@ emit_support_tinfo_1 (tree bltn)
 
 /* Emit the type_info descriptors which are guaranteed to be in the runtime
support.  Generating them here guarantees consistency with the other
structures.  We use the following heuristic to determine when the runtime
is being generated.  If std::__fundamental_type_info is defined, and its
destructor is defined, then the runtime is being built.  */
 
 void
 emit_support_tinfos (void)
 {
-  /* Dummy static variable so we can put nullptr in the array; it will be
- set before we actually start to walk the array.  */
-  static tree *const fundamentals[] =
-  {
-void_type_node,
-boolean_type_node,
-wchar_type_node, char16_type_node, char32_type_node,
-char_type_node, signed_char_type_node, unsigned_char_type_node,
-short_integer_type_node, short_unsigned_type_node,
-integer_type_node, unsigned_type_node,
-long_integer_type_node, long_unsigned_type_node,
-long_long_integer_type_node, long_long_unsigned_type_node,
-int128_integer_type_node, int128_unsigned_type_node,
-float_type_node, double_type_node, long_double_type_node,
-dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
-nullptr_type_node,
-0
-  };
   int ix;
   tree bltn_type, dtor;
 
   push_abi_namespace ();
   bltn_type = xref_tag (class_type,
get_identifier (__fundamental_type_info),
/*tag_scope=*/ts_current, false);
   pop_abi_namespace ();
   if (!COMPLETE_TYPE_P (bltn_type))
 return;
Index: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 209787)
+++ gcc/cp/rtti.c   (working copy)
@@ -1059,25 +1059,32 @@ typeinfo_in_lib_p (tree type

Re: emit __float128 typeinfo

2014-04-25 Thread Marc Glisse

On Fri, 25 Apr 2014, Marc Glisse wrote:

the previous patch had to be reverted as it broke the strange handling of 
vectors in the ARM target. This new patch should be much more conservative I 
hope. Instead of adding this typeinfo to libsupc++, I am letting the FE know 
that it isn't available in libsupc++. There are 2 versions, both regtested 
fine.


Does this approach seem ok, or do we need to try harder to find a way to get 
this typeinfo into libsupc++?


2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (emit_support_tinfos): Move the array...
(fundamentals): ... and make it global.
(typeinfo_in_lib_p): Use it.

2014-04-25  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a
hardcoded list of available types.


It seems better with a TYPE_CANONICAL in there. It passed bootstrap and 
the testsuite is running.


--
Marc GlisseIndex: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 209789)
+++ gcc/cp/rtti.c   (working copy)
@@ -1042,42 +1042,69 @@ class_initializer (tinfo_s *ti, tree tar
   for (i = 0; i  n; i++)
 CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, va_arg (extra_inits, tree));
   va_end (extra_inits);
 
   init = build_constructor (init_list_type_node, v);
   TREE_CONSTANT (init) = 1;
   TREE_STATIC (init) = 1;
   return init;
 }
 
+/* List of types for which the typeinfo should be placed in the
+   runtime library.
+   Dummy static variable so we can put nullptr in the array; it will be
+   set before we actually start to walk the array.  */
+static tree *const fundamentals[] =
+{
+  void_type_node,
+  boolean_type_node,
+  wchar_type_node, char16_type_node, char32_type_node,
+  char_type_node, signed_char_type_node, unsigned_char_type_node,
+  short_integer_type_node, short_unsigned_type_node,
+  integer_type_node, unsigned_type_node,
+  long_integer_type_node, long_unsigned_type_node,
+  long_long_integer_type_node, long_long_unsigned_type_node,
+  int128_integer_type_node, int128_unsigned_type_node,
+  float_type_node, double_type_node, long_double_type_node,
+  dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
+  nullptr_type_node,
+  0
+};
+
 /* Returns true if the typeinfo for type should be placed in
the runtime library.  */
 
 static bool
 typeinfo_in_lib_p (tree type)
 {
   /* The typeinfo objects for `T*' and `const T*' are in the runtime
  library for simple types T.  */
   if (TYPE_PTR_P (type)
(cp_type_quals (TREE_TYPE (type)) == TYPE_QUAL_CONST
  || cp_type_quals (TREE_TYPE (type)) == TYPE_UNQUALIFIED))
 type = TREE_TYPE (type);
 
   switch (TREE_CODE (type))
 {
-case INTEGER_TYPE:
 case BOOLEAN_TYPE:
-case REAL_TYPE:
 case VOID_TYPE:
 case NULLPTR_TYPE:
   return true;
 
+case INTEGER_TYPE:
+case REAL_TYPE:
+  type = TYPE_CANONICAL (type);
+  for (int ix = 0; fundamentals[ix]; ix++)
+   if (*fundamentals[ix] == type)
+ return true;
+  return false;
+
 case LANG_TYPE:
   /* fall through.  */
 
 default:
   return false;
 }
 }
 
 /* Generate the initializer for the type info describing TYPE.  TK_INDEX is
the index of the descriptor in the tinfo_desc vector. */
@@ -1505,38 +1532,20 @@ emit_support_tinfo_1 (tree bltn)
 
 /* Emit the type_info descriptors which are guaranteed to be in the runtime
support.  Generating them here guarantees consistency with the other
structures.  We use the following heuristic to determine when the runtime
is being generated.  If std::__fundamental_type_info is defined, and its
destructor is defined, then the runtime is being built.  */
 
 void
 emit_support_tinfos (void)
 {
-  /* Dummy static variable so we can put nullptr in the array; it will be
- set before we actually start to walk the array.  */
-  static tree *const fundamentals[] =
-  {
-void_type_node,
-boolean_type_node,
-wchar_type_node, char16_type_node, char32_type_node,
-char_type_node, signed_char_type_node, unsigned_char_type_node,
-short_integer_type_node, short_unsigned_type_node,
-integer_type_node, unsigned_type_node,
-long_integer_type_node, long_unsigned_type_node,
-long_long_integer_type_node, long_long_unsigned_type_node,
-int128_integer_type_node, int128_unsigned_type_node,
-float_type_node, double_type_node, long_double_type_node,
-dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
-nullptr_type_node,
-0
-  };
   int ix;
   tree bltn_type, dtor;
 
   push_abi_namespace ();
   bltn_type = xref_tag (class_type,
get_identifier (__fundamental_type_info),
/*tag_scope=*/ts_current, false);
   pop_abi_namespace ();
   if (!COMPLETE_TYPE_P (bltn_type))
 return;
Index: gcc/cp/rtti.c

Re: [c++] typeinfo for target types

2014-04-24 Thread Marc Glisse

On Thu, 24 Apr 2014, Ramana Radhakrishnan wrote:


On Wed, Apr 23, 2014 at 8:43 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Wed, 23 Apr 2014, Richard Henderson wrote:


On 04/13/2014 01:41 AM, Marc Glisse wrote:


Hello,

this patch generates typeinfo for target types. On x86_64, it adds these
6
lines to nm -C libsupc++.a. A follow-up patch will be needed to export
and
version those in the shared library.

+ V typeinfo for __float128
+ V typeinfo for __float128 const*
+ V typeinfo for __float128*
+ V typeinfo name for __float128
+ V typeinfo name for __float128 const*
+ V typeinfo name for __float128*

Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in
tsan/tls_race.c).

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/c-family/
* c-common.c (registered_builtin_types): Make non-static.
* c-common.h (registered_builtin_types): Declare.
gcc/cp/
* rtti.c (emit_support_tinfo_1): New function, extracted from
emit_support_tinfos.
(emit_support_tinfos): Call it and iterate on
registered_builtin_types.



This is causing aarch64 builds to break.



If it is causing too much trouble, we could ifdef out the last 2 lines of
emit_support_tinfos and revert the libstdc++ changes (or even revert the
whole thing).



Any c++ compilation aborts at



That's surprising, the code I touched is only ever supposed to run while
compiling one file in libsupc++, if I understand correctly.



#0  fancy_abort (file=0x14195c8 ../../git-rh/gcc/cp/mangle.c, line=2303,
   function=0x1419ff8 write_builtin_type(tree_node*)::__FUNCTION__
   write_builtin_type) at ../../git-rh/gcc/diagnostic.c:1190
#1  0x007ce2b4 in write_builtin_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:2303
#2  0x007cc85c in write_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:1969
#3  0x007d4d98 in mangle_special_for_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df,
   code=0x1419a98 TI) at ../../git-rh/gcc/cp/mangle.c:3569
#4  0x007d4dcc in mangle_typeinfo_for_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:3585
#5  0x0070618c in get_tinfo_decl (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/rtti.c:422
#6  0x00709ff0 in emit_support_tinfo_1 (
   bltn=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/rtti.c:1485
#7  0x0070a344 in emit_support_tinfos ()
   at ../../git-rh/gcc/cp/rtti.c:1550

Presumably the backend needs to grow some mangling support for its
builtins,



aarch64 has complicated builtins... __builtin_aarch64_simd_df uses
double_aarch64_type_node which is not the same as double_type_node. I mostly
looked at the x86 backend, so I didn't notice that aarch64 registers a lot
more builtins.



but in the meantime can we do something less drastic than abort?



Sounds good, but I am not sure how exactly. We could use a separate hook
(register_builtin_type_for_typeinfo?) so back-ends have to explicitly say
they want typeinfo, but it is ugly having to register types multiple times.
We could add a parameter to the existing register_builtin_type saying
whether we want typeinfo, but that means updating all back-ends.


We could also make typeinfo_in_lib_p more strict so for REAL_TYPE it only 
returns true for the types listed in fundamentals.



Well some of these scalar types are not really user visible which is
where I believe the problem is coming from and prima-facie I don't
think we should be inventing mangling for some of these internal
types.


If the types are not user-visible, it is not clear to me why they need to 
be registered with the front-end...



 We could
get the mangling functions to take a parameter that says whether errors
should be fatal and skip generating the typeinfo when we can't mangle, but
there is no convenient way to communicate this mangling failure (0 bytes
written?).

Would mangling the aarch64 builtins be a lot of work? Did other platforms
break as well?


It's not a lot of work but I'd like to make sure we're doing the right
thing on both AArch32 and AArch64. So, for now can we just revert this
till the thing is sorted out.


Ok, I'll commit the attached as soon as I've checked it isn't too broken. 
It is not a complete revert: splitting the rtti function is still cleaner, 
and the int128 symbols are still there.


2014-04-24  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/cp/
* rtti.c (emit_support_tinfos): Do not iterate on
registered_builtin_types (partial revert).
libstdc++/
* config/abi/pre/gnu.ver (CXXABI_1.3.9): Remove __float128 symbols.
* config/abi/pre/gnu-versioned-namespace.ver: Likewise.
* config/abi/post/x86_64

Re: [c++] typeinfo for target types

2014-04-24 Thread Marc Glisse

On Thu, 24 Apr 2014, Ramana Radhakrishnan wrote:


Well some of these scalar types are not really user visible which is
where I believe the problem is coming from and prima-facie I don't
think we should be inventing mangling for some of these internal
types.



If the types are not user-visible, it is not clear to me why they need to be
registered with the front-end...


The vector types that are built on this are user visible, so I suspect
that's why the scalar types need to be registered with the front-end.

A lot of this comes from the original support for the intrinsics way
that goes quite some time back so there is some digging needed here.


Problematic part of the patch was reverted, arm and aarch64 should be ok
now. Please do investigate...

--
Marc Glisse


Re: version typeinfo for 128bit types

2014-04-24 Thread Marc Glisse

On Thu, 24 Apr 2014, Rainer Orth wrote:


Marc Glisse marc.gli...@inria.fr writes:


this is a follow-up for this patch:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html

once committed, g++ will generate typeinfo for __float128, and it needs
versioning. While there, I noticed that __int128 has typeinfo but not
typeinfo name, so I am adding it. I manually checked that the new symbols
were exactly the 12 I expected, with the new version number.

I did not test the gnu-versioned-namespace version.

I manually updated baseline for x86_64. It is awfully inconvenient to do. I
was expecting make new-abi-baseline to generate it for me, but it gives
me plenty of extra symbols compared to the current one. Some random
examples:


It shouldn't be necessary to update all baselines whenever you add a new
version to libstdc++.so.6.  It seems to me that when you added
CXXABI_1.3.9, you forgot to update
libstdc++-v3/testsuite/util/testsuite_abi.cc for that.


I had no idea this file even existed, thanks for the pointer! It makes so 
much more sense this way :-)


Grep seems to indicate that the manual is the only other place that needs 
updating, but that can wait.


Is this patch ok, assuming the tests pass?

2014-04-24  Marc Glisse  marc.gli...@inria.fr

* testsuite/util/testsuite_abi.cc (check_version): Update for
CXXABI_1.3.9.

--
Marc GlisseIndex: libstdc++-v3/testsuite/util/testsuite_abi.cc
===
--- libstdc++-v3/testsuite/util/testsuite_abi.cc(revision 209755)
+++ libstdc++-v3/testsuite/util/testsuite_abi.cc(working copy)
@@ -203,38 +203,39 @@ check_version(symbol test, bool added)
   known_versions.push_back(CXXABI_1.3);
   known_versions.push_back(CXXABI_LDBL_1.3);
   known_versions.push_back(CXXABI_1.3.1);
   known_versions.push_back(CXXABI_1.3.2);
   known_versions.push_back(CXXABI_1.3.3);
   known_versions.push_back(CXXABI_1.3.4);
   known_versions.push_back(CXXABI_1.3.5);
   known_versions.push_back(CXXABI_1.3.6);
   known_versions.push_back(CXXABI_1.3.7);
   known_versions.push_back(CXXABI_1.3.8);
+  known_versions.push_back(CXXABI_1.3.9);
   known_versions.push_back(CXXABI_TM_1);
 }
   compat_list::iterator begin = known_versions.begin();
   compat_list::iterator end = known_versions.end();
 
   // Check for compatible version.
   if (test.version_name.size())
 {
   compat_list::iterator it1 = find(begin, end, test.version_name);
   compat_list::iterator it2 = find(begin, end, test.name);
   if (it1 != end)
test.version_status = symbol::compatible;
   else
test.version_status = symbol::incompatible;
 
   // Check that added symbols are added in the latest pre-release version.
   bool latestp = (test.version_name == GLIBCXX_3.4.20
-|| test.version_name == CXXABI_1.3.8
+|| test.version_name == CXXABI_1.3.9
 || test.version_name == CXXABI_TM_1);
   if (added  !latestp)
test.version_status = symbol::incompatible;
 
   // Check that long double compatibility symbols demangled as
   // __float128 are put into some _LDBL_ version name.
   if (added  test.demangled_name.find(__float128) != std::string::npos)
{
  // Has to be in _LDBL_ version name.
  if (test.version_name.find(_LDBL_) == std::string::npos)


Re: [C PATCH] Improve error on attrs after declarator in a fndef (PR c/60915)

2014-04-24 Thread Marc Glisse

On Thu, 24 Apr 2014, Marek Polacek wrote:


This PR is about not very clear error message when one tries to
add attributes *after* the declarator in a function definition.
cc1plus already handles this well, so I used the same message.


I thought you had an earlier version of the patch where, instead of 
forbidden, you said move the attribute so users would know that 
writing the attribute *before* was fine? That sounded much better to me.


--
Marc Glisse


Re: Optimize n?rotate(x,n):x

2014-04-23 Thread Marc Glisse

Honza, any comment on Richard's question?

On Tue, 15 Apr 2014, Richard Biener wrote:


On Mon, Apr 14, 2014 at 6:40 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 14 Apr 2014, Richard Biener wrote:


+  /* If the special case has a high probability, keep it.  */
+  if (EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN)



I suppose Honza has a comment on how to test this properly
(not sure if -probability or -frequency is always initialized properly).
for example single_likely_edge tests profile_status_for_fn !=
PROFILE_ABSENT (and uses a fixed probability value ...).
Anyway, the comparison looks backwards to me, but maybe I'm
missing sth - I'd use = PROB_LIKELY ;)



Maybe the comment is confusing? middle_bb contains the expensive operation
(say a/b) that the special case skips entirely. If the division happens in
less than 50% of cases (that's the proba of the edge going from cond to
middle_bb), then doing the comparison+jump may be cheaper and I abort the
optimization. At least the testcase with __builtin_expect should prove that
I didn't do it backwards.


Ah, indeed.  My mistake.


value-prof seems to use 50% as the cut-off where it may become interesting
to special case division, hence my choice of PROB_EVEN. I am not sure which
way you want to use PROB_LIKELY (80%). If we have more than 80% chances of
executing the division, always perform it? Or if we have more than 80%
chances of skipping the division, keep the branch?


Ok, if it's from value-prof then that's fine.

The patch is ok if Honza doesn't have any comments on whether it's ok
to look at -probability unconditionally.

Thanks,
Richard.


Attached is the latest version (passed the testsuite).
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options -O -fdump-tree-phiopt1 } */
+
+int f(int a, int b, int c) {
+  if (c  5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x  n) | (x  (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+return 0;
+  else
+return a  b;
+}
+
+/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */
+/* { dg-final { cleanup-tree-dump phiopt1 } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+int f(int a, int b) {
+  if (__builtin_expect(a == 0, 1)) return b;
+  return a + b;
+}
+
+// optab_handler can handle if(b==1) but not a/b
+// so we consider a/b too expensive.
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) {
+  if (b == 1)
+return a;
+  else
+return a / b;
+}
+
+/* { dg-final { scan-tree-dump-times goto  4 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 209353)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -140,20 +140,37 @@ static bool gate_hoist_loads (void);
x = PHI (CONST, a)

Gets replaced with:
  bb0:
  bb2:
t1 = a == CONST;
t2 = b  c;
t3 = t1  t2;
x = a;

+
+   It also replaces
+
+ bb0:
+   if (a != 0) goto bb1; else goto bb2;
+ bb1:
+   c = a + b;
+ bb2:
+   x = PHI c (bb1), b (bb0), ...;
+
+   with
+
+ bb0:
+   c = a + b;
+ bb2:
+   x = PHI c (bb0), ...;
+
ABS Replacement
---

This transformation, implemented in abs_replacement, replaces

  bb0:
if (a = 0) goto bb2; else goto bb1;
  bb1:
x = -a;
  bb2:
@@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;

   tmp = gimple_assign_rhs2 (def);
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;

   return false;
 }

+/* Returns true if ARG is a neutral element for operation CODE
+   on the RIGHT side.  */
+
+static bool
+neutral_element_p (tree_code code, tree arg, bool right)
+{
+  switch (code)
+{
+case PLUS_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  return integer_zerop (arg);
+
+case LROTATE_EXPR:
+case RROTATE_EXPR:
+case LSHIFT_EXPR:
+case RSHIFT_EXPR:
+case MINUS_EXPR:
+case POINTER_PLUS_EXPR:
+  return right  integer_zerop (arg);
+
+case MULT_EXPR:
+  return integer_onep (arg);
+
+case TRUNC_DIV_EXPR:
+case CEIL_DIV_EXPR

Re: [i386] define __SIZEOF_FLOAT128__

2014-04-23 Thread Marc Glisse

(Adding an i386 maintainer in Cc)
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html

On Sun, 13 Apr 2014, Marc Glisse wrote:


Hello,

some people like having a macro to test if a type is available 
(__SIZEOF_INT128__ for instance). This adds macros for __float80 and 
__float128. The types seem to be always available, so I didn't add any 
condition.


If you think this is a bad idea, please close the PR.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR preprocessor/56540
* config/i386/i386-c.c (ix86_target_macros): Define
__SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.


--
Marc Glisse


Re: [c++] typeinfo for target types

2014-04-23 Thread Marc Glisse

On Wed, 23 Apr 2014, Richard Henderson wrote:


On 04/13/2014 01:41 AM, Marc Glisse wrote:

Hello,

this patch generates typeinfo for target types. On x86_64, it adds these 6
lines to nm -C libsupc++.a. A follow-up patch will be needed to export and
version those in the shared library.

+ V typeinfo for __float128
+ V typeinfo for __float128 const*
+ V typeinfo for __float128*
+ V typeinfo name for __float128
+ V typeinfo name for __float128 const*
+ V typeinfo name for __float128*

Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c).

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/c-family/
* c-common.c (registered_builtin_types): Make non-static.
* c-common.h (registered_builtin_types): Declare.
gcc/cp/
* rtti.c (emit_support_tinfo_1): New function, extracted from
emit_support_tinfos.
(emit_support_tinfos): Call it and iterate on registered_builtin_types.



This is causing aarch64 builds to break.


If it is causing too much trouble, we could ifdef out the last 2 lines of 
emit_support_tinfos and revert the libstdc++ changes (or even revert the 
whole thing).



Any c++ compilation aborts at


That's surprising, the code I touched is only ever supposed to run while 
compiling one file in libsupc++, if I understand correctly.



#0  fancy_abort (file=0x14195c8 ../../git-rh/gcc/cp/mangle.c, line=2303,
   function=0x1419ff8 write_builtin_type(tree_node*)::__FUNCTION__
   write_builtin_type) at ../../git-rh/gcc/diagnostic.c:1190
#1  0x007ce2b4 in write_builtin_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:2303
#2  0x007cc85c in write_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:1969
#3  0x007d4d98 in mangle_special_for_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df,
   code=0x1419a98 TI) at ../../git-rh/gcc/cp/mangle.c:3569
#4  0x007d4dcc in mangle_typeinfo_for_type (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/mangle.c:3585
#5  0x0070618c in get_tinfo_decl (
   type=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/rtti.c:422
#6  0x00709ff0 in emit_support_tinfo_1 (
   bltn=real_type 0x7fb1653540 __builtin_aarch64_simd_df)
   at ../../git-rh/gcc/cp/rtti.c:1485
#7  0x0070a344 in emit_support_tinfos ()
   at ../../git-rh/gcc/cp/rtti.c:1550

Presumably the backend needs to grow some mangling support for its builtins,


aarch64 has complicated builtins... __builtin_aarch64_simd_df uses 
double_aarch64_type_node which is not the same as double_type_node. I 
mostly looked at the x86 backend, so I didn't notice that aarch64 
registers a lot more builtins.



but in the meantime can we do something less drastic than abort?


Sounds good, but I am not sure how exactly. We could use a separate hook 
(register_builtin_type_for_typeinfo?) so back-ends have to explicitly say 
they want typeinfo, but it is ugly having to register types multiple 
times. We could add a parameter to the existing register_builtin_type 
saying whether we want typeinfo, but that means updating all back-ends. We 
could get the mangling functions to take a parameter that says whether 
errors should be fatal and skip generating the typeinfo when we can't 
mangle, but there is no convenient way to communicate this mangling 
failure (0 bytes written?).


Would mangling the aarch64 builtins be a lot of work? Did other platforms 
break as well?


Isn't this only really an issue if someone tries to access one of these 
types via typeinfo?


Yes.

--
Marc Glisse


Re: Optimize n?rotate(x,n):x

2014-04-23 Thread Marc Glisse

On Wed, 23 Apr 2014, Jan Hubicka wrote:


+  /* Now optimize (x != 0) ? x + y : y to just y.
+ The following condition is too restrictive, there can easily be
another
+ stmt in middle_bb, for instance a CONVERT_EXPR for the second
argument.  */
+  gimple assign = last_and_only_stmt (middle_bb);
+  if (!assign || gimple_code (assign) != GIMPLE_ASSIGN
+  || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
+  || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+  !POINTER_TYPE_P (TREE_TYPE (arg0
+return 0;
+
+  /* assign may call a libgcc routine, which is slow.  */
+  if (!is_cheap_stmt (assign)  is_cheap_stmt (cond))
+return 0;
+
+  /* If the special case has a high probability, keep it.  */
+  if (EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN)
+return 0;


Well, I would expect this transformation to be win always, + operation
is virtually for free.

Concerning profile - you must consider two cases.  If the profile is guessed
then the probability is most probably 71% to the non-zero case unless user
used an expect (since I would expect only PRED_OPCODE_NONEQUAL to match).
This is data dependent branch unless this sits in a loop and those are badly
predicted statically.

If the probability is read, then probabilities over (say) 10% and less than
90% means that the branch is likely not very well predictable (it still may
be on advanced architectures if it is predictable from context) and getting
rid of it is a good idea.

So if you want to disable the optimization for x being likely zero, I guess
testing that probability is over PROV_LIKELY is a resonable threshold - it
will handle both builtin_expect and the (very) badly predictable branches.
Maybe for FDO it should be higher, but we would need to do some research on
it that is probably not worth the effort.

The division transformation is bit different story, since cost of division
is more than cost of branch and the 50% threshold is a limit for one value
counter to be reliable.


Thank you for the comments. If I understand correctly:

- it is always ok to look at edge-probability (no need to check that the
  probabilities are available as Richard feared)
- PROB_EVEN is an ok threshold for division (not sure I understood your last
  sentence)
- for cheaper operations like addition, I should be less conservative and do
  the transformation always, or use a threshold of PROB_UNLIKELY.

Are there other operations than division (among those listed in
neutral_element_p or absorbing_element_p) that fall in the expensive
category? I guess there are some platforms where multiplication is
expensive, but few, and querying the target for the cost of operations
seems exagerated. So I would go with:

[move definition of code_def]
if ((code_def == TRUNC_DIV_EXPR || code_def == CEIL_DIV_EXPR
 || code_def == FLOOR_DIV_EXPR || code_def == ROUND_DIV_EXPR
 || code_def == EXACT_DIV_EXPR)
 EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN)
  return 0;

(and change the testsuite example with __builtin_expect to use division)

or (my favorite):

[move definition of code_def]
int threshold = PROB_UNLIKELY;
if (code_def == TRUNC_DIV_EXPR || code_def == CEIL_DIV_EXPR
|| code_def == FLOOR_DIV_EXPR || code_def == ROUND_DIV_EXPR
|| code_def == EXACT_DIV_EXPR)
  threshold = PROB_EVEN;
if (EDGE_PRED (middle_bb, 0)-probability  threshold)
  return 0;


Is that ok, after re-testing?

--
Marc Glisse


Re: [i386] define __SIZEOF_FLOAT128__

2014-04-23 Thread Marc Glisse

On Wed, 23 Apr 2014, H.J. Lu wrote:


On Wed, Apr 23, 2014 at 11:48 AM, Marc Glisse marc.gli...@inria.fr wrote:

(Adding an i386 maintainer in Cc)
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html


On Sun, 13 Apr 2014, Marc Glisse wrote:


Hello,

some people like having a macro to test if a type is available
(__SIZEOF_INT128__ for instance). This adds macros for __float80 and
__float128. The types seem to be always available, so I didn't add any
condition.

If you think this is a bad idea, please close the PR.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR preprocessor/56540
* config/i386/i386-c.c (ix86_target_macros): Define
__SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.




For __SIZEOF_FLOAT80__, you should check TARGET_128BIT_LONG_DOUBLE
instead of TARGET_64BIT.


Good point, thanks! It now matches i386-modes.def. Is this version (same 
changelog) ok?


--
Marc GlisseIndex: gcc/config/i386/i386-c.c
===
--- gcc/config/i386/i386-c.c(revision 209721)
+++ gcc/config/i386/i386-c.c(working copy)
@@ -511,20 +511,27 @@ ix86_target_macros (void)
 
   if (!TARGET_80387)
 cpp_define (parse_in, _SOFT_FLOAT);
 
   if (TARGET_LONG_DOUBLE_64)
 cpp_define (parse_in, __LONG_DOUBLE_64__);
 
   if (TARGET_LONG_DOUBLE_128)
 cpp_define (parse_in, __LONG_DOUBLE_128__);
 
+  if (TARGET_128BIT_LONG_DOUBLE)
+cpp_define (parse_in, __SIZEOF_FLOAT80__=16);
+  else
+cpp_define (parse_in, __SIZEOF_FLOAT80__=12);
+
+  cpp_define (parse_in, __SIZEOF_FLOAT128__=16);
+
   cpp_define_formatted (parse_in, __ATOMIC_HLE_ACQUIRE=%d, IX86_HLE_ACQUIRE);
   cpp_define_formatted (parse_in, __ATOMIC_HLE_RELEASE=%d, IX86_HLE_RELEASE);
 
   ix86_target_macros_internal (ix86_isa_flags,
   ix86_arch,
   ix86_tune,
   ix86_fpmath,
   cpp_define);
 }
 


Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse

2014-04-22 Thread Marc Glisse

On Mon, 21 Apr 2014, Richard Henderson wrote:


On 04/21/2014 01:19 PM, Bill Schmidt wrote:

+  if (GET_CODE (trueop0) == VEC_SELECT
+  GET_MODE (XEXP (trueop0, 0)) == mode)
+   {
+ rtx op0_subop1 = XEXP (trueop0, 1);
+ gcc_assert (GET_CODE (op0_subop1) == PARALLEL);
+ gcc_assert (XVECLEN (trueop1, 0) == GET_MODE_NUNITS (mode));
+
+ /* Apply the outer ordering vector to the inner one.  (The inner
+ordering vector is expressly permitted to be of a different
+length than the outer one.)  If the result is { 0, 1, ..., n-1 }
+then the two VEC_SELECTs cancel.  */
+ for (int i = 0; i  XVECLEN (trueop1, 0); ++i)
+   {
+ rtx x = XVECEXP (trueop1, 0, i);
+ gcc_assert (CONST_INT_P (x));
+ rtx y = XVECEXP (op0_subop1, 0, INTVAL (x));
+ gcc_assert (CONST_INT_P (y));


In two places you're asserting that you've got a constant permutation.  Surely
there should be a non-assertion check and graceful exit for either select to be
a variable permutation.


Note that in the case where trueop0 is a CONST_VECTOR, we already check 
each element of trueop1:


  gcc_assert (CONST_INT_P (x));

In the case where the result is a scalar, we also have:

  gcc_assert (CONST_INT_P (XVECEXP (trueop1, 0, 0)));

so we will have other issues if something ever creates a variable 
vec_select. Not that a graceful exit will hurt of course.


--
Marc Glisse


Re: version typeinfo for 128bit types

2014-04-22 Thread Marc Glisse

On Tue, 22 Apr 2014, Jonathan Wakely wrote:


On 14 April 2014 11:28, Jonathan Wakely wrote:

On 14 April 2014 10:39, Marc Glisse wrote:


PR libstdc++/43622
* config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols.
* config/abi/pre/gnu-versioned-namespace.ver: New symbols.
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise.


I'd like to wait until 4.9.0 has been released before adding new
versions to the library, to avoid unnecessary differences between
trunk and the 4.9.0 release candidate(s).


I was about to say the patch is OK for trunk now, but noticed that the
new CXXABI_1.3.9 block is added after the CXXABI_TM one, not after
CXXABI_1.3.8 ... was that intentional?


I don't have any opinion, I'll move it just after 1.3.8 if that seems more 
logical to you.


--
Marc Glisse


Re: version typeinfo for 128bit types

2014-04-22 Thread Marc Glisse

Hello,

as written in the PR, my patch seems wrong for platforms like powerpc that 
already had the __float128 typeinfo for long double with a different 
version. The following patch regtested fine on x86_64, and a hackish 
cross-build shows that float128.ver is ignored on powerpc (good).


2014-04-23  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* config/abi/pre/float128.ver: New file.
* config/abi/pre/gnu.ver (CXXABI_1.3.9): Move __float128 typeinfo to
the new file.
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* configure.ac: Use float128.ver when relevant.
* configure: Regenerate.

--
Marc GlisseIndex: libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
===
--- libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt  
(revision 209658)
+++ libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt  
(working copy)
@@ -2514,20 +2514,21 @@ FUNC:atomic_flag_test_and_set_explicit@@
 OBJECT:0:CXXABI_1.3
 OBJECT:0:CXXABI_1.3.1
 OBJECT:0:CXXABI_1.3.2
 OBJECT:0:CXXABI_1.3.3
 OBJECT:0:CXXABI_1.3.4
 OBJECT:0:CXXABI_1.3.5
 OBJECT:0:CXXABI_1.3.6
 OBJECT:0:CXXABI_1.3.7
 OBJECT:0:CXXABI_1.3.8
 OBJECT:0:CXXABI_1.3.9
+OBJECT:0:CXXABI_FLOAT128_1.3.9
 OBJECT:0:CXXABI_TM_1
 OBJECT:0:GLIBCXX_3.4
 OBJECT:0:GLIBCXX_3.4.1
 OBJECT:0:GLIBCXX_3.4.10
 OBJECT:0:GLIBCXX_3.4.11
 OBJECT:0:GLIBCXX_3.4.12
 OBJECT:0:GLIBCXX_3.4.13
 OBJECT:0:GLIBCXX_3.4.14
 OBJECT:0:GLIBCXX_3.4.15
 OBJECT:0:GLIBCXX_3.4.16
@@ -2618,21 +2619,21 @@ OBJECT:16:_ZTISt16nested_exception@@CXXA
 OBJECT:16:_ZTISt8ios_base@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9exception@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9time_base@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9type_info@@GLIBCXX_3.4
 OBJECT:16:_ZTIa@@CXXABI_1.3
 OBJECT:16:_ZTIb@@CXXABI_1.3
 OBJECT:16:_ZTIc@@CXXABI_1.3
 OBJECT:16:_ZTId@@CXXABI_1.3
 OBJECT:16:_ZTIe@@CXXABI_1.3
 OBJECT:16:_ZTIf@@CXXABI_1.3
-OBJECT:16:_ZTIg@@CXXABI_1.3.9
+OBJECT:16:_ZTIg@@CXXABI_FLOAT128_1.3.9
 OBJECT:16:_ZTIh@@CXXABI_1.3
 OBJECT:16:_ZTIi@@CXXABI_1.3
 OBJECT:16:_ZTIj@@CXXABI_1.3
 OBJECT:16:_ZTIl@@CXXABI_1.3
 OBJECT:16:_ZTIm@@CXXABI_1.3
 OBJECT:16:_ZTIn@@CXXABI_1.3.5
 OBJECT:16:_ZTIo@@CXXABI_1.3.5
 OBJECT:16:_ZTIs@@CXXABI_1.3
 OBJECT:16:_ZTIt@@CXXABI_1.3
 OBJECT:16:_ZTIv@@CXXABI_1.3
@@ -3119,21 +3120,21 @@ OBJECT:2:_ZNSt10ctype_base5printE@@GLIBC
 OBJECT:2:_ZNSt10ctype_base5punctE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base5spaceE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base5upperE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base6xdigitE@@GLIBCXX_3.4
 OBJECT:2:_ZTSa@@CXXABI_1.3
 OBJECT:2:_ZTSb@@CXXABI_1.3
 OBJECT:2:_ZTSc@@CXXABI_1.3
 OBJECT:2:_ZTSd@@CXXABI_1.3
 OBJECT:2:_ZTSe@@CXXABI_1.3
 OBJECT:2:_ZTSf@@CXXABI_1.3
-OBJECT:2:_ZTSg@@CXXABI_1.3.9
+OBJECT:2:_ZTSg@@CXXABI_FLOAT128_1.3.9
 OBJECT:2:_ZTSh@@CXXABI_1.3
 OBJECT:2:_ZTSi@@CXXABI_1.3
 OBJECT:2:_ZTSj@@CXXABI_1.3
 OBJECT:2:_ZTSl@@CXXABI_1.3
 OBJECT:2:_ZTSm@@CXXABI_1.3
 OBJECT:2:_ZTSn@@CXXABI_1.3.9
 OBJECT:2:_ZTSo@@CXXABI_1.3.9
 OBJECT:2:_ZTSs@@CXXABI_1.3
 OBJECT:2:_ZTSt@@CXXABI_1.3
 OBJECT:2:_ZTSv@@CXXABI_1.3
@@ -3153,41 +3154,41 @@ OBJECT:32:_ZTIPKDe@@CXXABI_1.3.4
 OBJECT:32:_ZTIPKDf@@CXXABI_1.3.4
 OBJECT:32:_ZTIPKDi@@CXXABI_1.3.3
 OBJECT:32:_ZTIPKDn@@CXXABI_1.3.5
 OBJECT:32:_ZTIPKDs@@CXXABI_1.3.3
 OBJECT:32:_ZTIPKa@@CXXABI_1.3
 OBJECT:32:_ZTIPKb@@CXXABI_1.3
 OBJECT:32:_ZTIPKc@@CXXABI_1.3
 OBJECT:32:_ZTIPKd@@CXXABI_1.3
 OBJECT:32:_ZTIPKe@@CXXABI_1.3
 OBJECT:32:_ZTIPKf@@CXXABI_1.3
-OBJECT:32:_ZTIPKg@@CXXABI_1.3.9
+OBJECT:32:_ZTIPKg@@CXXABI_FLOAT128_1.3.9
 OBJECT:32:_ZTIPKh@@CXXABI_1.3
 OBJECT:32:_ZTIPKi@@CXXABI_1.3
 OBJECT:32:_ZTIPKj@@CXXABI_1.3
 OBJECT:32:_ZTIPKl@@CXXABI_1.3
 OBJECT:32:_ZTIPKm@@CXXABI_1.3
 OBJECT:32:_ZTIPKn@@CXXABI_1.3.5
 OBJECT:32:_ZTIPKo@@CXXABI_1.3.5
 OBJECT:32:_ZTIPKs@@CXXABI_1.3
 OBJECT:32:_ZTIPKt@@CXXABI_1.3
 OBJECT:32:_ZTIPKv@@CXXABI_1.3
 OBJECT:32:_ZTIPKw@@CXXABI_1.3
 OBJECT:32:_ZTIPKx@@CXXABI_1.3
 OBJECT:32:_ZTIPKy@@CXXABI_1.3
 OBJECT:32:_ZTIPa@@CXXABI_1.3
 OBJECT:32:_ZTIPb@@CXXABI_1.3
 OBJECT:32:_ZTIPc@@CXXABI_1.3
 OBJECT:32:_ZTIPd@@CXXABI_1.3
 OBJECT:32:_ZTIPe@@CXXABI_1.3
 OBJECT:32:_ZTIPf@@CXXABI_1.3
-OBJECT:32:_ZTIPg@@CXXABI_1.3.9
+OBJECT:32:_ZTIPg@@CXXABI_FLOAT128_1.3.9
 OBJECT:32:_ZTIPh@@CXXABI_1.3
 OBJECT:32:_ZTIPi@@CXXABI_1.3
 OBJECT:32:_ZTIPj@@CXXABI_1.3
 OBJECT:32:_ZTIPl@@CXXABI_1.3
 OBJECT:32:_ZTIPm@@CXXABI_1.3
 OBJECT:32:_ZTIPn@@CXXABI_1.3.5
 OBJECT:32:_ZTIPo@@CXXABI_1.3.5
 OBJECT:32:_ZTIPs@@CXXABI_1.3
 OBJECT:32:_ZTIPt@@CXXABI_1.3
 OBJECT:32:_ZTIPv@@CXXABI_1.3
@@ -3228,21 +3229,21 @@ OBJECT:39:_ZTSSt13basic_filebufIwSt11cha
 OBJECT:39:_ZTSSt13basic_fstreamIcSt11char_traitsIcEE@@GLIBCXX_3.4
 OBJECT:39:_ZTSSt13basic_fstreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
 OBJECT:39:_ZTSSt13basic_istreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
 OBJECT:39:_ZTSSt13basic_ostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
 OBJECT:3:_ZTSPa@@CXXABI_1.3
 OBJECT:3:_ZTSPb@@CXXABI_1.3
 OBJECT:3:_ZTSPc@@CXXABI_1.3
 OBJECT:3:_ZTSPd@@CXXABI_1.3

Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse

2014-04-21 Thread Marc Glisse

On Mon, 21 Apr 2014, Bill Schmidt wrote:


Note that it would be possible to do a more general transformation here,
in which any vec_select feeding another could be replaced by a
vec_select performing the composite function of the other two.  I have
not done this because I am unaware of this situation arising in
practice.  If it's desirable, I can extend the patch in this direction.


It does arise, but I think it isn't done because not all permutations are 
(optimally) supported by all targets.



Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 209516)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3673,6 +3673,34 @@ simplify_binary_operation_1 (enum rtx_code code, e
}
}

+  /* If we have two nested selects that are inverses of each
+other, replace them with the source operand.  */
+  if (GET_CODE (trueop0) == VEC_SELECT)
+   {
+ enum machine_mode reg_mode = GET_MODE (XEXP (trueop0, 0));
+ rtx op0_subop1 = XEXP (trueop0, 1);
+ gcc_assert (VECTOR_MODE_P (reg_mode));
+ gcc_assert (GET_MODE_INNER (mode) == GET_MODE_INNER (reg_mode));
+ gcc_assert (GET_CODE (op0_subop1) == PARALLEL);
+
+ if (XVECLEN (trueop1, 0) == XVECLEN (op0_subop1, 0))
+   {
+ /* Apply the second ordering vector to the first.
+If the result is { 0, 1, ..., n-1 } then the
+two VEC_SELECTs cancel.  */
+ for (int i = 0; i  XVECLEN (trueop1, 0); ++i)
+   {
+ rtx x = XVECEXP (trueop1, 0, i);
+ gcc_assert (CONST_INT_P (x));
+ rtx y = XVECEXP (op0_subop1, 0, INTVAL (x));
+ gcc_assert (CONST_INT_P (y));
+ if (i != INTVAL (y))
+   return 0;
+   }
+ return XEXP (trueop0, 0);
+   }
+   }


I may have missed it, but don't you want to check that what you are 
returning has the right mode/length (or generate the obvious vec_select 
otherwise)? I don't know if any platform has such constructions (probably 
not), but in principle you could start from a vector of size 4, extract 
{1,0} from it, extract {1,0} from that, and you don't want to return the 
initial vector as is. On the other hand, I don't think you really care 
whether trueop1 is smaller than op0_subop1. Starting from a vector of size 
2, extracting {1,0,1,0} then {3,0} gives the initial vector just fine.


--
Marc Glisse


Re: [build] PR 43538: Don't overwrite CXXFLAGS_FOR_TARGET in config/mt-gnu

2014-04-18 Thread Marc Glisse

Ping
http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01480.html

On Thu, 23 Jan 2014, Marc Glisse wrote:


Hello,

although setting CFLAGS_FOR_TARGET before compiling gcc works fine, 
CXXFLAGS_FOR_TARGET is ignored. I don't see any good reason for that.


I tested the patch by doing a regular bootstrap+testsuite on 
x86_64-unknown-linux-gnu. I also did a non-bootstrap build where I set 
CXXFLAGS_FOR_TARGET and checked that it now propagates to libstdc++ and 
others.


config/ChangeLog:

2014-01-23  Marc Glisse  marc.gli...@inria.fr

PR target/43538
* mt-gnu: Don't reset CXXFLAGS_FOR_TARGET.


--
Marc GlisseIndex: config/mt-gnu
===
--- config/mt-gnu   (revision 209514)
+++ config/mt-gnu   (working copy)
@@ -1 +1 @@
-CXXFLAGS_FOR_TARGET = $(CXXFLAGS) -D_GNU_SOURCE
+CXXFLAGS_FOR_TARGET += -D_GNU_SOURCE


Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-18 Thread Marc Glisse

On Fri, 18 Apr 2014, Steven Bosscher wrote:


On Fri, Apr 18, 2014 at 7:30 AM, Marek Polacek wrote:

+ if (TREE_CODE (type) == BOOLEAN_TYPE
+ || exp_code == TRUTH_ANDIF_EXPR
+ || exp_code == TRUTH_AND_EXPR
+ || exp_code == TRUTH_ORIF_EXPR
+ || exp_code == TRUTH_OR_EXPR
+ || exp_code == TRUTH_XOR_EXPR
+ || exp_code == TRUTH_NOT_EXPR
+ || exp_code == EQ_EXPR
+ || exp_code == NE_EXPR
+ || exp_code == LE_EXPR
+ || exp_code == GE_EXPR
+ || exp_code == LT_EXPR
+ || exp_code == GT_EXPR)


Is there a TREE_CODE_CLASS or a #define for this?


truth_value_p

--
Marc Glisse


Re: calloc = malloc + memset

2014-04-18 Thread Marc Glisse
., then there is no
zero termination.  Sure, calling strlen or strchr etc. on the result
is undefined behavior, just wondering if it isn't going to break anything.
Though, the result is zero length and thus any dereferencing would be a bug,
so perhaps we are fine.


You know the pass better than I do... I don't think the pass does anything 
like: all 0-length strings are equivalent, let's replace one by another 
one, so I think we should be ok, but I don't mind treating calloc like 
malloc here if you think it is too risky, I don't think people call 
strlen(calloc(a,b)) very often, and it would let me remove the don't use 
si-length comment.



Updated and tested on x86_64-linux-gnu.

2014-04-18  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (get_string_length): Ignore malloc.
(handle_builtin_malloc, handle_builtin_memset): New functions.
(strlen_optimize_stmt): Call them.
* passes.def: Move strlen after loop+dom.
gcc/testsuite/
* g++.dg/tree-ssa/calloc.C: New testcase.
* gcc.dg/tree-ssa/calloc-1.c: Likewise.
* gcc.dg/tree-ssa/calloc-2.c: Likewise.
* gcc.dg/strlenopt-9.c: Adapt.

--
Marc GlisseIndex: gcc/passes.def
===
--- gcc/passes.def  (revision 209516)
+++ gcc/passes.def  (working copy)
@@ -176,21 +176,20 @@ along with GCC; see the file COPYING3.
 DOM and erroneous path isolation should be due to degenerate PHI nodes.
 So rather than run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
-  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_ccp);
   /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_cse_sincos);
   NEXT_PASS (pass_optimize_bswap);
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
   NEXT_PASS (pass_asan);
@@ -235,20 +234,21 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_dominator);
   /* The only const/copy propagation opportunities left after
 DOM should be due to degenerate PHI nodes.  So rather than
 run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
+  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_vrp);
   NEXT_PASS (pass_cd_dce);
   NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
   NEXT_PASS (pass_fold_builtins);
   NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tail_calls);
   NEXT_PASS (pass_rename_ssa_copies);
Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C
===
--- gcc/testsuite/g++.dg/tree-ssa/calloc.C  (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/calloc.C  (working copy)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -fdump-tree-optimized } */
+
+typedef __SIZE_TYPE__ size_t;
+inline void* operator new(size_t, void* p) throw() { return p; }
+
+typedef void (*handler_t)(void);
+extern handler_t get_handle();
+
+inline void* operator new(size_t sz)
+{
+  void *p;
+
+  if (sz == 0)
+sz = 1;
+
+  while ((p = __builtin_malloc (sz)) == 0)
+{
+  handler_t handler = get_handle ();
+  if (! handler)
+throw 42;
+  handler ();
+}
+  return p;
+}
+
+struct vect {
+  int *start, *end;
+  vect(size_t n) {
+start = end = 0;
+if (n  (size_t)-1 / sizeof(int))
+  throw 33;
+if (n != 0)
+  start = static_castint* (operator new (n * sizeof(int)));
+end = start + n;
+int *p = start;
+for (size_t l = n; l  0; --l, ++p)
+  *p = 0;
+  }
+};
+
+void f (void *p, int n)
+{
+  new (p) vect(n);
+}
+
+/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */
+/* { dg-final { scan-tree-dump-not malloc optimized } } */
+/* { dg-final { scan-tree-dump-not memset optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/strlenopt-9.c
===
--- gcc/testsuite/gcc.dg/strlenopt-9.c  (revision 209516)
+++ gcc/testsuite/gcc.dg/strlenopt-9.c  (working copy)
@@ -11,21 +11,21 @@ fn1 (int r)
  optimized away.  */
   return strchr (p, '\0');
 }
 
 __attribute__((noinline, noclone)) size_t
 fn2 (int r)
 {
   char *p, q[10];
   strcpy

Re: [PATCH] Fix PR60849

2014-04-17 Thread Marc Glisse

On Thu, 17 Apr 2014, Richard Biener wrote:


This fixes PR60849 by properly rejecting non-boolean typed
comparisons from valid_gimple_rhs_p so they go through the
gimplification paths.


Could you also accept vector comparisons please?

--
Marc Glisse


Re: [patch] change specific int128 - generic intN

2014-04-17 Thread Marc Glisse

On Tue, 15 Apr 2014, DJ Delorie wrote:


I wasn't sure what to do with that array, since it was static and
couldn't have empty slots in them like the arrays in tree.h.  Also,
do we need to have *every* type in that list?  What's the rule for
whether a type gets installed there or not?  The comment says
guaranteed to be in the runtime support but does that mean for this
particular build (wrt multilibs) as not all intN types are guaranteed
(even the int128 types were not guaranteed to be supported before my
patch).  In other parts of the patch, just taking out the special case
for __int128 was sufficient to do the right thing for all __intN
types.


You need someone who understands this better than me (ask Jason). To be 
able to throw/catch a type, you need some typeinfo symbols. The front-end 
generates that for classes when they are defined. For fundamental types, 
it assumes libsupc++ will provide it, and the function you are modifying 
is the one generating libsupc++ (I am surprised your patch didn't cause 
any failure on x64_64, at least in abi_check). We need to generate the 
typeinfo for __intN, either in libsupc++, or in each TU, and since both 
cases will require code, I assume libsupc++ is preferable.



I can certainly put the intN types in there, but note that it would
mean regenerating the fundamentals[] array at runtime to include those
types which are supported at the time.


After the patch I linked, it should just mean calling the helper function 
on your new types, no need to touch the array.



Do the entries in the array need to be in a particular order?


No, any random order would do.

--
Marc Glisse


Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marc Glisse

On Fri, 18 Apr 2014, Marek Polacek wrote:


This patch implements a new warning that warns when controlling
expression of a switch has boolean value.  (Intentionally I don't
warn if the controlling expression is (un)signed:1 bit-field.)
I guess the question is if this should be enabled by default or
deserves some new warning option.  Since clang does the former,
I did it too and currently this warning is enabled by default.


It can be enabled by -Wsome-name which is itself enabled by default but
at least gives the possibility to use -Wno-some-name, -Werror=some-name,
etc. No? I believe Manuel insists regularly that no new warning should
use 0 (and old ones should progressively lose it).

--
Marc Glisse


Re: C++ PATCH for c++/51747 (list-initialization from same type)

2014-04-15 Thread Marc Glisse

On Tue, 15 Apr 2014, Jason Merrill wrote:


On 04/14/2014 06:02 PM, Marc Glisse wrote:

shouldn't the same also apply if VECTOR_TYPE_P (type), not just for
CLASS_TYPE_P (type)?


Sure.  Do you want to make that change?


I can add || VECTOR_TYPE_P (type), yes, but I thought you might have ideas 
about other cases that might have been forgotten, maybe arrays or 
something (I didn't have time to test any further), and thus on what the 
right test should be. If it is just vectors I'll prepare a patch with a 
simple testcase.


--
Marc Glisse


Re: [v3] Slightly improve operator new

2014-04-15 Thread Marc Glisse

Ping
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00051.html


On Sun, 2 Mar 2014, Marc Glisse wrote:


Hello,

inlining operator new (with LTO or otherwise), I noticed that it has a 
complicated implementation, which makes it hard to use this inlined code for 
optimizations. This patch does two things:


1) there are 2 calls to malloc, I am turning them into just one. At -Os, it 
does not change the generated code (RTL optimizers manage to merge the calls 
to malloc). At other levels (-O2, -O3, and especially with -g) it gives a 
smaller object file. And with just one malloc, some optimizations become much 
easier (see my recent calloc patch for instance).


2) malloc is predicted to return null 19 times out of 20 because of the loop 
(that didn't change with the patch), so I am adding __builtin_expect to let 
gcc optimize the fast path.


Further discussion:

a) I didn't add __builtin_expect for the test (sz == 0), it didn't change the 
generated code in my limited test. I was wondering if this test is necessary 
(new doesn't seem to ever call operator new(0)) or could be moved to operator 
new[] (new type[0] does call operator new[](0)), but since one can call 
operator new directly, it has to be protected indeed, so let's forget this 
point ;-)
(too bad malloc is replacable, so we can't use the fact that glibc already 
does the right thing)


b) I have a bit of trouble parsing the standard. Is the nothrow operator new 
supposed to call the regular operator new? In particular, if a user replaces 
only the throwing operator new, should the nothrow operator new automatically 
call that function? That's not what we are currently doing (and it would be a 
perf regression).


Required behavior: Return a non-null pointer to suitably aligned storage 
(3.7.4), or else return a null pointer. This nothrow version of operator new 
returns a pointer obtained as if acquired from the (possibly replaced) 
ordinary version. This requirement is binding on a replacement version of 
this function.


Default behavior: Calls operator new(size). If the call returns normally, 
returns the result of that call. Otherwise, returns a null pointer.




Passes bootstrap+testsuite on x86_64-linux-gnu. Stage 1?

2014-03-03  Marc Glisse  marc.gli...@inria.fr

	* libsupc++/new_op.cc: Factor the calls to malloc, use 
__builtin_expect.

* libsupc++/new_opnt.cc: Likewise.




--
Marc Glisse


Re: calloc = malloc + memset

2014-04-15 Thread Marc Glisse
Let me ping this. There's no hurry, but it may have got lost with 4.9 
approaching.

http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01205.html


On Sun, 23 Mar 2014, Marc Glisse wrote:


On Mon, 3 Mar 2014, Richard Biener wrote:


That's a bit much of ad-hoc pattern-matching ... wouldn't be
p = malloc (n);
memset (p, 0, n);

transform better suited to the strlen opt pass?  After all that tracks
what 'string' is associated with a SSA name pointer through
arbitrary satements using a lattice.


Like this? I had to move the strlen pass after the loop passes (and
after dom or everything was too dirty) but long enough before the end
(some optimizations are necessary after strlen). As a bonus, one more
strlen is optimized in the current testcases :-)

Running the pass twice would be another option I guess (it would require
implementing the clone method), but without a testcase showing it is
needed...

Passes bootstrap+testsuite on x86_64-linux-gnu.

2014-03-23  Marc Glisse  marc.gli...@inria.fr

   PR tree-optimization/57742
gcc/
   * tree-ssa-strlen.c (get_string_length): Ignore malloc.
   (handle_builtin_malloc, handle_builtin_memset): New functions.
(strlen_optimize_stmt): Call them.
* passes.def: Move strlen after loop+dom.
gcc/testsuite/
   * g++.dg/tree-ssa/calloc.C: New testcase.
   * gcc.dg/tree-ssa/calloc-1.c: Likewise.
   * gcc.dg/tree-ssa/calloc-2.c: Likewise.
* gcc.dg/strlenopt-9.c: Adapt.


--
Marc Glisse


Re: C++ PATCH for c++/51747 (list-initialization from same type)

2014-04-15 Thread Marc Glisse

On Tue, 15 Apr 2014, Jason Merrill wrote:

It's just vectors, because they're an extension; the patch I checked in 
covered the standard language.


Like this? (regtested on x86_64-linux-gnu)

2014-04-16  Marc Glisse  marc.gli...@inria.fr

gcc/cp/
* decl.c (reshape_init_r): Handle a single element of vector type.
gcc/testsuite/
* g++.dg/cpp0x/initlist-vect.C: New file.

--
Marc GlisseIndex: gcc/cp/decl.c
===
--- gcc/cp/decl.c   (revision 209434)
+++ gcc/cp/decl.c   (working copy)
@@ -5400,21 +5400,21 @@ reshape_init_r (tree type, reshape_iter
maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS);
}
 
   d-cur++;
   return init;
 }
 
   /* If T is a class type and the initializer list has a single element of
  type cv U, where U is T or a class derived from T, the object is
  initialized from that element.  Even if T is an aggregate.  */
-  if (cxx_dialect = cxx11  CLASS_TYPE_P (type)
+  if (cxx_dialect = cxx11  (CLASS_TYPE_P (type) || VECTOR_TYPE_P (type))
first_initializer_p
d-end - d-cur == 1
reference_related_p (type, TREE_TYPE (init)))
 {
   d-cur++;
   return init;
 }
 
   /* [dcl.init.aggr]
 
Index: gcc/testsuite/g++.dg/cpp0x/initlist-vect.C
===
--- gcc/testsuite/g++.dg/cpp0x/initlist-vect.C  (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/initlist-vect.C  (working copy)
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++11 } }
+
+typedef float X __attribute__ ((vector_size (4 * sizeof (float;
+
+X x;
+X x2{x};


Re: [PATCH 1/3] libstdc++: Add time_get::get support.

2014-04-15 Thread Marc Glisse

On Tue, 15 Apr 2014, Paolo Carlini wrote:

Anyway, the real issue is indeed that implementing those bits requires a new 
virtual function, and that would break the ABI.


What is the status of the ABI half-break plan (abi_tag and such), 
necessary to get the remaining pieces of C++11?


--
Marc Glisse


version typeinfo for 128bit types

2014-04-14 Thread Marc Glisse

Hello,

this is a follow-up for this patch:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html

once committed, g++ will generate typeinfo for __float128, and it needs 
versioning. While there, I noticed that __int128 has typeinfo but not 
typeinfo name, so I am adding it. I manually checked that the new 
symbols were exactly the 12 I expected, with the new version number.


I did not test the gnu-versioned-namespace version.

I manually updated baseline for x86_64. It is awfully inconvenient to do. 
I was expecting make new-abi-baseline to generate it for me, but it 
gives me plenty of extra symbols compared to the current one. Some random 
examples:


FUNC:_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_PKS3_@@GLIBCXX_3.4
FUNC:_ZSt8distanceIPKcENSt15iterator_traitsIT_E15difference_typeES3_S3_@@GLIBCXX_3.4
FUNC:_ZSt19__iterator_categoryIPKmENSt15iterator_traitsIT_E17iterator_categoryERKS3_@@GLIBCXX_3.4
FUNC:_ZSt13__check_facetISt7codecvtIwc11__mbstate_tEERKT_PS4_@@GLIBCXX_3.4
FUNC:_ZSt13__check_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcRKT_PS7_@@GLIBCXX_3.4
FUNC:_ZNSt9exceptionC1Ev@@GLIBCXX_3.4
FUNC:_ZNSt8iteratorISt18input_iterator_tagclPcRcEC1Ev@@GLIBCXX_3.4
FUNC:_ZNSt8ios_base4setfESt13_Ios_FmtflagsS0_@@GLIBCXX_3.4
FUNC:_ZNSt7complexIfEC1Eff@@GLIBCXX_3.4
FUNC:_ZNSt6chrono13duration_castINS_8durationIlSt5ratioILl1ELl10lS2_ILl1ELl1NSt9enable_ifIXsrNS_13__is_durationIT_EE5valueES8_E4typeERKNS1_IT0_T1_EE@@GLIBCXX_3.4
FUNC:_ZNSt20bad_array_new_lengthC2Ev@@CXXABI_1.3.8
FUNC:_ZNSt14numeric_limitsIdE8infinityEv@@GLIBCXX_3.4
FUNC:_ZN10__cxxabiv117__class_type_info16__dyncast_resultC1Ei@@CXXABI_1.3

etc.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-14  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
* config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols.
* config/abi/pre/gnu-versioned-namespace.ver: New symbols.
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise.

--
Marc GlisseIndex: config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
===
--- config/abi/post/x86_64-linux-gnu/baseline_symbols.txt   (revision 
209354)
+++ config/abi/post/x86_64-linux-gnu/baseline_symbols.txt   (working copy)
@@ -2513,20 +2513,21 @@ FUNC:atomic_flag_clear_explicit@@GLIBCXX
 FUNC:atomic_flag_test_and_set_explicit@@GLIBCXX_3.4.11
 OBJECT:0:CXXABI_1.3
 OBJECT:0:CXXABI_1.3.1
 OBJECT:0:CXXABI_1.3.2
 OBJECT:0:CXXABI_1.3.3
 OBJECT:0:CXXABI_1.3.4
 OBJECT:0:CXXABI_1.3.5
 OBJECT:0:CXXABI_1.3.6
 OBJECT:0:CXXABI_1.3.7
 OBJECT:0:CXXABI_1.3.8
+OBJECT:0:CXXABI_1.3.9
 OBJECT:0:CXXABI_TM_1
 OBJECT:0:GLIBCXX_3.4
 OBJECT:0:GLIBCXX_3.4.1
 OBJECT:0:GLIBCXX_3.4.10
 OBJECT:0:GLIBCXX_3.4.11
 OBJECT:0:GLIBCXX_3.4.12
 OBJECT:0:GLIBCXX_3.4.13
 OBJECT:0:GLIBCXX_3.4.14
 OBJECT:0:GLIBCXX_3.4.15
 OBJECT:0:GLIBCXX_3.4.16
@@ -2617,20 +2618,21 @@ OBJECT:16:_ZTISt16nested_exception@@CXXA
 OBJECT:16:_ZTISt8ios_base@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9exception@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9time_base@@GLIBCXX_3.4
 OBJECT:16:_ZTISt9type_info@@GLIBCXX_3.4
 OBJECT:16:_ZTIa@@CXXABI_1.3
 OBJECT:16:_ZTIb@@CXXABI_1.3
 OBJECT:16:_ZTIc@@CXXABI_1.3
 OBJECT:16:_ZTId@@CXXABI_1.3
 OBJECT:16:_ZTIe@@CXXABI_1.3
 OBJECT:16:_ZTIf@@CXXABI_1.3
+OBJECT:16:_ZTIg@@CXXABI_1.3.9
 OBJECT:16:_ZTIh@@CXXABI_1.3
 OBJECT:16:_ZTIi@@CXXABI_1.3
 OBJECT:16:_ZTIj@@CXXABI_1.3
 OBJECT:16:_ZTIl@@CXXABI_1.3
 OBJECT:16:_ZTIm@@CXXABI_1.3
 OBJECT:16:_ZTIn@@CXXABI_1.3.5
 OBJECT:16:_ZTIo@@CXXABI_1.3.5
 OBJECT:16:_ZTIs@@CXXABI_1.3
 OBJECT:16:_ZTIt@@CXXABI_1.3
 OBJECT:16:_ZTIv@@CXXABI_1.3
@@ -3117,25 +3119,28 @@ OBJECT:2:_ZNSt10ctype_base5printE@@GLIBC
 OBJECT:2:_ZNSt10ctype_base5punctE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base5spaceE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base5upperE@@GLIBCXX_3.4
 OBJECT:2:_ZNSt10ctype_base6xdigitE@@GLIBCXX_3.4
 OBJECT:2:_ZTSa@@CXXABI_1.3
 OBJECT:2:_ZTSb@@CXXABI_1.3
 OBJECT:2:_ZTSc@@CXXABI_1.3
 OBJECT:2:_ZTSd@@CXXABI_1.3
 OBJECT:2:_ZTSe@@CXXABI_1.3
 OBJECT:2:_ZTSf@@CXXABI_1.3
+OBJECT:2:_ZTSg@@CXXABI_1.3.9
 OBJECT:2:_ZTSh@@CXXABI_1.3
 OBJECT:2:_ZTSi@@CXXABI_1.3
 OBJECT:2:_ZTSj@@CXXABI_1.3
 OBJECT:2:_ZTSl@@CXXABI_1.3
 OBJECT:2:_ZTSm@@CXXABI_1.3
+OBJECT:2:_ZTSn@@CXXABI_1.3.9
+OBJECT:2:_ZTSo@@CXXABI_1.3.9
 OBJECT:2:_ZTSs@@CXXABI_1.3
 OBJECT:2:_ZTSt@@CXXABI_1.3
 OBJECT:2:_ZTSv@@CXXABI_1.3
 OBJECT:2:_ZTSw@@CXXABI_1.3
 OBJECT:2:_ZTSx@@CXXABI_1.3
 OBJECT:2:_ZTSy@@CXXABI_1.3
 
OBJECT:32:_ZNSbIwSt11char_traitsIwESaIwEE4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
 OBJECT:32:_ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
 OBJECT:32:_ZTIPDd@@CXXABI_1.3.4
 OBJECT:32:_ZTIPDe@@CXXABI_1.3.4
@@ -3148,39 +3153,41 @@ OBJECT:32:_ZTIPKDe@@CXXABI_1.3.4
 OBJECT:32:_ZTIPKDf@@CXXABI_1.3.4
 OBJECT:32:_ZTIPKDi@@CXXABI_1.3.3
 OBJECT:32:_ZTIPKDn@@CXXABI_1.3.5
 OBJECT:32:_ZTIPKDs@@CXXABI_1.3.3
 OBJECT:32:_ZTIPKa@@CXXABI_1.3
 OBJECT:32:_ZTIPKb@@CXXABI_1.3
 OBJECT:32:_ZTIPKc@@CXXABI_1.3
 OBJECT:32:_ZTIPKd@@CXXABI_1.3
 OBJECT:32

Re: [C PATCH] Make attributes accept enum values (PR c/50459)

2014-04-14 Thread Marc Glisse

On Mon, 14 Apr 2014, Marek Polacek wrote:


Currently it is not possible to use an enum value (or const int in
C++) as an attribute argument (where compiler wants INTEGER_CST).  This
is because the values are represented by CONST_DECL nodes -- and
we don't deal with them.  I think it's viable to accept them too,
and that is what this patch does along with some clean ups.


Thanks.


+/* Perform default promotions or extract the integer constant from
+   an enum value.  This function is used for getting a value from
+   an attribute argument.  */
+
+static tree
+get_attrib_value (tree val)
+{
+  if (val
+   TREE_CODE (val) != IDENTIFIER_NODE
+   TREE_CODE (val) != FUNCTION_DECL)
+val = default_conversion (val);
+  else if (TREE_CODE (val) == IDENTIFIER_NODE)
+{
+  tree t = lookup_name (val);
+  if (t  TREE_CODE (t) == CONST_DECL)
+   return DECL_INITIAL (t);
+}
+  return val;
+}


If val is NULL, it seems you still end up looking at its TREE_CODE?
(don't know if that can happen)

In C++, default_conversion probably handles IDENTIFIER_NODE just fine, but 
having different code for the 2 languages would not be nice, right?


--
Marc Glisse


Re: Optimize n?rotate(x,n):x

2014-04-14 Thread Marc Glisse

On Mon, 14 Apr 2014, Richard Biener wrote:


+  /* If the special case has a high probability, keep it.  */
+  if (EDGE_PRED (middle_bb, 0)-probability  PROB_EVEN)


I suppose Honza has a comment on how to test this properly
(not sure if -probability or -frequency is always initialized properly).
for example single_likely_edge tests profile_status_for_fn !=
PROFILE_ABSENT (and uses a fixed probability value ...).
Anyway, the comparison looks backwards to me, but maybe I'm
missing sth - I'd use = PROB_LIKELY ;)


Maybe the comment is confusing? middle_bb contains the expensive operation 
(say a/b) that the special case skips entirely. If the division happens in 
less than 50% of cases (that's the proba of the edge going from cond to 
middle_bb), then doing the comparison+jump may be cheaper and I abort the 
optimization. At least the testcase with __builtin_expect should prove 
that I didn't do it backwards.


value-prof seems to use 50% as the cut-off where it may become interesting 
to special case division, hence my choice of PROB_EVEN. I am not sure 
which way you want to use PROB_LIKELY (80%). If we have more than 80% 
chances of executing the division, always perform it? Or if we have more 
than 80% chances of skipping the division, keep the branch?


Attached is the latest version (passed the testsuite).Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options -O -fdump-tree-phiopt1 } */
+
+int f(int a, int b, int c) {
+  if (c  5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x  n) | (x  (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+return 0;
+  else
+return a  b;
+}
+
+/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */
+/* { dg-final { cleanup-tree-dump phiopt1 } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+int f(int a, int b) {
+  if (__builtin_expect(a == 0, 1)) return b;
+  return a + b;
+}
+
+// optab_handler can handle if(b==1) but not a/b
+// so we consider a/b too expensive.
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) {
+  if (b == 1)
+return a;
+  else
+return a / b;
+}
+
+/* { dg-final { scan-tree-dump-times goto  4 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 209353)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -140,20 +140,37 @@ static bool gate_hoist_loads (void);
x = PHI (CONST, a)
 
Gets replaced with:
  bb0:
  bb2:
t1 = a == CONST;
t2 = b  c;
t3 = t1  t2;
x = a;
 
+
+   It also replaces
+
+ bb0:
+   if (a != 0) goto bb1; else goto bb2;
+ bb1:
+   c = a + b;
+ bb2:
+   x = PHI c (bb1), b (bb0), ...;
+
+   with
+
+ bb0:
+   c = a + b;
+ bb2:
+   x = PHI c (bb0), ...;
+
ABS Replacement
---
 
This transformation, implemented in abs_replacement, replaces
 
  bb0:
if (a = 0) goto bb2; else goto bb1;
  bb1:
x = -a;
  bb2:
@@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;
 
   tmp = gimple_assign_rhs2 (def);
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;
 
   return false;
 }
 
+/* Returns true if ARG is a neutral element for operation CODE
+   on the RIGHT side.  */
+
+static bool
+neutral_element_p (tree_code code, tree arg, bool right)
+{
+  switch (code)
+{
+case PLUS_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  return integer_zerop (arg);
+
+case LROTATE_EXPR:
+case RROTATE_EXPR:
+case LSHIFT_EXPR:
+case RSHIFT_EXPR:
+case MINUS_EXPR:
+case POINTER_PLUS_EXPR:
+  return right  integer_zerop (arg);
+
+case MULT_EXPR:
+  return integer_onep (arg);
+
+case TRUNC_DIV_EXPR:
+case CEIL_DIV_EXPR:
+case FLOOR_DIV_EXPR:
+case ROUND_DIV_EXPR:
+case EXACT_DIV_EXPR:
+  return right  integer_onep (arg);
+
+case BIT_AND_EXPR:
+  return integer_all_onesp (arg);
+
+default:
+  return false;
+}
+}
+
+/* Returns true if ARG is an absorbing element for operation CODE.  */
+
+static bool
+absorbing_element_p (tree_code code, 

Re: C++ PATCH for c++/51747 (list-initialization from same type)

2014-04-14 Thread Marc Glisse

On Fri, 11 Apr 2014, Jason Merrill wrote:

Recent changes to the C++ standard have allowed the use of 
list-initialization with a single initializer of the same type as the target; 
this patch updates reshape_init accordingly.


Tested x86_64-pc-linux-gnu, applying to trunk.


Hello,

shouldn't the same also apply if VECTOR_TYPE_P (type), not just for 
CLASS_TYPE_P (type)?


Testcase in:
http://stackoverflow.com/questions/23070982/c-initialization-of-intel-simd-intrinsics-class-members

--
Marc Glisse


Re: [patch] change specific int128 - generic intN

2014-04-14 Thread Marc Glisse

On Mon, 14 Apr 2014, DJ Delorie wrote:


* rtti.c (emit_support_tinfos): Remove __int128-specific entries.


Hello,

I wanted to check the interaction with:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html

but I don't see where you moved the __int128 typeinfo generation.


[c++] typeinfo for target types

2014-04-13 Thread Marc Glisse

Hello,

this patch generates typeinfo for target types. On x86_64, it adds these 6 
lines to nm -C libsupc++.a. A follow-up patch will be needed to export and 
version those in the shared library.


+ V typeinfo for __float128
+ V typeinfo for __float128 const*
+ V typeinfo for __float128*
+ V typeinfo name for __float128
+ V typeinfo name for __float128 const*
+ V typeinfo name for __float128*

Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in 
tsan/tls_race.c).


2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/c-family/
* c-common.c (registered_builtin_types): Make non-static.
* c-common.h (registered_builtin_types): Declare.
gcc/cp/
* rtti.c (emit_support_tinfo_1): New function, extracted from
emit_support_tinfos.
(emit_support_tinfos): Call it and iterate on registered_builtin_types.

--
Marc GlisseIndex: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 209345)
+++ gcc/c-family/c-common.c (working copy)
@@ -3462,21 +3462,21 @@ c_common_fixed_point_type_for_size (unsi
 fixed-point types that have too many integral and 
 fractional bits together);
   return 0;
 }
 
   return c_common_type_for_mode (mode, satp);
 }
 
 /* Used for communication between c_common_type_for_mode and
c_register_builtin_type.  */
-static GTY(()) tree registered_builtin_types;
+tree registered_builtin_types;
 
 /* Return a data type that has machine mode MODE.
If the mode is an integer,
then UNSIGNEDP selects between signed and unsigned types.
If the mode is a fixed-point mode,
then UNSIGNEDP selects between saturating and nonsaturating types.  */
 
 tree
 c_common_type_for_mode (enum machine_mode mode, int unsignedp)
 {
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h (revision 209345)
+++ gcc/c-family/c-common.h (working copy)
@@ -1006,20 +1006,24 @@ extern void do_warn_double_promotion (tr
 extern void set_underlying_type (tree);
 extern void record_locally_defined_typedef (tree);
 extern void maybe_record_typedef_use (tree);
 extern void maybe_warn_unused_local_typedefs (void);
 extern vectree, va_gc *make_tree_vector (void);
 extern void release_tree_vector (vectree, va_gc *);
 extern vectree, va_gc *make_tree_vector_single (tree);
 extern vectree, va_gc *make_tree_vector_from_list (tree);
 extern vectree, va_gc *make_tree_vector_copy (const vectree, va_gc *);
 
+/* Used for communication between c_common_type_for_mode and
+   c_register_builtin_type.  */
+extern GTY(()) tree registered_builtin_types;
+
 /* In c-gimplify.c  */
 extern void c_genericize (tree);
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
 extern int c_common_valid_pch (cpp_reader *pfile, const char *name, int fd);
 extern void c_common_read_pch (cpp_reader *pfile, const char *name, int fd,
Index: gcc/cp/rtti.c
===
--- gcc/cp/rtti.c   (revision 209345)
+++ gcc/cp/rtti.c   (working copy)
@@ -1458,20 +1458,58 @@ create_tinfo_types (void)
FIELD_DECL, NULL_TREE, integer_type_node),
build_decl (BUILTINS_LOCATION,
FIELD_DECL, NULL_TREE, type_info_ptr_type),
build_decl (BUILTINS_LOCATION,
FIELD_DECL, NULL_TREE, type_info_ptr_type),
NULL);
 
   pop_abi_namespace ();
 }
 
+/* Helper for emit_support_tinfos. Emits the type_info descriptor of
+   a single type.  */
+
+void
+emit_support_tinfo_1 (tree bltn)
+{
+  tree types[3];
+
+  if (bltn == NULL_TREE)
+return;
+  types[0] = bltn;
+  types[1] = build_pointer_type (bltn);
+  types[2] = build_pointer_type (cp_build_qualified_type (bltn,
+ TYPE_QUAL_CONST));
+
+  for (int i = 0; i  3; ++i)
+{
+  tree tinfo = get_tinfo_decl (types[i]);
+  TREE_USED (tinfo) = 1;
+  mark_needed (tinfo);
+  /* The C++ ABI requires that these objects be COMDAT.  But,
+On systems without weak symbols, initialized COMDAT
+objects are emitted with internal linkage.  (See
+comdat_linkage for details.)  Since we want these objects
+to have external linkage so that copies do not have to be
+emitted in code outside the runtime library, we make them
+non-COMDAT here.  
+
+It might also not be necessary to follow this detail of the
+ABI.  */
+  if (!flag_weak || ! targetm.cxx.library_rtti_comdat ())
+   {
+ gcc_assert (TREE_PUBLIC (tinfo)  !DECL_COMDAT (tinfo

[i386] define __SIZEOF_FLOAT128__

2014-04-13 Thread Marc Glisse

Hello,

some people like having a macro to test if a type is available 
(__SIZEOF_INT128__ for instance). This adds macros for __float80 and 
__float128. The types seem to be always available, so I didn't add any 
condition.


If you think this is a bad idea, please close the PR.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR preprocessor/56540
* config/i386/i386-c.c (ix86_target_macros): Define
__SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.

--
Marc GlisseIndex: gcc/config/i386/i386-c.c
===
--- gcc/config/i386/i386-c.c(revision 209345)
+++ gcc/config/i386/i386-c.c(working copy)
@@ -511,20 +511,27 @@ ix86_target_macros (void)
 
   if (!TARGET_80387)
 cpp_define (parse_in, _SOFT_FLOAT);
 
   if (TARGET_LONG_DOUBLE_64)
 cpp_define (parse_in, __LONG_DOUBLE_64__);
 
   if (TARGET_LONG_DOUBLE_128)
 cpp_define (parse_in, __LONG_DOUBLE_128__);
 
+  if (TARGET_64BIT)
+cpp_define (parse_in, __SIZEOF_FLOAT80__=16);
+  else
+cpp_define (parse_in, __SIZEOF_FLOAT80__=12);
+
+  cpp_define (parse_in, __SIZEOF_FLOAT128__=16);
+
   cpp_define_formatted (parse_in, __ATOMIC_HLE_ACQUIRE=%d, IX86_HLE_ACQUIRE);
   cpp_define_formatted (parse_in, __ATOMIC_HLE_RELEASE=%d, IX86_HLE_RELEASE);
 
   ix86_target_macros_internal (ix86_isa_flags,
   ix86_arch,
   ix86_tune,
   ix86_fpmath,
   cpp_define);
 }
 


Re: Optimize n?rotate(x,n):x

2014-04-13 Thread Marc Glisse

On Mon, 3 Mar 2014, Richard Biener wrote:


On Sat, Mar 1, 2014 at 3:33 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

again, a stage 1 patch that I will ping then, but early comments are
welcome.

PR 59100 was asking to transform n?rotate(x,n):x to rotate(x,n) (because it
can be hard to write a strictly valid rotate in plain C). The operation is
really:
(x != neutral) ? x op y : y
where neutral is such that (neutral op y) is always y, so that's what I
implemented (and absorbing elements while I was at it).

For some operations on some platforms, the transformation may not be such a
good idea, in particular if division is very slow and b is 1 most of the
time, then computing a/b may be slower than (b!=1)?a/b:a. The easiest might
be to comment out those operations in the switch for now. I think divisions
are the only ones slow enough to deserve this, though there certainly are
CPUs where multiplication is not so fast, and even for rotate it may not
always be a win if the processor doesn't have a rotate instruction and the
shift amount is almost always 0.


You only handle integer operations, so checking for INTEGER_TYPE_P
early on would make sense.


Ah, I wasn't planning on restricting this to integers but yes, indeed, 
since it ended up that way...



Note that some archs may dispatch to libgcc for integer operations
so it may make sense to check whether that is the case (you can
query optabs to check that) - if the comparison also dispatches to
libgcc then of course the transform would be a win again.


Ok. I am not sure about using cbranch_optab, other ideas for the 
comparison?



Even on x86 TImode division uses libgcc.


This reminds me of PR 58897 (unrelated).

Note also that value-profiling may have created this special case in the 
first place! (gimple_divmod_fixed_value_transform)


Test coverage must be limited if I didn't break the testsuite :-(

value-profiling seems to set edge probabilities when it does the 
transformation, so I am checking only that.



Otherwise I think this is a good transform.  Did you check if it triggers
during GCC bootstrap?


It does, a few times. Java has the obvious:

int padLength = blockSize;
if (length % blockSize != 0)
  padLength = blockSize - length % blockSize;

sched-deps.c has (I didn't check if it was actually this piece of code 
that triggered or another one in the same file):


  if ((ds1  t)  !(ds2  t))
ds |= ds1  t;
  else if (!(ds1  t)  (ds2  t))
ds |= ds2  t;

and we end up seeing (phiopt3) if(a!=0)ds|=a.

And mostly bitmap.h where we see quite a bit of:

  if (_867 != 0)
goto bb 55;
  else
goto bb 56;

  bb 55:
  start_bit_869 = _867 * 128;

  bb 56:
  # start_bit_870 = PHI 0(54), start_bit_869(55)


Bootstrap+testsuite on x86_64-linux-gnu (modulo a minor reformatting, I'm 
retesting anyway).


2014-04-14  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/59100
gcc/
* tree-ssa-phiopt.c (neutral_element_p, absorbing_element_p,
is_cheap_stmt): New functions.
(value_replacement): Handle conditional binary operations with a
neutral or absorbing element.
gcc/testsuite/
* gcc.dg/tree-ssa/phi-opt-12.c: New file.
* gcc.dg/tree-ssa/phi-opt-13.c: Likewise.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options -O -fdump-tree-phiopt1 } */
+
+int f(int a, int b, int c) {
+  if (c  5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x  n) | (x  (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+return 0;
+  else
+return a  b;
+}
+
+/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */
+/* { dg-final { cleanup-tree-dump phiopt1 } } */

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options -O2 -fdump-tree-optimized } */
+
+int f(int a, int b) {
+  if (__builtin_expect(a == 0, 1)) return b;
+  return a + b;
+}
+
+// optab_handler can handle if(b==1) but not a/b
+// so we consider a/b too expensive.
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) {
+  if (b == 1

Re: [v3] complex functions with expression template reals

2014-04-12 Thread Marc Glisse

On Sat, 12 Apr 2014, Eric Botcazou wrote:


r209321
(for a patch accepted long ago, I guess it is better to leave a trace in
gcc-patches near the commit date)


Nope, see the doc, we have ChangeLog and gcc-cvs for this purpose.


Those don't point to the conversation in gcc-patches where the patch was 
ok-ed by a maintainer (maybe we should start adding URLs pointing to the 
gcc-patches messages in commit logs?).


I also meant it as a hello to whoever reviewed the patch, but it is true 
they likely already read new ChangeLog entries or follow (a filtered 
version of) gcc-cvs.


Thanks, I won't do it again in the future :-)

--
Marc Glisse


Re: [v3] complex functions with expression template reals

2014-04-11 Thread Marc Glisse

On Wed, 26 Feb 2014, Paolo Carlini wrote:


On 02/26/2014 10:57 AM, Jonathan Wakely wrote:

On 24 February 2014 09:10, Paolo Carlini wrote:

Another option would be just using boost/multiprecision/mpfr.hpp when
available. In general, I think it makes sense to have a minimum of
infrastructure enabling tests checking interoperability with boost. If 
only

we had a check_v3_target_header { args } it would be most of it, but it
doesn't seem we do?!? Anyway I guess we can take care of that post 4.9.0 
and

commit the straightforward code tweak now. Jon?

I'd be happy with it for Stage 1, but I don't think we should apply it
now if it is only useful for people doing unusual (unsupported) things
with std::complex.

Ok. Marc, please remember to apply it as soon as Stage 1 re-opens.


r209321
(for a patch accepted long ago, I guess it is better to leave a trace in 
gcc-patches near the commit date)


--
Marc Glisse


Re: [v3] LWG 2106: move_iterator::reference

2014-04-11 Thread Marc Glisse

On Wed, 5 Mar 2014, Jonathan Wakely wrote:


On 5 March 2014 19:36, Marc Glisse wrote:

Hello,

this issue got delayed in LWG, apparently because of a failed improvement
to the wording along the way (happens, that's ok), but there seems to be a
consensus on the resolution and I don't really see the point of waiting (it
changes code that currently returns a reference to a temporary).

Tested on x86_64-linux-gnu. Stage 1?


Yes, OK for Stage 1.  Please put _GLIBCXX_RESOLVE_DEFECTS (or whatever
it is we use elsewhere) in the comment, rather than just DR 2106.


r209323
(again, only posting because this was accepted long ago)

--
Marc Glisse


[i386] Replace builtins with vector extensions

2014-04-11 Thread Marc Glisse

Hello,

the previous discussion on the topic was before we added all those #pragma 
target in *mmintrin.h:


http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html

I believe that removes a large part of the arguments against it. Note that 
I only did a few of the more obvious intrinsics, I am waiting to see if 
this patch is accepted before doing more.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-11  Marc Glisse  marc.gli...@inria.fr

* config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps,
_mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions
instead of builtins.
* config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, _mm_storeh_pd,
_mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd,
_mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64,
_mm_loadh_pd, _mm_loadl_pd): Likewise.
(_mm_sqrt_sd): Fix comment.

--
Marc GlisseIndex: gcc/config/i386/emmintrin.h
===
--- gcc/config/i386/emmintrin.h (revision 209323)
+++ gcc/config/i386/emmintrin.h (working copy)
@@ -161,40 +161,40 @@ _mm_store_pd (double *__P, __m128d __A)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_storeu_pd (double *__P, __m128d __A)
 {
   __builtin_ia32_storeupd (__P, __A);
 }
 
 /* Stores the lower DPFP value.  */
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_store_sd (double *__P, __m128d __A)
 {
-  *__P = __builtin_ia32_vec_ext_v2df (__A, 0);
+  *__P = __A[0];
 }
 
 extern __inline double __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_cvtsd_f64 (__m128d __A)
 {
-  return __builtin_ia32_vec_ext_v2df (__A, 0);
+  return __A[0];
 }
 
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_storel_pd (double *__P, __m128d __A)
 {
   _mm_store_sd (__P, __A);
 }
 
 /* Stores the upper DPFP value.  */
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_storeh_pd (double *__P, __m128d __A)
 {
-  *__P = __builtin_ia32_vec_ext_v2df (__A, 1);
+  *__P = __A[1];
 }
 
 /* Store the lower DPFP value across two words.
The address must be 16-byte aligned.  */
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_store1_pd (double *__P, __m128d __A)
 {
   _mm_store_pd (__P, __builtin_ia32_shufpd (__A, __A, _MM_SHUFFLE2 (0,0)));
 }
 
@@ -215,86 +215,86 @@ extern __inline int __attribute__((__gnu
 _mm_cvtsi128_si32 (__m128i __A)
 {
   return __builtin_ia32_vec_ext_v4si ((__v4si)__A, 0);
 }
 
 #ifdef __x86_64__
 /* Intel intrinsic.  */
 extern __inline long long __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_cvtsi128_si64 (__m128i __A)
 {
-  return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0);
+  return __A[0];
 }
 
 /* Microsoft intrinsic.  */
 extern __inline long long __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_cvtsi128_si64x (__m128i __A)
 {
-  return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0);
+  return __A[0];
 }
 #endif
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_add_pd (__m128d __A, __m128d __B)
 {
-  return (__m128d)__builtin_ia32_addpd ((__v2df)__A, (__v2df)__B);
+  return __A + __B;
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_add_sd (__m128d __A, __m128d __B)
 {
   return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_sub_pd (__m128d __A, __m128d __B)
 {
-  return (__m128d)__builtin_ia32_subpd ((__v2df)__A, (__v2df)__B);
+  return __A - __B;
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_sub_sd (__m128d __A, __m128d __B)
 {
   return (__m128d)__builtin_ia32_subsd ((__v2df)__A, (__v2df)__B);
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_mul_pd (__m128d __A, __m128d __B)
 {
-  return (__m128d)__builtin_ia32_mulpd ((__v2df)__A, (__v2df)__B);
+  return __A * __B;
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_mul_sd (__m128d __A, __m128d __B)
 {
   return (__m128d)__builtin_ia32_mulsd ((__v2df)__A, (__v2df)__B);
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_div_pd (__m128d __A, __m128d __B)
 {
-  return (__m128d)__builtin_ia32_divpd ((__v2df)__A, (__v2df)__B);
+  return __A / __B;
 }
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_div_sd (__m128d __A, __m128d __B)
 {
   return (__m128d)__builtin_ia32_divsd ((__v2df)__A, (__v2df)__B);
 }
 
 extern __inline __m128d __attribute__

Re: [PATCH] Fix PR c++/60764

2014-04-09 Thread Marc Glisse

On Wed, 9 Apr 2014, Jason Merrill wrote:

Hmm, I would expect the parameter numbering for attribute nonnull and such to 
ignore the 'this' parameter.


The doc for the format attribute says clearly:

Since non-static C++ methods have an implicit this argument, the 
arguments of such methods should be counted from two, not one, when giving 
values for string-index and first-to-check.


It would be strange to count arguments differently for different 
attributes.


--
Marc Glisse


Re: Warn when returning the address of a temporary (middle-end)

2014-04-07 Thread Marc Glisse

On Mon, 7 Apr 2014, Richard Biener wrote:


One hard part is avoiding duplicate warnings. Replacing the address with 0
is a convenient way to do that, so I did it both for my new warning and for
the existing C/C++ ones. The patch breaks
gfortran.dg/warn_target_lifetime_2.f90 because it ends up warning twice. I
didn't touch that front-end because I don't know fortran, and the warning
message Pointer at .1. in pointer assignment might outlive the pointer
target doesn't seem very confident that the thing really is broken enough
to be replaced by 0. I only tested (bootstrap+regression) the default
languages, so ada/go may have a similar issue, to be handled if the approach
seems ok. (I personally wouldn't care about duplicate warnings, but I know
some people can't help complaining about it)

This doesn't actually fix PR 60517, for that I was thinking of checking for
memory reads if the first stop of walk_aliased_vdefs is a clobber (could
also check __builtin_free), though I don't know in which pass to do that
yet.


Note that this will break working programs where inlining causes
those references to no longer be returned (though maybe they
already break with the clobbers we insert).  I remember that POOMA/PETE
was full of this ... (and made it reliably work by flattening all call trees
with such calls).


I mostly see programs that are written and tested using debug mode and 
that appear to be working, but break as soon as they are compiled in 
release mode because of dangling references. Breaking them more seems 
like a helpful thing to me ;-) (especially with a warning).


But I would be happy not replacing the pointer and only emitting a 
warning, if people don't mind duplicate warnings or if we can find another 
way to avoid them.



+   if (warning_at (gimple_location (stmt), OPT_Wreturn_local_addr,
+   function returns address of local variable))


That's a front-end option, I should either move it to gcc/common.opt or 
use a different option.


--
Marc Glisse


Re: Warn when returning the address of a temporary (middle-end)

2014-04-07 Thread Marc Glisse

On Mon, 7 Apr 2014, Jeff Law wrote:


On 04/05/14 07:52, Marc Glisse wrote:

Hello,

we have front-end warnings about returning the address of a local
variable. However, quite often in C++, people don't directly return the
address of a temporary, it goes through a few functions which hide that
fact. After some inlining, the fact that we are returning the address of
a local variable can become obvious to the compiler, to the point where
I have used, for debugging purposes, grep 'return ' on the optimized
dump produced with -O3 -fkeep-inline-functions (I then had to sort
through the global/local variables).

fold_stmt looks like a good place for this, but it could go almost
anywhere. It has to happen after enough inlining / copy propagation to
make it useful though.
I was wondering if this would be better implemented as a propagation problem 
so that cases where some, but not all paths to the return statement have 
local which reaches the return.  ie


...
if (foo)
 x = local
else
 x = global

return x;

ISTM it ought to be a standard propagation problem and that the problematical 
ADDR_EXPRs are all going to be in PHI nodes.  So I think you just search for 
problematical ADDR_EXPRs in PHI nodes as your seeds, then forward propagate 
through the CFG.  Ultimately looking for any cases where those ADDR_EXPRs 
ultimately reach the return statement.


Thoughts?


I would tend to start from the return statements (assuming the return type 
is a pointer), look at the defining statement, do things if it is an 
assignment of an addr_expr, and recurse if it is a PHI. But maybe my brain 
is cabled backwards ;-)


It would be good to piggy back on a pass that already does something 
similar, if we go that way. Do you know a convenient one?


I am also afraid we may get more false positives, but maybe not.

Last, the simple version actually works well enough that it discovered at 
least one real bug in real code, and I am afraid that by refining it too 
much we'll delay and get nothing in the end (my time and my knowledge of 
the compiler are limited enough to make it a real possibility). But I 
admit that's not a good argument.


Thanks for the comments,

--
Marc Glisse


Re: Warn when returning the address of a temporary (middle-end)

2014-04-07 Thread Marc Glisse

On Mon, 7 Apr 2014, Jeff Law wrote:


I am also afraid we may get more false positives, but maybe not.
The only false positives should come from paths which are unexecutable.  One 
could argue that if we find any that we should warn, then isolate the path so 
that we get an immediate runtime trap rather than letting the address of the 
local escape through the return value.


That in turn would argue for dumping it into gimple-isolate-erroneous-paths 
;-)


I wonder if trapping may be too strong? If a function sometimes returns a 
pointer to a local variable but the caller always ignores the return value 
in those cases, we can warn, but maybe we don't want to introduce a trap? 
Replacing the address with a null pointer seemed like a compromise, it 
will trap when you try to read it, but not if you ignore it. But if you 
think we can trap, ok.


--
Marc Glisse


Warn when returning the address of a temporary (middle-end)

2014-04-05 Thread Marc Glisse

Hello,

we have front-end warnings about returning the address of a local 
variable. However, quite often in C++, people don't directly return the 
address of a temporary, it goes through a few functions which hide that 
fact. After some inlining, the fact that we are returning the address of a 
local variable can become obvious to the compiler, to the point where I 
have used, for debugging purposes, grep 'return ' on the optimized dump 
produced with -O3 -fkeep-inline-functions (I then had to sort through the 
global/local variables).


fold_stmt looks like a good place for this, but it could go almost 
anywhere. It has to happen after enough inlining / copy propagation to 
make it useful though.


One hard part is avoiding duplicate warnings. Replacing the address with 0 
is a convenient way to do that, so I did it both for my new warning and 
for the existing C/C++ ones. The patch breaks 
gfortran.dg/warn_target_lifetime_2.f90 because it ends up warning twice. I 
didn't touch that front-end because I don't know fortran, and the warning 
message Pointer at .1. in pointer assignment might outlive the pointer 
target doesn't seem very confident that the thing really is broken enough 
to be replaced by 0. I only tested (bootstrap+regression) the default 
languages, so ada/go may have a similar issue, to be handled if the 
approach seems ok. (I personally wouldn't care about duplicate warnings, 
but I know some people can't help complaining about it)


This doesn't actually fix PR 60517, for that I was thinking of checking 
for memory reads if the first stop of walk_aliased_vdefs is a clobber 
(could also check __builtin_free), though I don't know in which pass to do 
that yet.


2014-04-05  Marc Glisse  marc.gli...@inria.fr

PR c++/60517
gcc/c/
* c-typeck.c (c_finish_return): Return 0 instead of the address of
a local variable.
gcc/cp/
* typeck.c (check_return_expr): Likewise.
(maybe_warn_about_returning_address_of_local): Tell the caller if
we warned.
gcc/
* gimple-fold.c (fold_stmt_1): Warn if returning the address of a
local variable.
gcc/testsuite/
* c-c++-common/addrtmp.c: New testcase.
* c-c++-common/uninit-G.c: Adjust.

--
Marc GlisseIndex: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c(revision 209157)
+++ gcc/c/c-typeck.c(working copy)
@@ -9254,23 +9254,25 @@ c_finish_return (location_t loc, tree re
  inner = TREE_OPERAND (inner, 0);
 
  while (REFERENCE_CLASS_P (inner)
  TREE_CODE (inner) != INDIRECT_REF)
inner = TREE_OPERAND (inner, 0);
 
  if (DECL_P (inner)
   !DECL_EXTERNAL (inner)
   !TREE_STATIC (inner)
   DECL_CONTEXT (inner) == current_function_decl)
-   warning_at (loc,
-   OPT_Wreturn_local_addr, function returns address 
-   of local variable);
+   {
+ warning_at (loc, OPT_Wreturn_local_addr,
+ function returns address of local variable);
+ t = build_zero_cst (TREE_TYPE (res));
+   }
  break;
 
default:
  break;
}
 
  break;
}
 
   retval = build2 (MODIFY_EXPR, TREE_TYPE (res), res, t);
Index: gcc/cp/typeck.c
===
--- gcc/cp/typeck.c (revision 209157)
+++ gcc/cp/typeck.c (working copy)
@@ -49,21 +49,21 @@ static tree convert_for_assignment (tree
 static tree cp_pointer_int_sum (enum tree_code, tree, tree, tsubst_flags_t);
 static tree rationalize_conditional_expr (enum tree_code, tree, 
  tsubst_flags_t);
 static int comp_ptr_ttypes_real (tree, tree, int);
 static bool comp_except_types (tree, tree, bool);
 static bool comp_array_types (const_tree, const_tree, bool);
 static tree pointer_diff (tree, tree, tree, tsubst_flags_t);
 static tree get_delta_difference (tree, tree, bool, bool, tsubst_flags_t);
 static void casts_away_constness_r (tree *, tree *, tsubst_flags_t);
 static bool casts_away_constness (tree, tree, tsubst_flags_t);
-static void maybe_warn_about_returning_address_of_local (tree);
+static bool maybe_warn_about_returning_address_of_local (tree);
 static tree lookup_destructor (tree, tree, tree, tsubst_flags_t);
 static void warn_args_num (location_t, tree, bool);
 static int convert_arguments (tree, vectree, va_gc **, tree, int,
   tsubst_flags_t);
 
 /* Do `exp = require_complete_type (exp);' to make sure exp
does not have an incomplete type.  (That includes void types.)
Returns error_mark_node if the VALUE does not have
complete type when this function returns.  */
 
@@ -8253,79 +8253,81 @@ convert_for_initialization (tree exp, tr
 return rhs

Re: Fix ipa-devirt ICE

2014-04-03 Thread Marc Glisse

On Thu, 3 Apr 2014, Jan Hubicka wrote:


+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type 
inconsistent
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type 
inconsistent
+   OTR_TOKEN == INT_MAX is used to mark calls that are provably


Did you mean provably instead of probably in the first two?

--
Marc Glisse


Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Marc Glisse

On Wed, 2 Apr 2014, Thomas Preud'homme wrote:

Note that as it stands the patch does not work for arrays indexed with 
variable (such a tab[a] || (tab[a+1]  8)) because fold_const does not 
fold (a + 1) - a.


Uh? It does fold a+1-a for me. What it doesn't do is look through the 
definition of b in b-a. Richard+GSoC will supposedly soon provide a 
function that does that.


--
Marc Glisse


Re: [PATCH] Fix undefined behavior in IRA

2014-03-26 Thread Marc Glisse

On Tue, 25 Mar 2014, Marek Polacek wrote:


This is a temporary fix for UB in IRA, where ubsan complains because
there's signed iteger overflow in the multiplication.  To shut this
error up, we can perform the multiplication in unsigned and only then
cast the result of the multiplication to int.


Naive question: why do you want to shut the error up? If modular 
arithmetic makes sense for costs (sounds doubtful), they should use an 
unsigned type to begin with. Otherwise, this is making it harder to notice 
a bug (doesn't sound like an improvement).


If you want to cast, doesn't something like long long (HOST_WIDEST_INT?) 
make more sense than unsigned?


--
Marc Glisse


Re: [PATCH] Add test for PR c++/53711

2014-03-24 Thread Marc Glisse

On Sun, 23 Mar 2014, Patrick Palka wrote:


On Fri, Mar 14, 2014 at 11:20 AM, Jason Merrill ja...@redhat.com wrote:

Applied, thanks.


Thanks!  The commit seems to be missing the corresponding ChangeLog
entry though.


ChangeLog is kind of optional in the testsuite. Please keep providing 
entries, but don't be surprised when some go missing.


--
Marc Glisse


Re: calloc = malloc + memset

2014-03-23 Thread Marc Glisse

On Mon, 3 Mar 2014, Richard Biener wrote:


That's a bit much of ad-hoc pattern-matching ... wouldn't be
p = malloc (n);
memset (p, 0, n);

transform better suited to the strlen opt pass?  After all that tracks
what 'string' is associated with a SSA name pointer through
arbitrary satements using a lattice.


Like this? I had to move the strlen pass after the loop passes (and
after dom or everything was too dirty) but long enough before the end
(some optimizations are necessary after strlen). As a bonus, one more
strlen is optimized in the current testcases :-)

Running the pass twice would be another option I guess (it would require
implementing the clone method), but without a testcase showing it is
needed...

Passes bootstrap+testsuite on x86_64-linux-gnu.

2014-03-23  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/57742
gcc/
* tree-ssa-strlen.c (get_string_length): Ignore malloc.
(handle_builtin_malloc, handle_builtin_memset): New functions.
(strlen_optimize_stmt): Call them.
* passes.def: Move strlen after loop+dom.
gcc/testsuite/
* g++.dg/tree-ssa/calloc.C: New testcase.
* gcc.dg/tree-ssa/calloc-1.c: Likewise.
* gcc.dg/tree-ssa/calloc-2.c: Likewise.
* gcc.dg/strlenopt-9.c: Adapt.

--
Marc GlisseIndex: gcc/passes.def
===
--- gcc/passes.def  (revision 208772)
+++ gcc/passes.def  (working copy)
@@ -176,21 +176,20 @@ along with GCC; see the file COPYING3.
 DOM and erroneous path isolation should be due to degenerate PHI nodes.
 So rather than run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
-  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_ccp);
   /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_cse_sincos);
   NEXT_PASS (pass_optimize_bswap);
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
   NEXT_PASS (pass_asan);
@@ -235,20 +234,21 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_dominator);
   /* The only const/copy propagation opportunities left after
 DOM should be due to degenerate PHI nodes.  So rather than
 run the full propagators, run a specialized pass which
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
+  NEXT_PASS (pass_strlen);
   NEXT_PASS (pass_vrp);
   NEXT_PASS (pass_cd_dce);
   NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
   NEXT_PASS (pass_fold_builtins);
   NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tail_calls);
   NEXT_PASS (pass_rename_ssa_copies);
Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C
===
--- gcc/testsuite/g++.dg/tree-ssa/calloc.C  (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/calloc.C  (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile { target c++11 } } */
+/* { dg-options -O3 -fdump-tree-optimized } */
+
+#include new
+#include vector
+#include cstdlib
+
+void g(void*);
+inline void* operator new(std::size_t sz)
+{
+  void *p;
+
+  if (sz == 0)
+sz = 1;
+
+  // Slightly modified from the libsupc++ version, that one has 2 calls
+  // to malloc which makes it too hard to optimize.
+  while ((p = std::malloc (sz)) == 0)
+{
+  std::new_handler handler = std::get_new_handler ();
+  if (! handler)
+throw std::bad_alloc();
+  handler ();
+}
+  return p;
+}
+
+void f(void*p,int n){
+  new(p)std::vectorint(n);
+}
+
+/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */
+/* { dg-final { scan-tree-dump-not malloc optimized } } */
+/* { dg-final { scan-tree-dump-not memset optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */

Property changes on: gcc/testsuite/g++.dg/tree-ssa/calloc.C
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/testsuite/gcc.dg/strlenopt-9.c
===
--- gcc/testsuite/gcc.dg/strlenopt-9.c  (revision 208772)
+++ gcc/testsuite/gcc.dg/strlenopt-9.c  (working copy)
@@ -11,21 +11,21 @@ fn1 (int r

<    5   6   7   8   9   10   11   12   13   14   >