[Bug c++/45605] Missed devirtualization
--- Comment #21 from hubicka at gcc dot gnu dot org 2010-09-20 15:53 --- OK, we now fold the testcase using obj_type_ref folding. We still should do it via vtable lookup and we don't but that is for other PR I guess. -- hubicka at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #20 from hubicka at gcc dot gnu dot org 2010-09-20 15:48 --- Subject: Bug 45605 Author: hubicka Date: Mon Sep 20 15:48:42 2010 New Revision: 164438 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164438 Log: PR tree-optimize/45605 * cgraph.h (const_value_known_p): Declare. (varpool_decide_const_value_known): Remove. * tree-ssa-ccp.c (get_base_constructor): Use it. * lto-cgraph.c (compute_ltrans_boundary): Likewise. * expr.c (string_constant): Likewise. * tree-ssa-loop-ivcanon.c (constant_after_peeling): Likewise. * ipa.c (ipa_discover_readonly_nonaddressable_var, function_and_variable_visibility): Likewise. * gimplify.c (gimplify_call_expr): Likewise. * gimple-fold.c (get_symbol_constant_value): Likewise. * varpool.c (varpool_decide_const_value_known): Replace by... (const_value_known_p): ... this one; handle other kinds of DECLs too and work for automatic vars. (varpool_finalize_decl): Use const_value_known_p. * lto.c (lto_promote_cross_file_statics): Use const_value_known_p. * g++.dg/tree-ssa/pr45605.C: New testcase. Added: trunk/gcc/testsuite/g++.dg/tree-ssa/pr45605.C Modified: trunk/gcc/ChangeLog trunk/gcc/cgraph.h trunk/gcc/expr.c trunk/gcc/gimple-fold.c trunk/gcc/gimplify.c trunk/gcc/ipa.c trunk/gcc/lto-cgraph.c trunk/gcc/lto/ChangeLog trunk/gcc/lto/lto.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-ccp.c trunk/gcc/tree-ssa-loop-ivcanon.c trunk/gcc/varpool.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #19 from hubicka at gcc dot gnu dot org 2010-09-18 21:25 --- Subject: Bug 45605 Author: hubicka Date: Sat Sep 18 21:25:29 2010 New Revision: 164402 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164402 Log: PR tree-optimization/45605 * cgraphunit.c (cgraph_analyze_functions): Allocate bitmap obstack. * gimple-fold.c (static_object_in_other_unit_p): New function. (canonicalize_constructor_val): Use it. (get_symbol_constant_value): Be reaqdy for canonicalize_constructor_val returning NULL. (gimple_fold_obj_type_ref_known_binfo): Use static_object_in_other_unit_p. Modified: trunk/gcc/ChangeLog trunk/gcc/cgraphunit.c trunk/gcc/gimple-fold.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #18 from jamborm at gcc dot gnu dot org 2010-09-17 14:46 --- Honza submitted a patch (http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01369.html) so I guess it is his PR now :-) -- jamborm at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|jamborm at gcc dot gnu dot |hubicka at gcc dot gnu dot |org |org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #17 from rguenther at suse dot de 2010-09-16 16:06 --- Subject: Re: Missed devirtualization On Thu, 16 Sep 2010, jamborm at gcc dot gnu dot org wrote: > > > --- Comment #16 from jamborm at gcc dot gnu dot org 2010-09-16 16:00 > --- > (In reply to comment #15) > > Like > > > > Index: gimplify.c > > === > > --- gimplify.c (revision 164333) > > +++ gimplify.c (working copy) > > @@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple > > gimplify_modify_expr. */ > >if (!want_value) > > { > > + gimple_stmt_iterator gsi; > >/* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we > > have to do is replicate it as a GIMPLE_CALL tuple. */ > >call = gimple_build_call_from_tree (*expr_p); > >gimplify_seq_add_stmt (pre_p, call); > > + gsi = gsi_last (*pre_p); > > + fold_stmt (&gsi); > >*expr_p = NULL_TREE; > > } > > > > Will this also work also for GIMPLE_CALLs with a LHS or do I have to > add something like the above also elsewhere? Yes. > > > > but gimple_fold_obj_type_ref_known_binfo returns NULL. > > > > cgraph_function_flags_ready needs to be added to the conjunction > (!node->analyzed && !node->in_other_partition) and then it is folded. > I'll prepare a patch tomorrow. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #16 from jamborm at gcc dot gnu dot org 2010-09-16 16:00 --- (In reply to comment #15) > Like > > Index: gimplify.c > === > --- gimplify.c (revision 164333) > +++ gimplify.c (working copy) > @@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple > gimplify_modify_expr. */ >if (!want_value) > { > + gimple_stmt_iterator gsi; >/* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we > have to do is replicate it as a GIMPLE_CALL tuple. */ >call = gimple_build_call_from_tree (*expr_p); >gimplify_seq_add_stmt (pre_p, call); > + gsi = gsi_last (*pre_p); > + fold_stmt (&gsi); >*expr_p = NULL_TREE; > } > Will this also work also for GIMPLE_CALLs with a LHS or do I have to add something like the above also elsewhere? > > but gimple_fold_obj_type_ref_known_binfo returns NULL. > cgraph_function_flags_ready needs to be added to the conjunction (!node->analyzed && !node->in_other_partition) and then it is folded. I'll prepare a patch tomorrow. Thanks, Martin -- jamborm at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |jamborm at gcc dot gnu dot |dot org |org Status|NEW |ASSIGNED Last reconfirmed|2010-09-16 12:25:30 |2010-09-16 16:00:08 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #15 from rguenth at gcc dot gnu dot org 2010-09-16 14:55 --- Like Index: gimplify.c === --- gimplify.c (revision 164333) +++ gimplify.c (working copy) @@ -2477,10 +2477,13 @@ gimplify_call_expr (tree *expr_p, gimple gimplify_modify_expr. */ if (!want_value) { + gimple_stmt_iterator gsi; /* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we have to do is replicate it as a GIMPLE_CALL tuple. */ call = gimple_build_call_from_tree (*expr_p); gimplify_seq_add_stmt (pre_p, call); + gsi = gsi_last (*pre_p); + fold_stmt (&gsi); *expr_p = NULL_TREE; } but gimple_fold_obj_type_ref_known_binfo returns NULL. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #14 from rguenther at suse dot de 2010-09-16 12:51 --- Subject: Re: Missed devirtualization On Thu, 16 Sep 2010, hubicka at ucw dot cz wrote: > --- Comment #13 from hubicka at ucw dot cz 2010-09-16 12:48 --- > Subject: Re: Missed devirtualization > > > I'm lost in this PR - for what testcase what statement needs folding > > (and what pending patches do I need to apply to see that)? > PR is tracking missed optimization in the testcase in comment 0. > > There are two issues > > > > 1) OBJ_TYPE_REF folding should handle it. For that we seem to need to > evern call fold on > OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); > this you can see on mainline > > 2) generic folding should work it out the hard way. I.e. for: > MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; > d.D.2108._vptr.B = &_ZTV1D[2]; > D.2210_2 = _ZTV1D[2]; > OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); > there is nothing that prevents us to resolve _ZTV1D[2] into pointer to Run (by > looking into initializer of vtable variable) and then take OBJ_TYPE_REF away > since it is pointless when first operand is known function. > > With patch http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html we closer > in > a way that .vpr1 dump has: > MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; > d.D.2078._vptr.B = &_ZTV1D[2]; > D.2179_1 = &_ZTV1D[2]; > D.2180_2 = (int (*__vtbl_ptr_type) (void)) Run; > OBJ_TYPE_REF(D.2180_2;&d.D.2078->0) (&d.D.2078); > > Somewhere I have patch that adds OBJ_TYPE_REF folding into CCP (so when first > argument is function pointer, we just fold into direct call). I will update it > and submit after http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html > is resolved. Then still there is problem that resolution comes late > since we need FRE to fold > d.D.2078._vptr.B = &_ZTV1D[2]; > D.2179_1 = d.D.2078._vptr.B > and FRE is not run early (and we should devirutalize everything early > to get inlining) > > > > 1) is priority IMO (at moment we make amazingly little devirtualization at > Mozilla, about 20 calls). > 2) is just side effect of my attempt to get folding working that I run into > while looking into kernel poor man C vtables (and ours targhooks). 1) should be fixed by using fold_stmt in gimplify_call instead of just fold_call_expr. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #13 from hubicka at ucw dot cz 2010-09-16 12:48 --- Subject: Re: Missed devirtualization > I'm lost in this PR - for what testcase what statement needs folding > (and what pending patches do I need to apply to see that)? PR is tracking missed optimization in the testcase in comment 0. There are two issues 1) OBJ_TYPE_REF folding should handle it. For that we seem to need to evern call fold on OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); this you can see on mainline 2) generic folding should work it out the hard way. I.e. for: MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; d.D.2108._vptr.B = &_ZTV1D[2]; D.2210_2 = _ZTV1D[2]; OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); there is nothing that prevents us to resolve _ZTV1D[2] into pointer to Run (by looking into initializer of vtable variable) and then take OBJ_TYPE_REF away since it is pointless when first operand is known function. With patch http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html we closer in a way that .vpr1 dump has: MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; d.D.2078._vptr.B = &_ZTV1D[2]; D.2179_1 = &_ZTV1D[2]; D.2180_2 = (int (*__vtbl_ptr_type) (void)) Run; OBJ_TYPE_REF(D.2180_2;&d.D.2078->0) (&d.D.2078); Somewhere I have patch that adds OBJ_TYPE_REF folding into CCP (so when first argument is function pointer, we just fold into direct call). I will update it and submit after http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01190.html is resolved. Then still there is problem that resolution comes late since we need FRE to fold d.D.2078._vptr.B = &_ZTV1D[2]; D.2179_1 = d.D.2078._vptr.B and FRE is not run early (and we should devirutalize everything early to get inlining) 1) is priority IMO (at moment we make amazingly little devirtualization at Mozilla, about 20 calls). 2) is just side effect of my attempt to get folding working that I run into while looking into kernel poor man C vtables (and ours targhooks). Honza -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #12 from rguenther at suse dot de 2010-09-16 12:31 --- Subject: Re: Missed devirtualization On Thu, 16 Sep 2010, hubicka at gcc dot gnu dot org wrote: > --- Comment #11 from hubicka at gcc dot gnu dot org 2010-09-16 12:25 > --- > Hmm, so do you have any idea where folding should be added for this particular > case? > > It always seemed to me that it would make sense to add verifier that all > statements are folded (locally, not by looking at SSA graph) at optimization > pass boundaries. Tried it in the past and found a lot of issues that got > fixed, > but we never got into agreeing on any policy here. > > Missing optimizations just because we are stupid enough to forget call fold > when updating something seems bad IMO. I'm lost in this PR - for what testcase what statement needs folding (and what pending patches do I need to apply to see that)? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #11 from hubicka at gcc dot gnu dot org 2010-09-16 12:25 --- Hmm, so do you have any idea where folding should be added for this particular case? It always seemed to me that it would make sense to add verifier that all statements are folded (locally, not by looking at SSA graph) at optimization pass boundaries. Tried it in the past and found a lot of issues that got fixed, but we never got into agreeing on any policy here. Missing optimizations just because we are stupid enough to forget call fold when updating something seems bad IMO. -- hubicka at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2010-09-16 12:25:30 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #10 from rguenther at suse dot de 2010-09-16 08:50 --- Subject: Re: Missed devirtualization On Wed, 15 Sep 2010, hubicka at ucw dot cz wrote: > --- Comment #9 from hubicka at ucw dot cz 2010-09-15 22:39 --- > Subject: Re: Missed devirtualization > > > We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop, > > dom or by inlining). > Well, since fold_stmt is stornger than what fe does, I guess we should fold > each stmt at least once. No we shouldn't. We do fold some stmts during gimplification. Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #9 from hubicka at ucw dot cz 2010-09-15 22:39 --- Subject: Re: Missed devirtualization > We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop, > dom or by inlining). Well, since fold_stmt is stornger than what fe does, I guess we should fold each stmt at least once. Honza -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #8 from rguenther at suse dot de 2010-09-15 19:09 --- Subject: Re: Missed devirtualization On Wed, 15 Sep 2010, jamborm at gcc dot gnu dot org wrote: > --- Comment #7 from jamborm at gcc dot gnu dot org 2010-09-15 18:42 > --- > Well, it turns out that fold_stmt_1 is never called on that statement > (neither with -O -finline or -O2 or -O3). Where is it supposed to be > called from? We fold a stmt only if it is propagated to (by ccp, copyprop, forwprop, dom or by inlining). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #7 from jamborm at gcc dot gnu dot org 2010-09-15 18:42 --- Well, it turns out that fold_stmt_1 is never called on that statement (neither with -O -finline or -O2 or -O3). Where is it supposed to be called from? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #6 from hubicka at gcc dot gnu dot org 2010-09-10 09:42 --- Subject: Bug 45605 Author: hubicka Date: Fri Sep 10 09:42:20 2010 New Revision: 164148 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164148 Log: PR tree-optimization/45605 * cp/class.c (build_vtbl_initializer): Avoid wrong type conversion in ADDR_EXPR Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/class.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/inherit/covariant7.C -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #5 from hubicka at gcc dot gnu dot org 2010-09-09 11:36 --- Created an attachment (id=21750) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21750&action=view) WIP patch. Still misses some of Richi's earlier comments. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #4 from hubicka at gcc dot gnu dot org 2010-09-09 11:33 --- testing the following fix: Index: class.c === --- class.c (revision 163947) +++ class.c (working copy) @@ -7797,7 +7797,7 @@ build_vtbl_initializer (tree binfo, { fn = abort_fndecl; if (abort_fndecl_addr == NULL) - abort_fndecl_addr = build1 (ADDR_EXPR, vfunc_ptr_type_node, fn); + fold_convert (vfunc_ptr_type_node, build_fold_addr_expr (fn)); init = abort_fndecl_addr; } else @@ -7810,7 +7810,7 @@ build_vtbl_initializer (tree binfo, } /* Take the address of the function, considering it to be of an appropriate generic type. */ - init = build1 (ADDR_EXPR, vfunc_ptr_type_node, fn); + init = fold_convert (vfunc_ptr_type_node, build_fold_addr_expr (fn)); } this solves the ICE, but won't get call devirtualized. We end up with: MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; d.D.2108._vptr.B = &_ZTV1D[2]; D.2210_2 = (int (*__vtbl_ptr_type) (void)) Run; OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); It seems that forw-prop should be told that OBJ_TYPE_REF does not care about nops on the operand and also fold_ccp can be told to call fold_obj_type_ref on the substituted constant. Martin, can you take a look, please? I am attaching current WIP version of my constructor folding that makes D.2210_2 = (int (*__vtbl_ptr_type) (void)) Run; to happen. Honza -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #3 from hubicka at gcc dot gnu dot org 2010-09-09 11:02 --- Hmm, is it? C equivalent IMO is: int a(void); typedef void (*ptr) (void); static const ptr array[1]={(ptr)a}; ptr t; main() { t=array[0]; } Here we have ctor represented as follows: QI size unit size align 8 symtab 0 alias set -1 canonical type 0x77eee690 arg-types > pointer_to_this > unsigned DI size unit size align 64 symtab 0 alias set -1 canonical type 0x76b383f0 pointer_to_this > constant arg 0 unsigned DI size unit size align 64 symtab 0 alias set -1 canonical type 0x76b38bd0> constant arg 0 addressable used public external decl_5 QI file t.c line 1 col 5 align 8 chain > t.c:5:33> t.c:5:28> this gets into canonicalize_constructor_val and it does STRIP_NOPS so we return addr_expr with different type than array_ref from fold_const_aggregate_ref (this code is copied from old implementation, the STRIP_NOPS there seemed bid odd to me originally) and this lands in ccp_fold that is happy and CCP re-inserts the nop later. I guess C++ FE is wrong to produce type mismatch in addr_expr? That should be easy to fix. I wonder about other users of fold_const_aggregate_ref. Shall they also work on re-inserting the conversions to avoid possible type mismatch? Obviously more worms... Honza -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #2 from rguenther at suse dot de 2010-09-09 08:43 --- Subject: Re: New: Missed devirtualization On Thu, 9 Sep 2010, hubicka at gcc dot gnu dot org wrote: > The following testcase taken from testsuite (and many others): > // PR 14535 > // { dg-do run } > // { dg-options "-O -finline" } > // > // Original test case failure required that Raiser constructor be inlined. > > extern "C" void abort(); > bool destructor_called = false; > > struct B { > virtual void Run(){}; > }; > > struct D : public B { > virtual void Run() > { > struct O { > ~O() { destructor_called = true; }; > } o; > > struct Raiser { > Raiser() throw( int ) {throw 1;}; > } raiser; > }; > }; > > int main() { > try { > D d; > static_cast(d).Run(); > } catch (...) {} > > if (!destructor_called) > abort (); > } > > leads to following in .optimized dump: > : > MEM[(struct B *)&d]._vptr.B = &_ZTV1B[2]; > d.D.2108._vptr.B = &_ZTV1D[2]; > D.2210_2 = _ZTV1D[2]; > OBJ_TYPE_REF(D.2210_2;&d.D.2108->0) (&d.D.2108); > > Obviously OBJ_TYPE_REF can be optimized here since we know the > virtual table. Why it is missed? > > I run into it because my new folding code is actually smart enough to lead to: > int (*__vtbl_ptr_type) (void) > > void D:: (struct D *) > > D.2210_2 = Run; > > The error comes because we pick from initializer of VTBL the following > ADDR_EXPR: > type type int> > type_6 QI > size > unit size > align 8 symtab 0 alias set -1 canonical type 0x76b7da80 > pointer_to_this > > unsigned type_6 DI > size > unit size > align 64 symtab 0 alias set -1 canonical type 0x76b7db28 > pointer_to_this > > constant > arg 0 type > type_6 QI size unit size > > align 8 symtab 0 alias set -1 canonical type 0x76b9b9d8 method > basetype > arg-types 0x76b9ba80> > chain 0x77edbe70 > void>>> > pointer_to_this > > addressable volatile asm_written used public static weak autoinline > virtual decl_5 QI file /home/jh/a.C line 15 col 18 align 16 context > initial > arguments 0x76b9bb28> > readonly unsigned DI file /home/jh/a.C line 15 col 22 size > unit size > align 64 context > (reg/f:DI 62 [ this ]) arg-type > incoming-rtl (reg:DI 5 di [ this ])> > result 0x77edbe70 void> > ignored VOID file /home/jh/a.C line 16 col 7 > align 8 context > > full-name "virtual void D::Run()" > pending-inline-info 0x76ba42a0 > (mem:QI (symbol_ref/i:DI ("_ZN1D3RunEv") [flags 0x1] 0x76b8e500 Run>) [0 S1 A8])>> > > Note the differece in between type of address and the method RUN. This is > what > makes verifier unhappy. How to fix this? Should FE put there some NOP_EXPR > somewhere? Still devirtualizing should realize this case a lot earlier. Well - it's easy. Whoever does substitution needs to verify it can do so. Which of course is tricky with the stupid way we do substitute_and_fold ... Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605
[Bug c++/45605] Missed devirtualization
--- Comment #1 from pinskia at gcc dot gnu dot org 2010-09-09 01:17 --- I think this is the same issue as PR 19816. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45605