[Bug tree-optimization/45791] Missed devirtualization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 Martin Jambor changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #18 from Martin Jambor --- The release_ssa dump of function main in the testcase from bug description is just "return 0;" The release_ssa dump of function main the testcase from comment #15 is: int main () { [local count: 1073741824]: Sub::bar (&foo); Sub::bar (&foo); return 0; } ...so devirtualized in my book (main is cold, so they are not inlined, of course). The release_ssa dump of function foo in the testcase from comment #5 is: void foo () { void * _3; [local count: 1073741824]: _3 = operator new (8); MEM[(struct X *)_3]._vptr.X = &MEM [(void *)&_ZTV1X + 16B]; X::~X (_3); return; } ...so also devirtualzied before IPA. Therefore I am going to close this old bug as done. If anyone finds a different testcase where the early optimizers should do a better job to help devirtualziation, please file a new bug.
[Bug tree-optimization/45791] Missed devirtualization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #17 from Eric Gallager --- This is the last bug still open blocking bug 45375
[Bug tree-optimization/45791] Missed devirtualization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #16 from Jan Hubicka --- With new polymorphic call code we could handle the testcase from comment 15 by simple propagation of contextes down during early optimization. This may be easy enough to be worth to implement. Still general solution is to add return functions to ipa-prop.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 Matthijs Kooijman changed: What|Removed |Added CC||matthijs at stdin dot nl --- Comment #15 from Matthijs Kooijman --- I ran into another variant of this problem, which I reduced to the following testcase. I found the problem on 4.8.2, but it is already fixed in trunk / gcc-4.9 (Debian 4.9-20140218-1). Still, it might be useful to have the testcase here for reference. class Base { }; class Sub : public Base { public: virtual void bar(); }; Sub foo; Sub * const pointer = &foo; Sub* function() { return &foo; }; int main() { // Gcc 4.8.2 devirtualizes this: pointer->bar(); // but not this: function()->bar(); }
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #14 from Martin Jambor 2010-12-15 16:07:31 UTC --- (In reply to comment #9) > OK, main() code seems to optimize out that is an imrovement. Is it optimized > away with your patch pre-IPA too? Yes. Just before IPA, in fact. > > Derived() is also devirtualizable: > It could be done intraprocedurally if, unlike for automatically allocated decls, we considered all calls as potentially changing the dynamic type in unknown ways while doing the dynamic type change detection. This then however becomes essentially the same thing as devirtualization based on the constant folding, perhaps only more complicated. It could be optimized by IPA-CP if we can confirm that all callers are automatically allocated (or that we generally have them under control). This should be relatively easy, I will have a look at that. Mainly because we may be able to widen scope of objects under our control later.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #13 from Jan Hubicka 2010-12-15 00:12:54 UTC --- ... but obviously the problem still is that we don't devirtualize this early enough for inlining. The low level code should be able to do so if FRE+CCP was added as early pass or FRE was also extended to fold the OBJ_TYEP_REF calls away and added as early pass. Now when Richi fixed it, it should even be possible. However, can't this be also handled by type based code?
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #12 from Jan Hubicka 2010-12-15 00:10:45 UTC --- No, this is different, since OBJ_TYPE_REF is sitting here and it imply type conversion in the way we implement it right now. There is no type mismatch in between the original address and the call argument, just the intermediate pointer is of wrong type. VRP should do the same as CCP, see the code I added into CCP.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #11 from Martin Jambor 2010-12-14 23:35:29 UTC --- I believe Richi's comment #14 in PR 46076 applies here as well.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #10 from Jan Hubicka 2010-12-14 23:17:13 UTC --- Eh, wanted to paste: D.2236_10 = (int (*__vtbl_ptr_type) (void)) f; OBJ_TYPE_REF(D.2236_10;D.2215_3->2) (D.2215_3); I told ccp should IMO optimize it, but doesn't. I guess it is because VRP1 works out the constant but then it doesn't know about OBJ_TYPE_REF folding.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #9 from Jan Hubicka 2010-12-14 23:15:16 UTC --- OK, main() code seems to optimize out that is an imrovement. Is it optimized away with your patch pre-IPA too? Derived() is also devirtualizable: Derived::Derived() (struct Derived * const this) { int (*__vtbl_ptr_type) (void) * D.2237; int (*__vtbl_ptr_type) (void) D.2236; struct Base * D.2215; : D.2215_2 = &this_1(D)->D.2115; D.2215_2->_vptr.Base = &_ZTV4Base[2]; this_1(D)->D.2115._vptr.Base = &_ZTV7Derived[2]; D.2215_3 = &this_1(D)->D.2115; D.2237_9 = D.2215_3->_vptr.Base; D.2236_10 = MEM[(int (*__vtbl_ptr_type) (void) *)D.2237_9 + 16B]; OBJ_TYPE_REF(D.2236_10;D.2215_3->2) (D.2215_3); We fail to devirtualize pre-IPA. After IPA we compile into: : MEM[(struct Base *)this_1(D)]._vptr.Base = &_ZTV4Base[2]; this_1(D)->D.2115._vptr.Base = &_ZTV7Derived[2]; D.2215_3 = &this_1(D)->D.2115; D.2236_10 = (int (*__vtbl_ptr_type) (void)) f; a type mismatch?
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 Martin Jambor changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2010.12.14 17:45:52 Ever Confirmed|0 |1 --- Comment #8 from Martin Jambor 2010-12-14 17:45:52 UTC --- I've just confirmed that main of the testcase from the initial bug description is optimized to nothing even by just the early optimizers on trunk. My dynamic-type change detection patches postpone that a little bit, unfortunately (and inevitably) but the final result is the same. I believe we have testcases already for this. As far as the testcase from comment #5 is concerned, that is quite another matter because the object is dynamically allocated there. If the constructor is inlined, we may do this with improved folding of O_T_R according to its first parameter. If it is not, we would need to be able to track the object interprocedurally to verify nothing bad happens to it (like a call to a destructor followed by a call to placement new). And of course we would have to solve the "operator new is not malloc" problem.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #7 from Martin Jambor 2010-10-11 18:40:42 UTC --- And by the way, I'm afraid we really need to somehow address PR 45934 before we attempt to fold more O_T_Rs coming from constructors/destructors.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 Martin Jambor changed: What|Removed |Added CC|mjambor at suse dot cz |jamborm at gcc dot gnu.org --- Comment #6 from Martin Jambor 2010-10-11 17:15:02 UTC --- Hi, (In reply to comment #3) > Hmm, > normally we should see it from COMPONENT_REF: I don't understand the sentence above. > while (true) > { > if (TREE_CODE (ref) == COMPONENT_REF) > { > tree par_type; > tree binfo, base_binfo; > tree field = TREE_OPERAND (ref, 1); > > if (!DECL_ARTIFICIAL (field)) > { > tree type = TREE_TYPE (field); > if (TREE_CODE (type) == RECORD_TYPE) > return TYPE_BINFO (type); > else > return NULL_TREE; > } > but we don't since it has DECL_ARTIFICIAL set. What is the logic here? > Also what about i.e. ARRAY_REF and arrays of objects and COMPONENT_REFs > translated to MEM_REFs? See testcase g++.dg/ipa/ivinline-5.C. We have to differentiate between fields which represent ancestors and field which are really put there by the user. MEM_REFs would indeed pose a problem here, we would have to find the field by (a simpler version of) something like build_user_friendly_ref_for_offset in tree-sra.c. ARRAY_REFs cannot represent ancestors and so are not a problem. But you wrote the field is artificial so the code above should not be an issue. In fact, the code specifically does make sure it does not traverse BINFOs of the first ancestors with virtual methods because they do not have their own list of virtual methods (http://gcc.gnu.org/ml/gcc-patches/2010-04/msg01458.html).
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #5 from Jan Hubicka 2010-09-25 22:17:41 UTC --- Another testcase where we devirtualize via folding is: // { dg-do assemble } // { dg-options "-g -O2" } // Copyright (C) 1999 Free Software Foundation, Inc. // Contributed by Nathan Sidwell 21 Nov 1999 // This causes assember relocation errors struct X { virtual ~X () {} }; struct Y { Y (){}; }; void foo () { X *x = new X; x->~X (); Y ys[2]; } compiled with -O2 we get x_3 = operator new (8); # DEBUG this => x_3 x_3->_vptr.X = &_ZTV1X[2]; # DEBUG x => x_3 D.2142_7 = (int (*__vtbl_ptr_type) (void)) __comp_dtor ; OBJ_TYPE_REF(D.2142_7;x_3->0) (x_3); that gets folded only in ccp3. We need FRE to fold: x_3->_vptr.X = &_ZTV1X[2]; # DEBUG x => x_3 D.2141_6 = &_ZTV1X[2]; D.2142_7 = *D.2141_6; OBJ_TYPE_REF(D.2142_7;x_3->0) (x_3); into x_3 = operator new (8); # DEBUG this => x_3 x_3->_vptr.X = &_ZTV1X[2]; # DEBUG x => x_3 D.2141_6 = x_3->_vptr.X; D.2142_7 = *D.2141_6; OBJ_TYPE_REF(D.2142_7;x_3->0) (x_3);
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #4 from Jan Hubicka 2010-09-25 22:12:39 UTC --- Note that the patch attached solves one indirect call in the testcase but has no effect on mozilla.
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 --- Comment #3 from Jan Hubicka 2010-09-25 22:10:15 UTC --- Hmm, normally we should see it from COMPONENT_REF: while (true) { if (TREE_CODE (ref) == COMPONENT_REF) { tree par_type; tree binfo, base_binfo; tree field = TREE_OPERAND (ref, 1); if (!DECL_ARTIFICIAL (field)) { tree type = TREE_TYPE (field); if (TREE_CODE (type) == RECORD_TYPE) return TYPE_BINFO (type); else return NULL_TREE; } but we don't since it has DECL_ARTIFICIAL set. What is the logic here? Also what about i.e. ARRAY_REF and arrays of objects and COMPONENT_REFs translated to MEM_REFs? Honza
[Bug tree-optimization/45791] Missed devirtualization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45791 Jan Hubicka changed: What|Removed |Added Blocks||45375 --- Comment #2 from Jan Hubicka 2010-09-25 16:52:25 UTC --- The 800 missed devirtualizations on Mozilla seems to be mostly AddRef that I can imagine is exactly this case of reference counting in constructor.