https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |mliska at suse dot cz
           Assignee|unassigned at gcc dot gnu.org      |hubicka at gcc dot gnu.org

--- Comment #16 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I have moved tracer before the late cleanups, which seems to be a rather
obvious thing to do. This lets us optimize the testcase (with -O2):
int main() ()
{
  struct A * ap;
  int i;
  int _6;

  <bb 2>:

  <bb 3>:
  # i_29 = PHI <i_22(6), 0(2)>
  _6 = i_29 % 7;
  if (_6 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  ap_8 = operator new (16);
  ap_8->i = 0;
  ap_8->_vptr.A = &MEM[(void *)&_ZTV1A + 16B];
  goto <bb 6>;

  <bb 5>:
  ap_13 = operator new (16);
  MEM[(struct B *)ap_13].D.2244.i = 0;
  MEM[(struct B *)ap_13].b = 0;
  MEM[(struct B *)ap_13].D.2244._vptr.A = &MEM[(void *)&_ZTV1B + 16B];

  <bb 6>:
  # ap_4 = PHI <ap_13(5), ap_8(4)>
  operator delete (ap_4);
  i_22 = i_29 + 1;
  if (i_22 != 10000)
    goto <bb 3>;
  else
    goto <bb 7>;

  <bb 7>:
  return 0;

}
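
For readers without the bug's attachment at hand, here is a rough sketch of
the kind of source that could lead to a dump like the one above. It is a
reconstruction, not the original testcase: the type names A and B and the
fields i and b are taken from the dump, while foo(), the virtual destructor
(added so that deleting through the base pointer is well defined) and the
run_testcase() wrapper are assumptions. The sum is returned only so the
result can be checked; the dump above suggests the original discards it.

```cpp
// Hypothetical reconstruction of the testcase; only A, B, i and b come
// from the dump. foo(), ~A() and run_testcase() are assumed names.
struct A {
  virtual ~A() = default;          // assumption: keeps `delete ap` well defined
  virtual int foo() { return 1; }  // assumption: some virtual method
  int i = 0;
};

struct B : A {
  int foo() override { return 2; }
  int b = 0;
};

// Allocates an A on every 7th iteration and a B otherwise, calls the
// virtual method through the base pointer, then frees the object.
int run_testcase() {
  int sum = 0;
  for (int i = 0; i < 10000; i++) {
    A *ap;
    if (i % 7 == 0)
      ap = new A();
    else
      ap = new B();
    sum += ap->foo();  // dynamic type is known on each incoming path
    delete ap;
  }
  return sum;
}
```

At the join point the pointer is a PHI of the two allocations, so the
dynamic type is only known if the code after the join is duplicated into
each arm or the call is resolved before the arms merge.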

Martin, I do not have a SPEC setup. Do you think you could benchmark the
attached patch with SPEC, both with profile feedback and with non-FDO
-O3 -ftracer compared to plain -O3, please?
It would be nice to know the code size impact, too.
Index: passes.def
===================================================================
--- passes.def  (revision 215651)
+++ passes.def  (working copy)
@@ -155,6 +155,7 @@ along with GCC; see the file COPYING3.
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_call_cdce);
       NEXT_PASS (pass_cselim);
+      NEXT_PASS (pass_tracer);
       NEXT_PASS (pass_copy_prop);
       NEXT_PASS (pass_tree_ifcombine);
       NEXT_PASS (pass_phiopt);
@@ -252,7 +253,6 @@ along with GCC; see the file COPYING3.
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
-      NEXT_PASS (pass_tracer);
       NEXT_PASS (pass_dominator);
       NEXT_PASS (pass_strlen);
       NEXT_PASS (pass_vrp);

Doing it at approximately the same place as loop header copying seems to
make the most sense to me.  It definitely benefits from the early cleanups and
DCE, and it should enable more fun with the later scalar passes, almost all of
which are rerun afterwards.
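
As a hand-written illustration of why running the later scalar passes after
tracer helps, the sketch below shows tail duplication at source level. This
is not compiler output, and both function names are hypothetical. In the
first function the tail after the join sees k only as a merge of two values;
in the second the tail has been copied into each arm, so k is a constant on
each path and can be folded.

```cpp
// Before: both arms join, and the shared tail sees k as a merge (PHI)
// of the constants 1 and 2, so it cannot be constant-folded.
int shared_tail(int i) {
  int k;
  if (i % 7 == 0)
    k = 1;
  else
    k = 2;
  return k * 3 + i;  // shared tail
}

// After (by hand): the tail is duplicated into each arm. Each copy now
// has a single known value for k, which later passes can propagate.
int duplicated_tail(int i) {
  if (i % 7 == 0)
    return 1 * 3 + i;  // copy of the tail with k == 1
  else
    return 2 * 3 + i;  // copy of the tail with k == 2
}
```

The cost is the extra copy of the tail, which is why the code size impact
matters.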
