On Tue, Nov 1, 2011 at 11:06 PM, Martin Jambor <mjam...@suse.cz> wrote:
> Hi,
>
> the patch below is the second (and last) revived type-based
> devirtualization patch that did not make it to 4.6.  It deals with
> virtual calls from the function in which the there is also the object
> declaration:
>
> void foo()
> {
>  A a;
>
>  a.foo ();
> }
>
> Normally, the front-end would do the devirtualization on its own,
> however, early-inlining can create these situations too.  Since there
> is nothing interprocedural going on, the current inlining and IPA-CP
> devirtualization bits are of no help.  We do not do type-based
> devirtualization in OBJ_TYPE_REF folding either, because the necessary
> type-changing checks might make it quite expensive.

But in the above case - a is not address-taken - we do not need
type-based devirtualization.  So, can you explain why we do not
forward the proper vtable from the vtable store that would lead to
the known-type info?

Thanks,
Richard.

>  Thus, this patch
> introduces a new pass to do that.
>
> The patch basically piggy-tails on the intraprocedural
> devirtualization mechanism, trying to construct a known-type jump
> function for all objects in OBJ_TYPE_REF calls and then immediately
> using it like we do in IPA-CP.
>
> The original patch was doing this as a part of
> pass_rebuild_cgraph_edges.  Honza did not like this idea so I made it
> a separate pass.  First, I scheduled it after
> pass_rebuild_cgraph_edges and was only traversing indirect edges,
> avoiding a sweep over all of the IL.  Unfortunately, this does not
> work in one scenario.  If the newly-known destination of a virtual
> call is known not to throw, we may have to purge some EH CFG edges and
> potentially basic blocks.  If these basic blocks contain calls
> (typically calls to object destructors), we may end up having stale
> call edges in the call graph... and our current approach to that
> problem is to call pass_rebuild_cgraph_edges.  I think that I was not
> running into this problem when the mechanism was a part of that pass
> just because of pure luck.  Anyway, this is why I eventually opted for
> a sweep over the statements.
>
> My best guess is that it is probably worth it, but only because the
> overhead should be still fairly low.  The pass triggers quite a number
> of times when building libstdc++ and it can speed up an artificial
> testcase which I will attach from over 20 seconds to 7s on my older
> desktop - it is very similar to the one I posted with the previous
> patch but this time the object constructors must not get early inlined
> but the function multiply_matrices has to.  Currently I have problems
> compiling Firefox even without LTO so I don't have any numbers from it
> either.  IIRC, Honza did not see this patch trigger there when he
> tried the ancient version almost a year go.  On the other hand, Maxim
> claimed that the impact can be noticeable on some code base he is
> concerned about.
>
> I have successfully bootstrapped and tested the patch on x86_64-linux.
> What do you think, should we include this in trunk?
>
> Thanks,
>
> Martin
>
>
> 2011-10-31  Martin Jambor  <mjam...@suse.cz>
>
>        * ipa-cp.c (ipa_value_from_known_type_jfunc): Moved to...
>        * ipa-prop.c (ipa_binfo_from_known_type_jfunc): ...here, exported and
>        updated all callers.
>        (intraprocedural_devirtualization): New function.
>        (gate_intra_devirtualization): Likewise.
>        (pass_intra_devirt): New pass.
>        * ipa-prop.h (ipa_binfo_from_known_type_jfunc): Declared.
>        * passes.c (init_optimization_passes): Schedule pass_intra_devirt.
>        * tree-pass.h (pass_intra_devirt): Declare.
>
>        * testsuite/g++.dg/ipa/imm-devirt-1.C: New test.
>        * testsuite/g++.dg/ipa/imm-devirt-2.C: Likewise.
>
>
> Index: src/gcc/testsuite/g++.dg/ipa/imm-devirt-1.C
> ===================================================================
> --- /dev/null
> +++ src/gcc/testsuite/g++.dg/ipa/imm-devirt-1.C
> @@ -0,0 +1,62 @@
> +/* Verify that virtual calls are folded even when a typecast to an
> +   ancestor is involved along the way.  */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-devirt"  } */
> +
> +extern "C" void abort (void);
> +
> +class A
> +{
> +public:
> +  int data;
> +  virtual int foo (int i);
> +};
> +
> +
> +class B : public A
> +{
> +public:
> +  __attribute__ ((noinline)) B();
> +  virtual int foo (int i);
> +};
> +
> +int __attribute__ ((noinline)) A::foo (int i)
> +{
> +  return i + 1;
> +}
> +
> +int __attribute__ ((noinline)) B::foo (int i)
> +{
> +  return i + 2;
> +}
> +
> +int __attribute__ ((noinline,noclone)) get_input(void)
> +{
> +  return 1;
> +}
> +
> +__attribute__ ((noinline)) B::B()
> +{
> +}
> +
> +static inline int middleman_1 (class A *obj, int i)
> +{
> +  return obj->foo (i);
> +}
> +
> +static inline int middleman_2 (class B *obj, int i)
> +{
> +  return middleman_1 (obj, i);
> +}
> +
> +int main (int argc, char *argv[])
> +{
> +  class B b;
> +
> +  if (middleman_2 (&b, get_input ()) != 3)
> +    abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Immediately devirtualizing 
> call.*into.*B::foo"  "devirt"  } } */
> +/* { dg-final { cleanup-tree-dump "devirt" } } */
> Index: src/gcc/testsuite/g++.dg/ipa/imm-devirt-2.C
> ===================================================================
> --- /dev/null
> +++ src/gcc/testsuite/g++.dg/ipa/imm-devirt-2.C
> @@ -0,0 +1,95 @@
> +/* Verify that virtual calls are folded even when a typecast to an
> +   ancestor is involved along the way.  */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-devirt-slim"  } */
> +
> +extern "C" void abort (void);
> +
> +class Distraction
> +{
> +public:
> +  float f;
> +  double d;
> +  Distraction ()
> +  {
> +    f = 8.3;
> +    d = 10.2;
> +  }
> +  virtual float bar (float z);
> +};
> +
> +class A
> +{
> +public:
> +  int data;
> +  virtual int foo (int i);
> +};
> +
> +class B : public A
> +{
> +public:
> +  int data_2;
> +  virtual int foo (int i);
> +  virtual int baz (int i);
> +};
> +
> +
> +class C : public Distraction, public B
> +{
> +public:
> +  __attribute__ ((noinline)) C();
> +  virtual int foo (int i);
> +};
> +
> +float __attribute__ ((noinline)) Distraction::bar (float z)
> +{
> +  f += z;
> +  return f/2;
> +}
> +
> +int __attribute__ ((noinline)) A::foo (int i)
> +{
> +  return i + 1;
> +}
> +
> +int __attribute__ ((noinline)) B::foo (int i)
> +{
> +  return i + 2;
> +}
> +
> +int __attribute__ ((noinline)) B::baz (int i)
> +{
> +  return i * 15;
> +}
> +
> +int __attribute__ ((noinline)) C::foo (int i)
> +{
> +  return i + 3;
> +}
> +
> +int __attribute__ ((noinline,noclone)) get_input(void)
> +{
> +  return 1;
> +}
> +
> +static inline int middleman (class A *obj, int i)
> +{
> +  return obj->foo (i);
> +}
> +
> +__attribute__ ((noinline)) C::C()
> +{
> +}
> +
> +int main (int argc, char *argv[])
> +{
> +  class C c;
> +
> +  if (middleman (&c, get_input ()) != 4)
> +    abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Immediately devirtualizing 
> call.*into.*C::.*foo"  "devirt"  } } */
> +/* { dg-final { cleanup-tree-dump "devirt" } } */
> Index: src/gcc/ipa-cp.c
> ===================================================================
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -674,20 +674,6 @@ ipa_get_jf_ancestor_result (struct ipa_j
>     return NULL_TREE;
>  }
>
> -/* Extract the acual BINFO being described by JFUNC which must be a known 
> type
> -   jump function.  */
> -
> -static tree
> -ipa_value_from_known_type_jfunc (struct ipa_jump_func *jfunc)
> -{
> -  tree base_binfo = TYPE_BINFO (jfunc->value.known_type.base_type);
> -  if (!base_binfo)
> -    return NULL_TREE;
> -  return get_binfo_at_offset (base_binfo,
> -                             jfunc->value.known_type.offset,
> -                             jfunc->value.known_type.component_type);
> -}
> -
>  /* Determine whether JFUNC evaluates to a known value (that is either a
>    constant or a binfo) and if so, return it.  Otherwise return NULL. INFO
>    describes the caller node so that pass-through jump functions can be
> @@ -699,7 +685,7 @@ ipa_value_from_jfunc (struct ipa_node_pa
>   if (jfunc->type == IPA_JF_CONST)
>     return jfunc->value.constant;
>   else if (jfunc->type == IPA_JF_KNOWN_TYPE)
> -    return ipa_value_from_known_type_jfunc (jfunc);
> +    return ipa_binfo_from_known_type_jfunc (jfunc);
>   else if (jfunc->type == IPA_JF_PASS_THROUGH
>           || jfunc->type == IPA_JF_ANCESTOR)
>     {
> @@ -1006,7 +992,7 @@ propagate_accross_jump_function (struct
>
>       if (jfunc->type == IPA_JF_KNOWN_TYPE)
>        {
> -         val = ipa_value_from_known_type_jfunc (jfunc);
> +         val = ipa_binfo_from_known_type_jfunc (jfunc);
>          if (!val)
>            return set_lattice_contains_variable (dest_lat);
>        }
> Index: src/gcc/ipa-prop.c
> ===================================================================
> --- src.orig/gcc/ipa-prop.c
> +++ src/gcc/ipa-prop.c
> @@ -3069,3 +3069,107 @@ ipa_update_after_lto_read (void)
>     if (node->analyzed)
>       ipa_initialize_node_params (node);
>  }
> +
> +/* Extract the acual BINFO being described by JFUNC which must be a known 
> type
> +   jump function.  */
> +
> +tree
> +ipa_binfo_from_known_type_jfunc (struct ipa_jump_func *jfunc)
> +{
> +  tree base_binfo = TYPE_BINFO (jfunc->value.known_type.base_type);
> +  if (!base_binfo)
> +    return NULL_TREE;
> +  return get_binfo_at_offset (base_binfo,
> +                             jfunc->value.known_type.offset,
> +                             jfunc->value.known_type.component_type);
> +}
> +
> +/* Intraprocedural (early) type-based devirtualization pass.  Looks at all 
> call
> +   statements and attempts type-base devirtualization on those calling an
> +   OBJ_TYPE_REF.  */
> +
> +static unsigned int
> +intraprocedural_devirtualization (void)
> +{
> +  basic_block bb;
> +  gimple_stmt_iterator gsi;
> +  bool cfg_changed = false;
> +
> +  FOR_EACH_BB (bb)
> +    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +      if (is_gimple_call (gsi_stmt (gsi)))
> +       {
> +         tree binfo, token, fndecl;
> +         gimple call = gsi_stmt (gsi);
> +         tree target = gimple_call_fn (call);
> +         struct ipa_jump_func jfunc;
> +
> +         if (TREE_CODE (target) != OBJ_TYPE_REF)
> +           continue;
> +
> +         jfunc.type = IPA_JF_UNKNOWN;
> +         compute_known_type_jump_func (OBJ_TYPE_REF_OBJECT (target), &jfunc,
> +                                       call);
> +         if (jfunc.type != IPA_JF_KNOWN_TYPE)
> +           continue;
> +         binfo = ipa_binfo_from_known_type_jfunc (&jfunc);
> +         if (!binfo)
> +           continue;
> +         token = OBJ_TYPE_REF_TOKEN (target);
> +         fndecl = gimple_get_virt_method_for_binfo (tree_low_cst (token, 1),
> +                                                    binfo);
> +         if (!fndecl)
> +           continue;
> +
> +         if (dump_file)
> +           {
> +             fprintf (dump_file, "ipa-prop: Immediately devirtualizing call 
> ");
> +             print_gimple_expr (dump_file, call, 0, 0);
> +           }
> +
> +         gimple_call_set_fndecl (call, fndecl);
> +         update_stmt (call);
> +         if (maybe_clean_eh_stmt (call)
> +             && gimple_purge_dead_eh_edges (bb))
> +           cfg_changed = true;
> +
> +         if (dump_file)
> +           {
> +             fprintf (dump_file, " into ");
> +             print_gimple_expr (dump_file, call, 0, 0);
> +             fprintf (dump_file, "\n");
> +           }
> +       }
> +
> +
> + if (cfg_changed)
> +    return TODO_cleanup_cfg;
> +  else
> +    return 0;
> +}
> +
> +static bool
> +gate_intra_devirtualization (void)
> +{
> +  return flag_devirtualize != 0;
> +}
> +
> +
> +struct gimple_opt_pass pass_intra_devirt =
> +  {
> +    {
> +      GIMPLE_PASS,
> +      "devirt",                                        /* name */
> +      gate_intra_devirtualization,             /* gate */
> +      intraprocedural_devirtualization,        /* execute */
> +      NULL,                                    /* sub */
> +      NULL,                                    /* next */
> +      0,                                       /* static_pass_number */
> +      TV_CGRAPH,                               /* tv_id */
> +      PROP_cfg,                                        /* 
> properties_required */
> +      0,                                       /* properties_provided */
> +      0,                                       /* properties_destroyed */
> +      0,                                       /* todo_flags_start */
> +      0,                                       /* todo_flags_finish */
> +    }
> +  };
> Index: src/gcc/ipa-prop.h
> ===================================================================
> --- src.orig/gcc/ipa-prop.h
> +++ src/gcc/ipa-prop.h
> @@ -347,6 +347,7 @@ bool ipa_propagate_indirect_call_infos (
>
>  /* Indirect edge and binfo processing.  */
>  struct cgraph_edge *ipa_make_edge_direct_to_target (struct cgraph_edge *, 
> tree);
> +tree ipa_binfo_from_known_type_jfunc (struct ipa_jump_func *);
>
>  /* Functions related to both.  */
>  void ipa_analyze_node (struct cgraph_node *);
> Index: src/gcc/passes.c
> ===================================================================
> --- src.orig/gcc/passes.c
> +++ src/gcc/passes.c
> @@ -1225,6 +1225,7 @@ init_optimization_passes (void)
>           NEXT_PASS (pass_cleanup_eh);
>           NEXT_PASS (pass_profile);
>           NEXT_PASS (pass_local_pure_const);
> +         NEXT_PASS (pass_intra_devirt);
>          /* Split functions creates parts that are not run through
>             early optimizations again.  It is thus good idea to do this
>             late.  */
> Index: src/gcc/tree-pass.h
> ===================================================================
> --- src.orig/gcc/tree-pass.h
> +++ src/gcc/tree-pass.h
> @@ -449,6 +449,7 @@ extern struct gimple_opt_pass pass_trace
>  extern struct gimple_opt_pass pass_warn_unused_result;
>  extern struct gimple_opt_pass pass_split_functions;
>  extern struct gimple_opt_pass pass_feedback_split_functions;
> +extern struct gimple_opt_pass pass_intra_devirt;
>
>  /* IPA Passes */
>  extern struct simple_ipa_opt_pass pass_ipa_lower_emutls;
>

Reply via email to