> On 24 Jun 2025, at 7:43 pm, Jan Hubicka <hubi...@ucw.cz> wrote:
> 
> Hi,
> this patch removes early inlining from the afdo pass since all inlining
> should now happen from the early inliner.  I tested this on SPEC and there
> are 3 inlines happening here which are blocked at early-inline time by
> hitting the large function growth limit.  We probably want to bypass that
> limit; I will look into that incrementally.

Thanks for doing this. Is the inlining difference here due to the annotation
that happens in the auto-profile pass in the earlier implementation?

One unrelated question about scaling profiles. We seem to scale up AFDO with
and_count_scale and scale down local_profile in some other cases. Should we
instead scale up the AFDO profile to the local_profile scale? A lot of the
inlining and other parameters seem to work well with that.

Thanks,
Kugan
> 
> This should make the non-inlined function profile merging hopefully
> easier.
> 
> It may still make sense to separate the afdo inliner from the early inliner
> to solve the non-transitivity issues, which is not that hard to do with the
> current code organization. However, this should be a separate IPA pass
> rather than another part of the afdo pass, since it can be conceptually
> separate.
> 
> Bootstrapped/regtested x86_64-linux, will commit it shortly.

> 
> Honza
> 
> gcc/ChangeLog:
> 
>        * auto-profile.cc: Update toplevel comment.
>        (early_inline): Remove.
>        (auto_profile): Don't do early inlining.

> 
> diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
> index 8a1d9f878c6..3f8310e6324 100644
> --- a/gcc/auto-profile.cc
> +++ b/gcc/auto-profile.cc
> @@ -76,21 +76,30 @@ along with GCC; see the file COPYING3.  If not see
>      standalone symbol, or a clone of a function that is inlined into another
>      function.
> 
> -   Phase 2: Early inline + value profile transformation.
> -     Early inline uses autofdo_source_profile to find if a callsite is:
> +   Phase 2: AFDO inline + value profile transformation.
> +     This happens during early optimization.
> +     During early inlining AFDO inliner is executed which
> +     uses autofdo_source_profile to find if a callsite is:
>         * inlined in the profiled binary.
>         * callee body is hot in the profiling run.
>      If both condition satisfies, early inline will inline the callsite
>      regardless of the code growth.
> -     Phase 2 is an iterative process. During each iteration, we also check
> -     if an indirect callsite is promoted and inlined in the profiling run.
> -     If yes, vpt will happen to force promote it and in the next iteration,
> -     einline will inline the promoted callsite in the next iteration.
> +
> +     Performing this early has benefit of doing early optimizations
> +     before read IPA passes and getting more "context sensitivity" of
> +     the profile read.  Profile of inlined functions may differ
> +     significantly from one inline instance to another and from the
> +     offline version.
> +
> +     This is controlled by -fauto-profile-inlining and is independent
> +     of -fearly-inlining.
> 
>    Phase 3: Annotate control flow graph.
>      AutoFDO uses a separate pass to:
>         * Annotate basic block count
>         * Estimate branch probability
> +       * Use earlier static profile to fill in the gaps
> +         if AFDO profile is ambiguous
> 
>    After the above 3 phases, all profile is readily annotated on the GCC IR.
>    AutoFDO tries to reuse all FDO infrastructure as much as possible to make
> @@ -2217,18 +2226,6 @@ afdo_annotate_cfg (void)
>   free_dominance_info (CDI_POST_DOMINATORS);
> }
> 
> -/* Wrapper function to invoke early inliner.  */
> -
> -static unsigned int
> -early_inline ()
> -{
> -  compute_fn_summary (cgraph_node::get (current_function_decl), true);
> -  unsigned int todo = early_inliner (cfun);
> -  if (todo & TODO_update_ssa_any)
> -    update_ssa (TODO_update_ssa);
> -  return todo;
> -}
> -
> /* Use AutoFDO profile to annoate the control flow graph.
>    Return the todo flag.  */
> 
> @@ -2254,15 +2251,9 @@ auto_profile (void)
> 
>     push_cfun (DECL_STRUCT_FUNCTION (node->decl));
> 
> -    unsigned int todo = early_inline ();
>     autofdo::afdo_annotate_cfg ();
>     compute_function_frequency ();
> 
> -    /* Local pure-const may imply need to fixup the cfg.  */
> -    todo |= execute_fixup_cfg ();
> -    if (todo & TODO_cleanup_cfg)
> -      cleanup_tree_cfg ();
> -
>     free_dominance_info (CDI_DOMINATORS);
>     free_dominance_info (CDI_POST_DOMINATORS);
>     cgraph_edge::rebuild_edges ();
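
For readers following the updated toplevel comment, the Phase 2 rule (inline a
callsite when it was inlined in the profiled binary and the callee body was
hot, regardless of code growth) can be sketched roughly as below. This is a
standalone illustration, not GCC code: CallsiteProfile, hot_threshold and
should_afdo_inline are hypothetical names, and the real decision goes through
autofdo_source_profile, whose interface is different.

```cpp
#include <cstdint>

// Hypothetical per-callsite summary (an assumption for illustration only).
struct CallsiteProfile
{
  bool inlined_in_profiled_binary;   // callsite was inlined in the profiled binary
  std::uint64_t callee_total_count;  // samples attributed to the callee body
};

// Hypothetical hotness cutoff; GCC derives hotness from the profile itself.
constexpr std::uint64_t hot_threshold = 1000;

// Both conditions must hold.  Code growth is deliberately not consulted,
// matching "regardless of the code growth" in the comment.
bool
should_afdo_inline (const CallsiteProfile &cs)
{
  return cs.inlined_in_profiled_binary
	 && cs.callee_total_count >= hot_threshold;
}
```

Under these assumptions a callsite that was inlined in the profiled binary but
whose callee is cold is rejected, as is a hot callee that was not inlined in
the profiled binary.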
