Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-19 Thread Jan Hubicka
> On Mon, May 19, 2014 at 1:40 PM, Jan Hubicka  wrote:
> >> I've updated the patch. Shall I move the check inside cgraph_clone_node?
> >
> > Thanks,
> > I think it is OK as it is. I belive individual users should know what do to
> > in such cases themselves.
> > You may want to also check what ipa-cp is doing.
> 
> I checked ipa-cp, but didn't see count propagation anywhere. Could you
> point me to the function?

I believe it is done in update_specialized_profile and also via 
cgraph_create_virtual_clone

honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-19 Thread Dehao Chen
On Mon, May 19, 2014 at 1:40 PM, Jan Hubicka  wrote:
>> I've updated the patch. Shall I move the check inside cgraph_clone_node?
>
> Thanks,
> I think it is OK as it is. I belive individual users should know what do to
> in such cases themselves.
> You may want to also check what ipa-cp is doing.

I checked ipa-cp, but didn't see count propagation anywhere. Could you
point me to the function?

Thanks,
Dehao

>
> Patch is OK (with Changelog)
> Honza
>>
>> Thanks,
>> Dehao
>>
>> Index: gcc/ipa-inline-transform.c
>> ===
>> --- gcc/ipa-inline-transform.c (revision 210535)
>> +++ gcc/ipa-inline-transform.c (working copy)
>> @@ -183,8 +183,9 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>if (freq_scale == -1)
>>  freq_scale = e->frequency;
>>n = cgraph_clone_node (e->callee, e->callee->decl,
>> - e->count, freq_scale, update_original,
>> - vNULL, true, inlining_into, NULL);
>> + MIN (e->count, e->callee->count), freq_scale,
>> + update_original, vNULL, true, inlining_into,
>> + NULL);
>>cgraph_redirect_edge_callee (e, n);
>>   }
>>  }
>> Index: gcc/tree-inline.c
>> ===
>> --- gcc/tree-inline.c (revision 210535)
>> +++ gcc/tree-inline.c (working copy)
>> @@ -4355,7 +4355,7 @@ expand_call_inline (basic_block bb, gimple stmt, c
>>   function in any way before this point, as this CALL_EXPR may be
>>   a self-referential call; if we're calling ourselves, we need to
>>   duplicate our body before altering anything.  */
>> -  copy_body (id, bb->count,
>> +  copy_body (id, cg_edge->callee->count,
>> GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
>>   bb, return_block, NULL);
>>
>>
>> On Fri, May 16, 2014 at 6:41 PM, Jan Hubicka  wrote:
>> >> Do you mean adjusting bb->count? Because in
>> >> expand_call_inline(tree-inline.c), it will use bb->count to pass into
>> >> copy_body to calculate count_scale.
>> >
>> > What about taking here callee->count instead? For inline nodes without
>> > any capping hack, bb->count == edge->count = callee->count.
>> >
>> > When profile ends up being obviously inconsistent, I would say that
>> > inliner can cap callee->count during producing the clone...
>> >
>> > Honza
>> >>
>> >> Thanks,
>> >> Dehao
>> >>
>> >> On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka  wrote:
>> >> >> In AutoFDO, a basic block's count can be much larger than it's actual
>> >> >> count because debug info might be incorrect. In this case, a call edge
>> >> >> count (calculated from BB count) could be much larger than callee's
>> >> >> header count, making the count_scale incorrectly large.
>> >> >
>> >> > In this case I still think we should handle this when producing the 
>> >> > clone:
>> >> > we do not want to have clone's count much larger as well, so i think 
>> >> > inliner
>> >> > and ipa-cp needs to deal with capping here instead
>> >> >
>> >> > Honza
>> >> >>
>> >> >> Dehao
>> >> >> >
>> >> >> >
>> >> >> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-19 Thread Jan Hubicka
> I've updated the patch. Shall I move the check inside cgraph_clone_node?

Thanks,
I think it is OK as it is. I belive individual users should know what do to
in such cases themselves.
You may want to also check what ipa-cp is doing.

Patch is OK (with Changelog)
Honza
> 
> Thanks,
> Dehao
> 
> Index: gcc/ipa-inline-transform.c
> ===
> --- gcc/ipa-inline-transform.c (revision 210535)
> +++ gcc/ipa-inline-transform.c (working copy)
> @@ -183,8 +183,9 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>if (freq_scale == -1)
>  freq_scale = e->frequency;
>n = cgraph_clone_node (e->callee, e->callee->decl,
> - e->count, freq_scale, update_original,
> - vNULL, true, inlining_into, NULL);
> + MIN (e->count, e->callee->count), freq_scale,
> + update_original, vNULL, true, inlining_into,
> + NULL);
>cgraph_redirect_edge_callee (e, n);
>   }
>  }
> Index: gcc/tree-inline.c
> ===
> --- gcc/tree-inline.c (revision 210535)
> +++ gcc/tree-inline.c (working copy)
> @@ -4355,7 +4355,7 @@ expand_call_inline (basic_block bb, gimple stmt, c
>   function in any way before this point, as this CALL_EXPR may be
>   a self-referential call; if we're calling ourselves, we need to
>   duplicate our body before altering anything.  */
> -  copy_body (id, bb->count,
> +  copy_body (id, cg_edge->callee->count,
> GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
>   bb, return_block, NULL);
> 
> 
> On Fri, May 16, 2014 at 6:41 PM, Jan Hubicka  wrote:
> >> Do you mean adjusting bb->count? Because in
> >> expand_call_inline(tree-inline.c), it will use bb->count to pass into
> >> copy_body to calculate count_scale.
> >
> > What about taking here callee->count instead? For inline nodes without
> > any capping hack, bb->count == edge->count = callee->count.
> >
> > When profile ends up being obviously inconsistent, I would say that
> > inliner can cap callee->count during producing the clone...
> >
> > Honza
> >>
> >> Thanks,
> >> Dehao
> >>
> >> On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka  wrote:
> >> >> In AutoFDO, a basic block's count can be much larger than it's actual
> >> >> count because debug info might be incorrect. In this case, a call edge
> >> >> count (calculated from BB count) could be much larger than callee's
> >> >> header count, making the count_scale incorrectly large.
> >> >
> >> > In this case I still think we should handle this when producing the 
> >> > clone:
> >> > we do not want to have clone's count much larger as well, so i think 
> >> > inliner
> >> > and ipa-cp needs to deal with capping here instead
> >> >
> >> > Honza
> >> >>
> >> >> Dehao
> >> >> >
> >> >> >
> >> >> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-19 Thread Dehao Chen
I've updated the patch. Shall I move the check inside cgraph_clone_node?

Thanks,
Dehao

Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c (revision 210535)
+++ gcc/ipa-inline-transform.c (working copy)
@@ -183,8 +183,9 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   if (freq_scale == -1)
 freq_scale = e->frequency;
   n = cgraph_clone_node (e->callee, e->callee->decl,
- e->count, freq_scale, update_original,
- vNULL, true, inlining_into, NULL);
+ MIN (e->count, e->callee->count), freq_scale,
+ update_original, vNULL, true, inlining_into,
+ NULL);
   cgraph_redirect_edge_callee (e, n);
  }
 }
Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c (revision 210535)
+++ gcc/tree-inline.c (working copy)
@@ -4355,7 +4355,7 @@ expand_call_inline (basic_block bb, gimple stmt, c
  function in any way before this point, as this CALL_EXPR may be
  a self-referential call; if we're calling ourselves, we need to
  duplicate our body before altering anything.  */
-  copy_body (id, bb->count,
+  copy_body (id, cg_edge->callee->count,
GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
  bb, return_block, NULL);


On Fri, May 16, 2014 at 6:41 PM, Jan Hubicka  wrote:
>> Do you mean adjusting bb->count? Because in
>> expand_call_inline(tree-inline.c), it will use bb->count to pass into
>> copy_body to calculate count_scale.
>
> What about taking here callee->count instead? For inline nodes without
> any capping hack, bb->count == edge->count = callee->count.
>
> When profile ends up being obviously inconsistent, I would say that
> inliner can cap callee->count during producing the clone...
>
> Honza
>>
>> Thanks,
>> Dehao
>>
>> On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka  wrote:
>> >> In AutoFDO, a basic block's count can be much larger than it's actual
>> >> count because debug info might be incorrect. In this case, a call edge
>> >> count (calculated from BB count) could be much larger than callee's
>> >> header count, making the count_scale incorrectly large.
>> >
>> > In this case I still think we should handle this when producing the clone:
>> > we do not want to have clone's count much larger as well, so i think 
>> > inliner
>> > and ipa-cp needs to deal with capping here instead
>> >
>> > Honza
>> >>
>> >> Dehao
>> >> >
>> >> >
>> >> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-16 Thread Jan Hubicka
> Do you mean adjusting bb->count? Because in
> expand_call_inline(tree-inline.c), it will use bb->count to pass into
> copy_body to calculate count_scale.

What about taking here callee->count instead? For inline nodes without
any capping hack, bb->count == edge->count = callee->count.

When profile ends up being obviously inconsistent, I would say that
inliner can cap callee->count during producing the clone...

Honza
> 
> Thanks,
> Dehao
> 
> On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka  wrote:
> >> In AutoFDO, a basic block's count can be much larger than it's actual
> >> count because debug info might be incorrect. In this case, a call edge
> >> count (calculated from BB count) could be much larger than callee's
> >> header count, making the count_scale incorrectly large.
> >
> > In this case I still think we should handle this when producing the clone:
> > we do not want to have clone's count much larger as well, so i think inliner
> > and ipa-cp needs to deal with capping here instead
> >
> > Honza
> >>
> >> Dehao
> >> >
> >> >
> >> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-16 Thread Dehao Chen
Do you mean adjusting bb->count? Because in
expand_call_inline(tree-inline.c), it will use bb->count to pass into
copy_body to calculate count_scale.

Thanks,
Dehao

On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka  wrote:
>> In AutoFDO, a basic block's count can be much larger than it's actual
>> count because debug info might be incorrect. In this case, a call edge
>> count (calculated from BB count) could be much larger than callee's
>> header count, making the count_scale incorrectly large.
>
> In this case I still think we should handle this when producing the clone:
> we do not want to have clone's count much larger as well, so i think inliner
> and ipa-cp needs to deal with capping here instead
>
> Honza
>>
>> Dehao
>> >
>> >
>> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-16 Thread Jan Hubicka
> In AutoFDO, a basic block's count can be much larger than it's actual
> count because debug info might be incorrect. In this case, a call edge
> count (calculated from BB count) could be much larger than callee's
> header count, making the count_scale incorrectly large.

In this case I still think we should handle this when producing the clone:
we do not want to have clone's count much larger as well, so i think inliner
and ipa-cp needs to deal with capping here instead

Honza
> 
> Dehao
> >
> >
> > Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-16 Thread Dehao Chen
On Fri, May 16, 2014 at 4:41 PM, Jan Hubicka  wrote:
>
> > Is this patch ok for trunk? Bootstrapped and regression test on-going.
> >
> > Thanks,
> > Dehao
> >
> > 2014-05-16  Dehao Chen  
> >
> > * tree-inline.c (initialize_cfun): Ensure count_scale is no larger
> > than REG_BR_PROB_BASE.
> > (copy_cfg_body): Likewise.
>
> This seems like wrong place to paper around the problem - symmetric count
> scaling is done during production of the inline clone.  I think if we want to
> be smart about broken profiles, we should do it at that place instead here
> at inliner...
>
> What kind of problem does this patch solve?


In AutoFDO, a basic block's count can be much larger than it's actual
count because debug info might be incorrect. In this case, a call edge
count (calculated from BB count) could be much larger than callee's
header count, making the count_scale incorrectly large.

Dehao
>
>
> Honza


Re: [PATCH] Ensure count_scale is no larger than REG_BR_PROB_BASE

2014-05-16 Thread Jan Hubicka
> Is this patch ok for trunk? Bootstrapped and regression test on-going.
> 
> Thanks,
> Dehao
> 
> 2014-05-16  Dehao Chen  
> 
> * tree-inline.c (initialize_cfun): Ensure count_scale is no larger
> than REG_BR_PROB_BASE.
> (copy_cfg_body): Likewise.

This seems like wrong place to paper around the problem - symmetric count
scaling is done during production of the inline clone.  I think if we want to
be smart about broken profiles, we should do it at that place instead here
at inliner...

What kind of problem does this patch solve?

Honza