Re: Propagate profile counts after switch case expansion (issue5896043)

2012-04-16 Thread Easwaran Raman
Ping.

On Mon, Apr 9, 2012 at 2:33 PM, Easwaran Raman  wrote:
> On Sun, Mar 25, 2012 at 9:40 PM, Easwaran Raman  wrote:
>> On Sun, Mar 25, 2012 at 12:22 PM, Jan Hubicka  wrote:
 This patch propagates execution count of thee case labels of a
 switch-case statement after its expansion. Bootstraps and all
 tests pass. OK for trunk?
>>>
>>> Hi,
>>> while this is resonable thing to do, I belive it would make most sense
>>> to make switch lowering at gimple level: i.e. reorganize the existing
>>> code to produce gimple statement and change expansion code to produce
>>> tablejump from every gimple switch statement that was left in the
>>> code.
>>>
>>> This would be both cleaner and would allow gimple optimizers to improve the
>>> generated code. Incrementally it would also allow us to improve switch 
>>> exansion
>>> that is quite baroque and not realy producing very good code in some common
>>> cases.
>>>
>>> If you would be interested working on this (i.e. reorganizing the expansion
>>> code to produce gimple), I would be very happy. If not, I can review the
>>> profile updating part for mainline, since in any case this is desriable 
>>> thing
>>> to do.
>>
>> I am planning to explore improvements to switch expansion (peeling
>> some cases and using jump tables for the rest, for example) and I
>> think the reorganization you suggest is the right way to do such
>> improvements. But until I can spend time on it and get it done, I
>> would like this patch to get reviewed and checked in.
>>
>> Thanks,
>> Easwaran
>
> Ping.


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-04-09 Thread Easwaran Raman
On Sun, Mar 25, 2012 at 9:40 PM, Easwaran Raman  wrote:
> On Sun, Mar 25, 2012 at 12:22 PM, Jan Hubicka  wrote:
>>> This patch propagates execution count of thee case labels of a
>>> switch-case statement after its expansion. Bootstraps and all
>>> tests pass. OK for trunk?
>>
>> Hi,
>> while this is resonable thing to do, I belive it would make most sense
>> to make switch lowering at gimple level: i.e. reorganize the existing
>> code to produce gimple statement and change expansion code to produce
>> tablejump from every gimple switch statement that was left in the
>> code.
>>
>> This would be both cleaner and would allow gimple optimizers to improve the
>> generated code. Incrementally it would also allow us to improve switch 
>> exansion
>> that is quite baroque and not realy producing very good code in some common
>> cases.
>>
>> If you would be interested working on this (i.e. reorganizing the expansion
>> code to produce gimple), I would be very happy. If not, I can review the
>> profile updating part for mainline, since in any case this is desriable thing
>> to do.
>
> I am planning to explore improvements to switch expansion (peeling
> some cases and using jump tables for the rest, for example) and I
> think the reorganization you suggest is the right way to do such
> improvements. But until I can spend time on it and get it done, I
> would like this patch to get reviewed and checked in.
>
> Thanks,
> Easwaran

Ping.


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-25 Thread Easwaran Raman
On Sun, Mar 25, 2012 at 12:22 PM, Jan Hubicka  wrote:
>> This patch propagates execution count of thee case labels of a
>> switch-case statement after its expansion. Bootstraps and all
>> tests pass. OK for trunk?
>
> Hi,
> while this is resonable thing to do, I belive it would make most sense
> to make switch lowering at gimple level: i.e. reorganize the existing
> code to produce gimple statement and change expansion code to produce
> tablejump from every gimple switch statement that was left in the
> code.
>
> This would be both cleaner and would allow gimple optimizers to improve the
> generated code. Incrementally it would also allow us to improve switch 
> exansion
> that is quite baroque and not realy producing very good code in some common
> cases.
>
> If you would be interested working on this (i.e. reorganizing the expansion
> code to produce gimple), I would be very happy. If not, I can review the
> profile updating part for mainline, since in any case this is desriable thing
> to do.

I am planning to explore improvements to switch expansion (peeling
some cases and using jump tables for the rest, for example) and I
think the reorganization you suggest is the right way to do such
improvements. But until I can spend time on it and get it done, I
would like this patch to get reviewed and checked in.

Thanks,
Easwaran

> Honza


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-25 Thread Jan Hubicka
> This patch propagates execution count of thee case labels of a
> switch-case statement after its expansion. Bootstraps and all
> tests pass. OK for trunk?

Hi,
while this is resonable thing to do, I belive it would make most sense
to make switch lowering at gimple level: i.e. reorganize the existing
code to produce gimple statement and change expansion code to produce
tablejump from every gimple switch statement that was left in the
code.

This would be both cleaner and would allow gimple optimizers to improve the
generated code. Incrementally it would also allow us to improve switch exansion
that is quite baroque and not realy producing very good code in some common
cases.

If you would be interested working on this (i.e. reorganizing the expansion
code to produce gimple), I would be very happy. If not, I can review the
profile updating part for mainline, since in any case this is desriable thing
to do.

Honza


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-23 Thread Andi Kleen
> Do you mean use the weights to decide the shape of the binary tree

Yes.

> (similar to COST_TABLE heuristic)?

COST_TABLE should die I hope.

> I am planning to send a separate patch for that. 

Great.

-Andi


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-23 Thread Easwaran Raman
On Fri, Mar 23, 2012 at 3:29 PM, Andi Kleen  wrote:
> Easwaran Raman  writes:
>
>> Some more background on this patch: Right now, while the execution
>> counts of different case labels of a switch statement are obtained
>> during profile collection, they are not propagated to RTL. Instead,
>> counts are regenerated at the RTL level using static heuristics that
>> tend to weigh branches equally which can cause poor optimization of
>> hot code. This patch ensures that the counts collected during profile
>> collection are correctly propagated allowing hot code to be better
>> optimized by RTL optimizations.  Patch tested on x86_64.
>
> I think your patch doesn't use the probably to weight the decision
> tree for non tablejump, right? I looked at this some time ago,
> but the patch always had problems.

Do you mean use the weights to decide the shape of the binary tree
(similar to COST_TABLE heuristic)? I am planning to send a separate
patch for that. This one just makes sure that the profile counts are
propagated correctly. So you will still have a situation where a
branch corresponding to an infrequently executed case dominates a
frequently executed case, but the BB of the cases gets the right
profile weight.

- Easwaran

> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-23 Thread Andi Kleen
Easwaran Raman  writes:

> Some more background on this patch: Right now, while the execution
> counts of different case labels of a switch statement are obtained
> during profile collection, they are not propagated to RTL. Instead,
> counts are regenerated at the RTL level using static heuristics that
> tend to weigh branches equally which can cause poor optimization of
> hot code. This patch ensures that the counts collected during profile
> collection are correctly propagated allowing hot code to be better
> optimized by RTL optimizations.  Patch tested on x86_64.

I think your patch doesn't use the probably to weight the decision 
tree for non tablejump, right? I looked at this some time ago,
but the patch always had problems.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: Propagate profile counts after switch case expansion (issue5896043)

2012-03-23 Thread Easwaran Raman
Some more background on this patch: Right now, while the execution
counts of different case labels of a switch statement are obtained
during profile collection, they are not propagated to RTL. Instead,
counts are regenerated at the RTL level using static heuristics that
tend to weigh branches equally which can cause poor optimization of
hot code. This patch ensures that the counts collected during profile
collection are correctly propagated allowing hot code to be better
optimized by RTL optimizations.  Patch tested on x86_64.

- Easwaran

On Fri, Mar 23, 2012 at 10:43 AM, Easwaran Raman  wrote:
> This patch propagates execution count of thee case labels of a
> switch-case statement after its expansion. Bootstraps and all
> tests pass. OK for trunk?
>
> 2012-03-23   Easwaran Raman  
>
>        * cfgbuild.c (non_zero_profile_counts): New function.
>        (compute_outgoing_frequencies): If at least one successor of a
>        BB has non-zero profile count, retain the counts.
>        * expr.c (do_tablejump): Add a REG_BR_PROB note on the
>        jump to default label.
>        (try_tablejump): Add a parameter to specify the probability
>        of jumping to the default label.
>        * expr.h (try_tablejump): Add a new parameter.
>        * stmt.c (case_node): Add new fields COUNT and SUBTREE_COUNT.
>        (add_case_node): Pass execution count of the case node and use
>        it to initialize COUNT field.
>        (case_probability): New macro.
>        (expand_case): Propagate execution counts to generated
>        branches using REG_BR_PROB notes.
>        (emit_case_nodes): Likewise.
>        (do_jump_if_equal): Pass probability for REG_BR_PROB note.
>        (compute_subtree_counts): New function to compute
>        SUBTREE_COUNT fields of case nodes.
>        (add_prob_note_to_last_insn): Add a REG_BR_PROB note with the
>        given probability to the last generated instruction.
>
> gcc/testsuite/ChangeLog:
> 2012-03-23   Easwaran Raman  
>        * gcc.dg/tree-prof/switch-case-1.c: New test case.
>        * gcc.dg/tree-prof/switch-case-2.c: New test case.
>
> diff --git a/gcc/cfgbuild.c b/gcc/cfgbuild.c
> index 692fea8..d75fbda 100644
> --- a/gcc/cfgbuild.c
> +++ b/gcc/cfgbuild.c
> @@ -534,6 +534,21 @@ find_bb_boundaries (basic_block bb)
>     purge_dead_tablejump_edges (bb, table);
>  }
>
> +/* Check if there is at least one edge in EDGES with a non-zero count
> +   field.  */
> +
> +static bool
> +non_zero_profile_counts ( VEC(edge,gc) *edges) {
> +  edge e;
> +  edge_iterator ei;
> +  FOR_EACH_EDGE(e, ei, edges)
> +    {
> +      if (e->count > 0)
> +        return true;
> +    }
> +  return false;
> +}
> +
>  /*  Assume that frequency of basic block B is known.  Compute frequencies
>     and probabilities of outgoing edges.  */
>
> @@ -569,6 +584,10 @@ compute_outgoing_frequencies (basic_block b)
>       e->count = b->count;
>       return;
>     }
> +  else if (non_zero_profile_counts (b->succs)){
> +    /*Profile counts already set, but REG_NOTE missing. Retain the counts.  
> */
> +    return;
> +  }
>   guess_outgoing_edge_probabilities (b);
>   if (b->count)
>     FOR_EACH_EDGE (e, ei, b->succs)
> diff --git a/gcc/expr.c b/gcc/expr.c
> index f9de908..fb8eef9 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -156,7 +156,7 @@ static rtx do_store_flag (sepops, rtx, enum machine_mode);
>  #ifdef PUSH_ROUNDING
>  static void emit_single_push_insn (enum machine_mode, rtx, tree);
>  #endif
> -static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx);
> +static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx, int);
>  static rtx const_vector_from_tree (tree);
>  static void write_complex_part (rtx, rtx, bool);
>
> @@ -10694,11 +10694,13 @@ try_casesi (tree index_type, tree index_expr, tree 
> minval, tree range,
>    TABLE_LABEL is a CODE_LABEL rtx for the table itself.
>
>    DEFAULT_LABEL is a CODE_LABEL rtx to jump to if the
> -   index value is out of range.  */
> +   index value is out of range.
> +   DEFAULT_PROBABILITY is the probability of jumping to the
> +   DEFAULT_LABEL.  */
>
>  static void
>  do_tablejump (rtx index, enum machine_mode mode, rtx range, rtx table_label,
> -             rtx default_label)
> +             rtx default_label, int default_probability)
>  {
>   rtx temp, vector;
>
> @@ -10714,8 +10716,16 @@ do_tablejump (rtx index, enum machine_mode mode, rtx 
> range, rtx table_label,
>      the maximum value of the range.  */
>
>   if (default_label)
> -    emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
> -                            default_label);
> +    {
> +      emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
> +                              default_label);
> +      if (default_probability != -1)
> +        {
> +          rtx jump_insn = get_last_insn();
> +          add_reg_note (jump_insn, REG_BR_PROB, GEN_INT 
> (default_probability));
> +        }
> +    }
> +
>
>   /* If index is in range, it must fit in Pmode.

Propagate profile counts after switch case expansion (issue5896043)

2012-03-23 Thread Easwaran Raman
This patch propagates execution count of thee case labels of a
switch-case statement after its expansion. Bootstraps and all
tests pass. OK for trunk?

2012-03-23   Easwaran Raman  

* cfgbuild.c (non_zero_profile_counts): New function.
(compute_outgoing_frequencies): If at least one successor of a
BB has non-zero profile count, retain the counts.
* expr.c (do_tablejump): Add a REG_BR_PROB note on the
jump to default label.
(try_tablejump): Add a parameter to specify the probability
of jumping to the default label.
* expr.h (try_tablejump): Add a new parameter.
* stmt.c (case_node): Add new fields COUNT and SUBTREE_COUNT.
(add_case_node): Pass execution count of the case node and use
it to initialize COUNT field.
(case_probability): New macro.
(expand_case): Propagate execution counts to generated
branches using REG_BR_PROB notes.
(emit_case_nodes): Likewise.
(do_jump_if_equal): Pass probability for REG_BR_PROB note.
(compute_subtree_counts): New function to compute
SUBTREE_COUNT fields of case nodes.
(add_prob_note_to_last_insn): Add a REG_BR_PROB note with the
given probability to the last generated instruction.

gcc/testsuite/ChangeLog:
2012-03-23   Easwaran Raman  
* gcc.dg/tree-prof/switch-case-1.c: New test case.
* gcc.dg/tree-prof/switch-case-2.c: New test case.

diff --git a/gcc/cfgbuild.c b/gcc/cfgbuild.c
index 692fea8..d75fbda 100644
--- a/gcc/cfgbuild.c
+++ b/gcc/cfgbuild.c
@@ -534,6 +534,21 @@ find_bb_boundaries (basic_block bb)
 purge_dead_tablejump_edges (bb, table);
 }
 
+/* Check if there is at least one edge in EDGES with a non-zero count
+   field.  */
+
+static bool
+non_zero_profile_counts ( VEC(edge,gc) *edges) {
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE(e, ei, edges)
+{
+  if (e->count > 0)
+return true;
+}
+  return false;
+}
+
 /*  Assume that frequency of basic block B is known.  Compute frequencies
 and probabilities of outgoing edges.  */
 
@@ -569,6 +584,10 @@ compute_outgoing_frequencies (basic_block b)
   e->count = b->count;
   return;
 }
+  else if (non_zero_profile_counts (b->succs)){
+/*Profile counts already set, but REG_NOTE missing. Retain the counts.  */
+return;
+  }
   guess_outgoing_edge_probabilities (b);
   if (b->count)
 FOR_EACH_EDGE (e, ei, b->succs)
diff --git a/gcc/expr.c b/gcc/expr.c
index f9de908..fb8eef9 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -156,7 +156,7 @@ static rtx do_store_flag (sepops, rtx, enum machine_mode);
 #ifdef PUSH_ROUNDING
 static void emit_single_push_insn (enum machine_mode, rtx, tree);
 #endif
-static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx);
+static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx, int);
 static rtx const_vector_from_tree (tree);
 static void write_complex_part (rtx, rtx, bool);
 
@@ -10694,11 +10694,13 @@ try_casesi (tree index_type, tree index_expr, tree 
minval, tree range,
TABLE_LABEL is a CODE_LABEL rtx for the table itself.
 
DEFAULT_LABEL is a CODE_LABEL rtx to jump to if the
-   index value is out of range.  */
+   index value is out of range.
+   DEFAULT_PROBABILITY is the probability of jumping to the
+   DEFAULT_LABEL.  */
 
 static void
 do_tablejump (rtx index, enum machine_mode mode, rtx range, rtx table_label,
- rtx default_label)
+ rtx default_label, int default_probability)
 {
   rtx temp, vector;
 
@@ -10714,8 +10716,16 @@ do_tablejump (rtx index, enum machine_mode mode, rtx 
range, rtx table_label,
  the maximum value of the range.  */
 
   if (default_label)
-emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
-default_label);
+{
+  emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
+  default_label);
+  if (default_probability != -1)
+{
+  rtx jump_insn = get_last_insn();
+  add_reg_note (jump_insn, REG_BR_PROB, GEN_INT (default_probability));
+}
+}
+
 
   /* If index is in range, it must fit in Pmode.
  Convert to Pmode so we can index with it.  */
@@ -10758,7 +10768,7 @@ do_tablejump (rtx index, enum machine_mode mode, rtx 
range, rtx table_label,
 
 int
 try_tablejump (tree index_type, tree index_expr, tree minval, tree range,
-  rtx table_label, rtx default_label)
+  rtx table_label, rtx default_label, int default_probability)
 {
   rtx index;
 
@@ -10776,7 +10786,7 @@ try_tablejump (tree index_type, tree index_expr, tree 
minval, tree range,
   TYPE_MODE (TREE_TYPE (range)),
   expand_normal (range),
   TYPE_UNSIGNED (TREE_TYPE (range))),
-   table_label, default_label);
+   table_label, default_label, default_probability);
   return 1;