Re: query regarding adding a pass to undo final value replacement.

Richard Guenther Wed, 01 Oct 2008 07:01:04 -0700

On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> wrote:
> Hi ,
>
> Based on the conversation in the thread at
> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a
> pass trying to undo final value replacement going. The initial
> implementation was done by Pranav Bhandarkar when he was employed at
> Azingo as part of work sponsored by Icera Semiconductor.  I've been
> trying to get this working with my private port over here.  We intend
> to contribute this back once our copyright assignments are sorted and
> if this will be acceptable to all. I've been getting a few compile
> time ICEs with this approach and haven't been able to resolve them
> well enough yet. Whilst doing so, I wanted to check on the approach as
> outlined below and ask if there's anything that we might have missed
> or any problem that one can see with us going along these lines.
> Thanks for your time and patience.


Some quick comments.  First, do you have a non-pseudo-code testcase
that exposes the extra computations?  Second, I think rather than
trying to undo what SCEV const(!)-prop is doing adjust its cost
model (maybe there isn't one) to not create the costly substitutions.

Thanks,
Richard.

> cheers
> Ramana
>
>
>
> 1) Understanding what scalar evolution does.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Consider the following pseudo code.
>
> function memcpy (src_pointer, dst_pointer)
> src_1 = src_pointer;
> dst_1 = dst_pointer;
> L1:
> *dst_1 = *src_1  (Word copy)
> dst++;
> src++;    <---- Inc by 4 bytes i.e 1 word.
> conditional jump to L1
> <fallthrough edge>
>  /* This is the exit block of loop 1. The following PHI nodes
>  are added by loopinit pass to convert the SSA form into "closed loop
>  SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */
>
> src_2 = PHI <src_1>
> dst_2 = PHI <dst_1>
>
> <fall through edge>
> L2:
> *dst_2 = *src_2 (Byte Copy)
> dst++;
> src++;
> <conditional jump to L2>
>
>
> Now scalar evolution convertes this into
>
> Function memcpy (src_pointer, dst_pointer)
>  src_1 = src_pointer;
>  dst_1 = dst_pointer;
>  L1:
>  *dst_1 = *src_1  (Word copy)
>  dst++;
>  src++;    <---- Inc by 4 bytes i.e 1 word.
>  conditional jump to L1
>
> <fallthrough edge>
>  /* This is the exit block of loop 1. The following PHI nodes
>  are added by loopinit pass to convert the SSA form into "closed loop
>  SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */
>
>  D.1234_11 = 4 * n (where 'n' is the number of iterations of L1)
>  src_2 = src_pointer + D.1234_11
>  D.1235_22 = 4 * n
>  dst_2 = dst_pointer + D.1235_22
>
>  <fallthrough edge>
>  L2:
>  *dst_2 = *src_2 (Byte Copy)
>  dst++;
>  src++;
>  <conditional jump to L2>
>
>
>
> Therefore a PHI Node is replaced by the final value of src_1 and
> dst_1, thus introducing extra computations.
>
>
> 2) How to undo what scalar evolution does.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> To undo what scalar evolution does, we need to record the changes that
> scalar evolution makes and then after the loop optimizations are run
> we need to replace the PHI nodes that were removed earlier in place of
> the computations introduced by scalar evolution.
>
> A high level description of the process is listed here. (see
> tree-scalar-evolution.c and grep for DXP_SPECIFIC)
>
>   Explanation of this sub-pass.
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>   Part 1: Record Final Value replacement related changes.
>   ~~~~~~
>   Final value replacement replaces PHI nodes at the exits of loops with
>   computations based on the number of iterations of the loop.
>   For e.g.
>   <bb 1>
>   L1:
>
>   x_1 = src + 4
>   ...
>   ...
>   conditional jump to L1.
>   <bb 2> (Loop Exit)
>   x_2 = PHI <x_1>  <---- Phi node added by rewrite_into_loop_closed_ssa.
>                          (see the loopinit pass).
>
>   Final Value replacement replaces the PHI node in <bb 2> by
>
>   ssa_temp_var = 4 * no_of_iterations_of_loop
>   x_2 = src + ssa_temp_var;
>
>   Therefore a PHI node is replaced by computations.
>
>   Recording final value replacement related changes is controlled by the
>   variable record_scalar_evolution_changes. When set to a non-zero value
>   the function record_changed_stmts records the changes made. The changes
>   are recorded in a hashtable changed_stmts_table. The hashtable contains
>   the stmt added, the PHI node for which this stmt was added and hashcodes
>   for both the stmt and the phi_node. We also note how many computations
>   have been added for each of the removed PHI nodes. This is done in a
>   link list pointed to by phi_nodes_info_head.
>
>   Part 2: Undo Final value replacement related changes
>   ~~~~~~
>   This is the part where the new computations are removed and the PHI nodes
>   that they are replaced are inserted back in. This replacement is
>   contingent to a few conditions.
>      a) All the computations that were added are still present in the basic
>         block. i.e all the computations are still present in the form in
>     which they were added and havent been touched by any of the loop
>     optimizations passes that run between the scalar evolution pass (i.e the
>     pass when Part 1 is executed) and the 'loopdone' pass. We go
>     through the exit basic block and look up each stmt in
>     changed_stmt_table. If found we lookup the corresponding PHI node in the
>     phi_node_info link list and decrement its count by 1 (count here denotes
>     the number of computations added. When count it 0 it means all the
>     computatins added in the scalar evolution pass have been found in the
>     same form in the loop done pass such a PHI node can be inserted back in
>     if 'b' is also true).
>      b) If any PHI node has count zero it can be inserted back and its
>         corresponding computations removed, iff the argument of the PHI node
>     still exists as an SSA variable. This means that we can insert
>     a_1 = PHI <D.10_1> if D.10_1 still exists and hasnt been removed by
>     any of the passes between the scalar evolution pass and the
>     loopdone pass.
>
>
>
>
> 3) Implementation details including the issues in 2.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 3.1} Recording the Changes made by scalar evolution.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   a) We record changes in tree-scalar-evolution.c  in the scalar
>   evolution pass (see function scev_const_prop). We visit the exit
>   block of each loop and If we find a PHI node with only one argument
>   then we set 'record_scalar_evolution_changes' to 1. We also set
>   'current_phi_node' to this particular PHI node. Then via a call
>   to 'record_changed_stmts' from 'force_gimple_operand_bsi' in gimplify.c
>   we record the new statement added for this PHI node. This
>   information is recorded in a hash table 'changed_stmts_table'. We
>   hash on the new statement that is added and store the statement
>   along with the PHI node for which this statement was added.
>
>   b) Along with this, we also maintain a 'phi_nodes_info' link list
>   (pointed to by 'phi_nodes_info_head'). Each node in this list
>   contains a unique PHI node (representing a PHI node that was
>   removed by scalar evolution) and a 'count' that denotes the number
>   of statements added for that PHI node. 'record_changed_stmts' calls
>   'record_stmt_for_current_phi_node' so that the 'count' for the
>   'current_phi_node' is incremented in its corresponding
>   'phi_nodes_info' node. Therefore in the above memcpy example. We
>   would have four entries in 'changed_stmts_table', each containing
>   the following tuple.
>   <new_stmt_added, hashvalue, PHI_node, PHI_id,hashvalue_for_phi_node> i.e
>   the changed_stmts_table would look like
>
>    1. D.1234_11 = 4 * n ,  Hashval, src_2 = PHI <src_1>, 1,Hashval_phi_node
>    2. src_2 = src_pointer + D.1234_11,Hashval, src_2 = PHI <src_1>,
> 1,Hashval_phi_node
>    3. D.1235_22 = 4 * n ,  Hashval, dst_2 = PHI <dst_1>, 2,Hashval_phi_node
>    2. dst_2 = dst_pointer + D.1235_22,Hashval, dst_2 = PHI <dst_1>,
> 2,Hashval_phi_node
>
>    The phi_nodes_info link list would look like
>
>  __________________________________     ________________________________
>  |  phi_node = dst_2 = PHI <dst_1> |    |phi_node = src_2 = PHI <src_1>|
>  | count = 2                       |--->|count = 2                     |
>   _________________________________     ________________________________
>          ^
>          |
>          |
>     phi_nodes_info_head.
>
>   c) The need for hashvalue_for_phi_node: Since we dont release the
>   PHI node, sometimes, the PHI node that we have stored itself gets
>   changed (Trees are finally pointers and so are their
>   operands). Therefore while inserting we rehash the PHI node and
>   check the hashvalue with the stored hashvalue of the PHI node. If
>   it is the same only then we reinsert the PHI node. The hashvalue
>   for the phi node is generated by 'generate_phi_node_hashcode'
>
>   d) Notes about removing PHI nodes: When -ftree-fvr-undo is enabled,
>   then we alter slightly the way a PHI node is removed. Each Basic
>   block contains a chain of PHI nodes. When removing a PHI node say
>   'A', 'A' is removed from this chain and also 'released' i.e the SSA
>   NAMES in 'A' are released for reuse. When -ftree-fvr-undo is
>   enabled, we remove 'A' from the chain of PHI nodes for the
>   particular block but we _donot_ 'release' it. This also helps
>   preserve the immediate uses of PHI node.
>
> 3.2) Using the recorded information and undoing final value replacement
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   a) This happens in the function 'scev_finalize'. We again visit the
>   exit block of each loop. In the exit block of each loop we go
>   through each statement and try to find its corresponding entry in
>   'changed_stmts_table'. If found we extract the PHI node that it had
>   replaced and find the node for this phi node in phi_nodes_info. We
>   then decrement the 'count' member of this node (denoting that one
>   statement that was added for this PHI node still exists). If this
>   count decrements to Zero we should be able to remove these
>   statemens and reinsert the phi nodes if b is true. This is done by
>   'traverse_exit_block_match_instructions' and 'dec_phi_node_count'.
>   b) We again go through all the statements of the block. For each
>   statement we extract the PHI node it had replaced from the
>   'changed_stmts_table'. We then look up this PHI node in the
>   phi_nodes_info link list and if the count is zero and if the
>   argument of the PHI node still exists as a gimple variable we
>   remove the current statement. If the argument of the PHI node
>   doesnt exist anymore as a gimple variable (This is possible because
>   a number of passes run between the time that we record scalar
>   evolution informations and now when we are undoing final value
>   replacement) then we just increment the count of the PHI node to
>   indicate that this PHI node shouldnt be reinserted and the current
>   statement is not removed.
>   c) Once we have reached the end of the basic block we simply
>   travers the phi_nodes_info link list and insert the PHI nodes that
>   have 'count' zero. b and c are done by
>   'check_mark_phi_nodes_insert' and 'remove_stmts_insert_phi_nodes'.
>
>
> 4) Summary of key datastructures.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  a) changed_stmts_table - Hash table to record scalar evolution
>  changes.
>  b) phi_nodes_info - Link list, whose each node contains a PHI node
>  and a count indicating the number of statements that replaced this
>  phi node as a result of scalar evolution analysis.
>
>
>
> --
> Ramana Radhakrishnan
>

Re: query regarding adding a pass to undo final value replacement.

Reply via email to