On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> wrote: > Hi , > > Based on the conversation in the thread at > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > pass trying to undo final value replacement going. The initial > implementation was done by Pranav Bhandarkar when he was employed at > Azingo as part of work sponsored by Icera Semiconductor. I've been > trying to get this working with my private port over here. We intend > to contribute this back once our copyright assignments are sorted and > if this will be acceptable to all. I've been getting a few compile > time ICEs with this approach and haven't been able to resolve them > well enough yet. Whilst doing so, I wanted to check on the approach as > outlined below and ask if there's anything that we might have missed > or any problem that one can see with us going along these lines. > Thanks for your time and patience.
Some quick comments. First, do you have a non-pseudo-code testcase that exposes the extra computations? Second, I think rather than trying to undo what SCEV const(!)-prop is doing adjust its cost model (maybe there isn't one) to not create the costly substitutions. Thanks, Richard. > cheers > Ramana > > > > 1) Understanding what scalar evolution does. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Consider the following pseudo code. > > function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++; <---- Inc by 4 bytes i.e 1 word. > conditional jump to L1 > <fallthrough edge> > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > src_2 = PHI <src_1> > dst_2 = PHI <dst_1> > > <fall through edge> > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > <conditional jump to L2> > > > Now scalar evolution convertes this into > > Function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++; <---- Inc by 4 bytes i.e 1 word. > conditional jump to L1 > > <fallthrough edge> > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > D.1234_11 = 4 * n (where 'n' is the number of iterations of L1) > src_2 = src_pointer + D.1234_11 > D.1235_22 = 4 * n > dst_2 = dst_pointer + D.1235_22 > > <fallthrough edge> > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > <conditional jump to L2> > > > > Therefore a PHI Node is replaced by the final value of src_1 and > dst_1, thus introducing extra computations. > > > 2) How to undo what scalar evolution does. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > To undo what scalar evolution does, we need to record the changes that > scalar evolution makes and then after the loop optimizations are run > we need to replace the PHI nodes that were removed earlier in place of > the computations introduced by scalar evolution. > > A high level description of the process is listed here. (see > tree-scalar-evolution.c and grep for DXP_SPECIFIC) > > Explanation of this sub-pass. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Part 1: Record Final Value replacement related changes. > ~~~~~~ > Final value replacement replaces PHI nodes at the exits of loops with > computations based on the number of iterations of the loop. > For e.g. > <bb 1> > L1: > > x_1 = src + 4 > ... > ... > conditional jump to L1. > <bb 2> (Loop Exit) > x_2 = PHI <x_1> <---- Phi node added by rewrite_into_loop_closed_ssa. > (see the loopinit pass). > > Final Value replacement replaces the PHI node in <bb 2> by > > ssa_temp_var = 4 * no_of_iterations_of_loop > x_2 = src + ssa_temp_var; > > Therefore a PHI node is replaced by computations. > > Recording final value replacement related changes is controlled by the > variable record_scalar_evolution_changes. When set to a non-zero value > the function record_changed_stmts records the changes made. The changes > are recorded in a hashtable changed_stmts_table. The hashtable contains > the stmt added, the PHI node for which this stmt was added and hashcodes > for both the stmt and the phi_node. We also note how many computations > have been added for each of the removed PHI nodes. This is done in a > link list pointed to by phi_nodes_info_head. > > Part 2: Undo Final value replacement related changes > ~~~~~~ > This is the part where the new computations are removed and the PHI nodes > that they are replaced are inserted back in. This replacement is > contingent to a few conditions. > a) All the computations that were added are still present in the basic > block. i.e all the computations are still present in the form in > which they were added and havent been touched by any of the loop > optimizations passes that run between the scalar evolution pass (i.e the > pass when Part 1 is executed) and the 'loopdone' pass. We go > through the exit basic block and look up each stmt in > changed_stmt_table. If found we lookup the corresponding PHI node in the > phi_node_info link list and decrement its count by 1 (count here denotes > the number of computations added. When count it 0 it means all the > computatins added in the scalar evolution pass have been found in the > same form in the loop done pass such a PHI node can be inserted back in > if 'b' is also true). > b) If any PHI node has count zero it can be inserted back and its > corresponding computations removed, iff the argument of the PHI node > still exists as an SSA variable. This means that we can insert > a_1 = PHI <D.10_1> if D.10_1 still exists and hasnt been removed by > any of the passes between the scalar evolution pass and the > loopdone pass. > > > > > 3) Implementation details including the issues in 2. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > 3.1} Recording the Changes made by scalar evolution. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > a) We record changes in tree-scalar-evolution.c in the scalar > evolution pass (see function scev_const_prop). We visit the exit > block of each loop and If we find a PHI node with only one argument > then we set 'record_scalar_evolution_changes' to 1. We also set > 'current_phi_node' to this particular PHI node. Then via a call > to 'record_changed_stmts' from 'force_gimple_operand_bsi' in gimplify.c > we record the new statement added for this PHI node. This > information is recorded in a hash table 'changed_stmts_table'. We > hash on the new statement that is added and store the statement > along with the PHI node for which this statement was added. > > b) Along with this, we also maintain a 'phi_nodes_info' link list > (pointed to by 'phi_nodes_info_head'). Each node in this list > contains a unique PHI node (representing a PHI node that was > removed by scalar evolution) and a 'count' that denotes the number > of statements added for that PHI node. 'record_changed_stmts' calls > 'record_stmt_for_current_phi_node' so that the 'count' for the > 'current_phi_node' is incremented in its corresponding > 'phi_nodes_info' node. Therefore in the above memcpy example. We > would have four entries in 'changed_stmts_table', each containing > the following tuple. > <new_stmt_added, hashvalue, PHI_node, PHI_id,hashvalue_for_phi_node> i.e > the changed_stmts_table would look like > > 1. D.1234_11 = 4 * n , Hashval, src_2 = PHI <src_1>, 1,Hashval_phi_node > 2. src_2 = src_pointer + D.1234_11,Hashval, src_2 = PHI <src_1>, > 1,Hashval_phi_node > 3. D.1235_22 = 4 * n , Hashval, dst_2 = PHI <dst_1>, 2,Hashval_phi_node > 2. dst_2 = dst_pointer + D.1235_22,Hashval, dst_2 = PHI <dst_1>, > 2,Hashval_phi_node > > The phi_nodes_info link list would look like > > __________________________________ ________________________________ > | phi_node = dst_2 = PHI <dst_1> | |phi_node = src_2 = PHI <src_1>| > | count = 2 |--->|count = 2 | > _________________________________ ________________________________ > ^ > | > | > phi_nodes_info_head. > > c) The need for hashvalue_for_phi_node: Since we dont release the > PHI node, sometimes, the PHI node that we have stored itself gets > changed (Trees are finally pointers and so are their > operands). Therefore while inserting we rehash the PHI node and > check the hashvalue with the stored hashvalue of the PHI node. If > it is the same only then we reinsert the PHI node. The hashvalue > for the phi node is generated by 'generate_phi_node_hashcode' > > d) Notes about removing PHI nodes: When -ftree-fvr-undo is enabled, > then we alter slightly the way a PHI node is removed. Each Basic > block contains a chain of PHI nodes. When removing a PHI node say > 'A', 'A' is removed from this chain and also 'released' i.e the SSA > NAMES in 'A' are released for reuse. When -ftree-fvr-undo is > enabled, we remove 'A' from the chain of PHI nodes for the > particular block but we _donot_ 'release' it. This also helps > preserve the immediate uses of PHI node. > > 3.2) Using the recorded information and undoing final value replacement > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > a) This happens in the function 'scev_finalize'. We again visit the > exit block of each loop. In the exit block of each loop we go > through each statement and try to find its corresponding entry in > 'changed_stmts_table'. If found we extract the PHI node that it had > replaced and find the node for this phi node in phi_nodes_info. We > then decrement the 'count' member of this node (denoting that one > statement that was added for this PHI node still exists). If this > count decrements to Zero we should be able to remove these > statemens and reinsert the phi nodes if b is true. This is done by > 'traverse_exit_block_match_instructions' and 'dec_phi_node_count'. > b) We again go through all the statements of the block. For each > statement we extract the PHI node it had replaced from the > 'changed_stmts_table'. We then look up this PHI node in the > phi_nodes_info link list and if the count is zero and if the > argument of the PHI node still exists as a gimple variable we > remove the current statement. If the argument of the PHI node > doesnt exist anymore as a gimple variable (This is possible because > a number of passes run between the time that we record scalar > evolution informations and now when we are undoing final value > replacement) then we just increment the count of the PHI node to > indicate that this PHI node shouldnt be reinserted and the current > statement is not removed. > c) Once we have reached the end of the basic block we simply > travers the phi_nodes_info link list and insert the PHI nodes that > have 'count' zero. b and c are done by > 'check_mark_phi_nodes_insert' and 'remove_stmts_insert_phi_nodes'. > > > 4) Summary of key datastructures. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > a) changed_stmts_table - Hash table to record scalar evolution > changes. > b) phi_nodes_info - Link list, whose each node contains a PHI node > and a count indicating the number of statements that replaced this > phi node as a result of scalar evolution analysis. > > > > -- > Ramana Radhakrishnan >