On Wed, Dec 10, 2014 at 4:22 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Richard,
>
> Thanks for your reply!
>
> I didn't understand your point:
>
> Well, I don't mind splitting all critical edges unconditionally
>
> but you do it unconditionally in proposed patch.

I don't mind means I am fine with it.

> Also I assume that
> call of split_critical_edges() can break ssa. For example, we can
> split headers of loops, loop exit blocks etc.

How does that "break SSA"?  You mean loop-closed SSA?  I'd
be surprised if so but that may be possible.

> I prefer to do something
> more loop-specialized, e.g. call edge_split() for critical edges
> outgoing from bb ending with GIMPLE_COND stmt (assuming that edge
> destination bb belongs to loop).

That works for me as well but it is more complicated to implement.
Ideally you'd only split one edge if you find a block with only critical
predecessors (where we'd currently give up).  But note that this
requires re-computation of ifc_bbs in if_convertible_loop_p_1 and it
will change loop->num_nodes so we have to be more careful in
constructing the loop calling if_convertible_bb_p.

Richard.

>
> 2014-12-10 17:31 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>> On Wed, Dec 10, 2014 at 11:54 AM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>> Richard,
>>>
>>> Sorry that I forgot to delete debug dump from my fix.
>>> I have few questions about your comments.
>>>
>>> 1. You wrote :
>>>> You also still have two functions for PHI predication.  And the
>>>> new extended variant doesn't commonize the 2-args and general
>>>> path
>>>  Did you mean that I must combine predicate_scalar_phi and
>>> predicate_extended scalar phi to one function?
>>> Please note that if additional flag was not set up (i.e.
>>> aggressive_if_conv is false) extended predication is required more
>>> compile time since it builds hash_map.
>>
>> It's compile-time complexity is reasonable enough even for
>> non-aggressive if-conversion.
>>
>>> 2. About critical edge splitting.
>>>
>>> Did you mean that we should perform it (1) under aggressive_if_conv
>>> option only; (2) should we split all critical edges.
>>> Note that this leads to recomputing of topological order.
>>
>> Well, I don't mind splitting all critical edges unconditionally, thus
>> do something like
>>
>> Index: gcc/tree-if-conv.c
>> ===================================================================
>> --- gcc/tree-if-conv.c  (revision 218515)
>> +++ gcc/tree-if-conv.c  (working copy)
>> @@ -2235,12 +2235,21 @@ pass_if_conversion::execute (function *f
>>    if (number_of_loops (fun) <= 1)
>>      return 0;
>>
>> +  bool critical_edges_split_p = false;
>>    FOR_EACH_LOOP (loop, 0)
>>      if (flag_tree_loop_if_convert == 1
>>         || flag_tree_loop_if_convert_stores == 1
>>         || ((flag_tree_loop_vectorize || loop->force_vectorize)
>>             && !loop->dont_vectorize))
>> -      todo |= tree_if_conversion (loop);
>> +      {
>> +       if (!critical_edges_split_p)
>> +         {
>> +           split_critical_edges ();
>> +           critical_edges_split_p = true;
>> +           todo |= TODO_cleanup_cfg;
>> +         }
>> +       todo |= tree_if_conversion (loop);
>> +      }
>>
>>  #ifdef ENABLE_CHECKING
>>    {
>>
>>> It is worth noting that in current implementation bb's with 2
>>> predecessors and both are on critical edges are accepted without
>>> additional option.
>>
>> Yes, I know.
>>
>> tree-if-conv.c is a mess right now and if we can avoid adding more
>> to it and even fix the critical edge missed optimization with splitting
>> critical edges then I am all for that solution.
>>
>> Richard.
>>
>>> Thanks ahead.
>>> Yuri.
>>> 2014-12-09 18:20 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>> On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> Here is updated patch2 with the following changes:
>>>>> 1. Delete functions  phi_has_two_different_args and find_insertion_point.
>>>>> 2. Use only one function for extended predication -
>>>>> predicate_extended_scalar_phi.
>>>>> 3. Save gsi before insertion of predicate computations for basic
>>>>> blocks if it has 2 predecessors and
>>>>> both incoming edges are critical or it gas more than 2 predecessors
>>>>> and at least one incoming edge
>>>>> is critical. This saved iterator can be used by extended phi predication.
>>>>>
>>>>> Here is motivated test-case which explains this point.
>>>>> Test-case is attached (t5.c) and it must be compiled with -O2
>>>>> -ftree-loop-vectorize -fopenmp options.
>>>>> The problem phi is in bb-7:
>>>>>
>>>>>   bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 })
>>>>>   {
>>>>>     <bb 5>:
>>>>>     xmax_edge_18 = xmax_edge_36 + 1;
>>>>>     if (xmax_17 == xmax_27)
>>>>>       goto <bb 7>;
>>>>>     else
>>>>>       goto <bb 9>;
>>>>>
>>>>>   }
>>>>>   bb_6 (preds = {bb_4 }, succs = {bb_7 bb_8 })
>>>>>   {
>>>>>     <bb 6>:
>>>>>     if (xmax_17 == xmax_27)
>>>>>       goto <bb 7>;
>>>>>     else
>>>>>       goto <bb 8>;
>>>>>
>>>>>   }
>>>>>   bb_7 (preds = {bb_6 bb_5 }, succs = {bb_11 })
>>>>>   {
>>>>>     <bb 7>:
>>>>>     # xmax_edge_30 = PHI <xmax_edge_36(6), xmax_edge_18(5)>
>>>>>     xmax_edge_19 = xmax_edge_39 + 1;
>>>>>     goto <bb 11>;
>>>>>
>>>>>   }
>>>>>
>>>>> Note that both incoming edges to bb_7 are critical. If we comment out
>>>>> restoring gsi in predicate_all_scalar_phi:
>>>>> #if 0
>>>>>  if ((EDGE_COUNT (bb->preds) == 2 && all_preds_critical_p (bb))
>>>>>      || (EDGE_COUNT (bb->preds) > 2 && has_pred_critical_p (bb)))
>>>>>    gsi = bb_insert_point (bb);
>>>>>  else
>>>>> #endif
>>>>>    gsi = gsi_after_labels (bb);
>>>>>
>>>>> we will get ICE:
>>>>> t5.c: In function 'foo':
>>>>> t5.c:9:6: error: definition in block 4 follows the use
>>>>>  void foo (int n)
>>>>>       ^
>>>>> for SSA_NAME: _1 in statement:
>>>>> _52 = _1 & _3;
>>>>> t5.c:9:6: internal compiler error: verify_ssa failed
>>>>>
>>>>> smce predicate computations were inserted in bb_7.
>>>>
>>>> The issue is obviously that the predicates have already been emitted
>>>> in the target BB - that's of course the wrong place.  This is done
>>>> by insert_gimplified_predicates.
>>>>
>>>> This just shows how edge predicate handling is broken - we don't
>>>> seem to have a sequence of gimplified stmts for edge predicates
>>>> but push those to e->dest which makes this really messy.
>>>>
>>>> Rather than having a separate phase where we insert all
>>>> gimplified bb predicates we should do that on-demand when
>>>> predicating a PHI.
>>>>
>>>> Your patch writes to stderr - that's bad - use dump_file and guard
>>>> the printfs properly.
>>>>
>>>> You also still have two functions for PHI predication.  And the
>>>> new extended variant doesn't commonize the 2-args and general
>>>> paths.
>>>>
>>>> I'm not at all happy with this code.  It may be existing if-conv codes
>>>> fault but making it even worse is not an option.
>>>>
>>>> Again - what's wrong with simply splitting critical edges if
>>>> aggressive_if_conv?  I think that would very much simplify
>>>> things here.  Or alternatively use gsi_insert_on_edge and
>>>> commit edge insertions before merging the blocks.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> ChangeLog is
>>>>>
>>>>> 2014-12-09  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>
>>>>> * tree-if-conv.c : Include hash-map.h.
>>>>> (struct bb_predicate_s): Add new field to save copy of gimple
>>>>> statement iterator.
>>>>> (bb_insert_point): New function.
>>>>> (set_bb_insert_point): New function.
>>>>> (has_pred_critical_p): New function.
>>>>> (if_convertible_bb_p): Allow bb has more than 2 predecessors if
>>>>> AGGRESSIVE_IF_CONV is true.
>>>>> (if_convertible_bb_p): Delete check that bb has at least one
>>>>> non-critical incoming edge.
>>>>> (is_cond_scalar_reduction): Add arguments ARG_0, ARG_1 and EXTENDED.
>>>>> Allow interchange PHI arguments if EXTENDED is false.
>>>>> Change check that block containing reduction statement candidate
>>>>> is predecessor of phi-block since phi may have more than two arguments.
>>>>> (predicate_scalar_phi): Add new arguments for call of
>>>>> is_cond_scalar_reduction.
>>>>> (get_predicate_for_edge): New function.
>>>>> (struct phi_args_hash_traits): New type.
>>>>> (phi_args_hash_traits::hash): New function.
>>>>> (phi_args_hash_traits::equal_keys): New function.
>>>>> (gen_phi_arg_condition): New function.
>>>>> (predicate_extended_scalar_phi): New function.
>>>>> (predicate_all_scalar_phis): Add boolean variable EXTENDED and set it
>>>>> to true if BB containing phi has more than 2 predecessors or both
>>>>> incoming edges are critical. Invoke find_phi_replacement_condition and
>>>>> predicate_scalar_phi if EXTENDED is false. Use saved gsi if BB
>>>>> has 2 predecessors and both incoming edges are critical or it has more
>>>>> than 2 predecessors and atleast one incoming edge is critical.
>>>>> Use standard gsi_after_labels otherwise.
>>>>> Invoke predicate_extended_scalar_phi if EXTENDED is true.
>>>>> (insert_gimplified_predicates): Add bool variable EXTENDED_PREDICATION
>>>>> to save gsi before insertion of predicate computations. SEt-up it to
>>>>> true for BB with 2 predecessors and critical incoming edges either
>>>>>         number of predecessors is geater 2 and at least one incoming edge 
>>>>> is
>>>>> critical.
>>>>> Add check that non-predicated block may have statements to insert.
>>>>> Insert predicate computation of BB just after label if
>>>>> EXTENDED_PREDICATION is true.
>>>>> (tree_if_conversion): Add initialization of AGGRESSIVE_IF_CONV which
>>>>> is copy of inner or outer loop force_vectorize field.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2014-12-04 16:37 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>> On Thu, Dec 4, 2014 at 2:15 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> I did simple change by saving gsi iterator for each bb that has
>>>>>>> critical edges by adding additional field to bb_predicate_s:
>>>>>>>
>>>>>>> typedef struct bb_predicate_s {
>>>>>>>
>>>>>>>   /* The condition under which this basic block is executed.  */
>>>>>>>   tree predicate;
>>>>>>>
>>>>>>>   /* PREDICATE is gimplified, and the sequence of statements is
>>>>>>>      recorded here, in order to avoid the duplication of computations
>>>>>>>      that occur in previous conditions.  See PR44483.  */
>>>>>>>   gimple_seq predicate_gimplified_stmts;
>>>>>>>
>>>>>>>   /* Insertion point for blocks having incoming critical edges.  */
>>>>>>>   gimple_stmt_iterator gsi;
>>>>>>> } *bb_predicate_p;
>>>>>>>
>>>>>>> and this iterator is saved in  insert_gimplified_predicates before
>>>>>>> insertion code for predicate computation. I checked that this fix
>>>>>>> works.
>>>>>>
>>>>>> Huh?  I still wonder what the issue is with inserting everything
>>>>>> after the PHI we predicate.
>>>>>>
>>>>>> Well, your updated patch will come with testcases for the testsuite
>>>>>> that will hopefully fail if doing that.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>>
>>>>>>> Now I am implementing merging of predicate_extended.. and
>>>>>>> predicate_arbitrary.. functions as you proposed.
>>>>>>>
>>>>>>> Best regards.
>>>>>>> Yuri.
>>>>>>>
>>>>>>> 2014-12-04 15:41 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>>>> On Tue, Dec 2, 2014 at 4:28 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>> Thanks Richard for your quick reply!
>>>>>>>>>
>>>>>>>>> 1. I agree that we can combine predicate_extended_ and
>>>>>>>>> predicate_arbitrary_ to one function as you proposed.
>>>>>>>>> 2. What is your opinion about using more simple decision about
>>>>>>>>> insertion point - if bb has use of phi result insert phi predication
>>>>>>>>> before it and at the bb end otherwise. I assume that critical edge
>>>>>>>>> splitting is not a good decision.
>>>>>>>>
>>>>>>>> Why not always insert before the use?  Which would be after labels,
>>>>>>>> what we do for two-arg PHIs.  That is, how can it be that you predicate
>>>>>>>> a PHI in BB1 and then for an edge predicate on one of its incoming
>>>>>>>> edges you get SSA uses with defs that are in BB1 itself?  That
>>>>>>>> can only happen for backedges but those you can't remove in any case.
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards.
>>>>>>>>> Yuri.
>>>>>>>>>
>>>>>>>>> 2014-12-02 16:28 GMT+03:00 Richard Biener 
>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>> On Mon, Dec 1, 2014 at 4:53 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>
>>>>>>>>>>> I resend you patch1 and patch2 with minor changes:
>>>>>>>>>>> 1. I renamed flag_force_vectorize to aggressive_if_conv.
>>>>>>>>>>> 2. Use static cast for the first argument of gimple_phi_arg_edge.
>>>>>>>>>>> I also very sorry that I sent you bad patch.
>>>>>>>>>>>
>>>>>>>>>>> Now let me answer on your questions related to second patch.
>>>>>>>>>>> 1. Why we need both predicate_extended_scalar_phi and
>>>>>>>>>>> predicate_arbitrary_scalar_phi?
>>>>>>>>>>>
>>>>>>>>>>> Let's consider the following simple test-case:
>>>>>>>>>>>
>>>>>>>>>>>   #pragma omp simd safelen(8)
>>>>>>>>>>>   for (i=0; i<512; i++)
>>>>>>>>>>>   {
>>>>>>>>>>>     float t = a[i];
>>>>>>>>>>>     if (t > 0.0f & t < 1.0e+17f)
>>>>>>>>>>>       if (c[i] != 0)  /* c is integer array. */
>>>>>>>>>>> res += 1;
>>>>>>>>>>>   }
>>>>>>>>>>>
>>>>>>>>>>> we can see the following phi node correspondent to res:
>>>>>>>>>>>
>>>>>>>>>>> # res_1 = PHI <res_15(3), res_15(4), res_10(5)>
>>>>>>>>>>>
>>>>>>>>>>> It is clear that we can optimize it to phi node with 2 arguments 
>>>>>>>>>>> only
>>>>>>>>>>> and only one check can be used for phi predication (for reduction in
>>>>>>>>>>> our case), namely predicate of bb_5. In general case we can't do it
>>>>>>>>>>> even if we sort all phi argument values since we still have to 
>>>>>>>>>>> produce
>>>>>>>>>>> a chain of cond expressions to perform phi predication (see comments
>>>>>>>>>>> for predicate_arbitrary_scalar_phi).
>>>>>>>>>>
>>>>>>>>>> How so?  We can always use !(condition) for the "last" value, thus
>>>>>>>>>> treat it as an 'else' case.  That even works for
>>>>>>>>>>
>>>>>>>>>> # res_1 = PHI <res_15(3), res_15(4), res_10(5), res_10(7)>
>>>>>>>>>>
>>>>>>>>>> where the condition for edges 5 and 7 can be computed as
>>>>>>>>>> ! (condition for 3 || condition for 4).
>>>>>>>>>>
>>>>>>>>>> Of course it is worthwhile to also sort single-occurances first
>>>>>>>>>> so your case gets just the condiiton for edge 5 and its inversion
>>>>>>>>>> used for edges 3 and 4 combined.
>>>>>>>>>>
>>>>>>>>>>> 2. Why we need to introduce find_insertion_point?
>>>>>>>>>>>  Let's consider another test-case extracted from 175.vpr ( t5.c is
>>>>>>>>>>> attached) and we can see that bb_7 and bb_9 containig phi nodes has
>>>>>>>>>>> only critical incoming edges and both contain code computing edge
>>>>>>>>>>> predicates, e.g.
>>>>>>>>>>>
>>>>>>>>>>> <bb 7>:
>>>>>>>>>>> # xmax_edge_30 = PHI <xmax_edge_36(6), xmax_edge_18(5)>
>>>>>>>>>>> _46 = xmax_17 == xmax_37;
>>>>>>>>>>> _47 = xmax_17 == xmax_27;
>>>>>>>>>>> _48 = _46 & _47;
>>>>>>>>>>> _53 = xmax_17 == xmax_37;
>>>>>>>>>>> _54 = ~_53;
>>>>>>>>>>> _55 = xmax_17 == xmax_27;
>>>>>>>>>>> _56 = _54 & _55;
>>>>>>>>>>> _57 = _48 | _56;
>>>>>>>>>>> xmax_edge_19 = xmax_edge_39 + 1;
>>>>>>>>>>> goto <bb 11>;
>>>>>>>>>>>
>>>>>>>>>>> It is evident that we can not put phi predication at the block
>>>>>>>>>>> beginning but need to put it after predicate computations.
>>>>>>>>>>> Note also that if there are no critical edges for phi arguments
>>>>>>>>>>> insertion point will be "after labels" Note also that phi result can
>>>>>>>>>>> have use in this block too, so we can't put predication code to the
>>>>>>>>>>> block end.
>>>>>>>>>>
>>>>>>>>>> So the issue is that predicate insertion for edge predicates does
>>>>>>>>>> not happen on the edge but somewhere else (generally impossible
>>>>>>>>>> for critical edges unless you split them).
>>>>>>>>>>
>>>>>>>>>> I think I've told you before that I prefer simple solutions to such 
>>>>>>>>>> issues,
>>>>>>>>>> like splitting the edge!  Certainly not involving a function walking
>>>>>>>>>> GENERIC expressions.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>> Let me know if you still have any questions.
>>>>>>>>>>>
>>>>>>>>>>> Best regards.
>>>>>>>>>>> Yuri.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2014-11-28 15:43 GMT+03:00 Richard Biener 
>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>> On Wed, Nov 12, 2014 at 2:35 PM, Yuri Rumyantsev 
>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is the second patch related to extended predication.
>>>>>>>>>>>>> Few comments which explain a main goal of design.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. I don't want to insert any critical edge splitting since it may
>>>>>>>>>>>>> lead to less efficient binaries.
>>>>>>>>>>>>> 2. One special case of extended PHI node predication was 
>>>>>>>>>>>>> introduced
>>>>>>>>>>>>> when #arguments is more than 2 but only two arguments are 
>>>>>>>>>>>>> different
>>>>>>>>>>>>> and one argument has the only occurrence. For such PHI conditional
>>>>>>>>>>>>> scalar reduction is applied.
>>>>>>>>>>>>> This is correspondent to the following statement:
>>>>>>>>>>>>>     if (q1 && q2 && q3) var++
>>>>>>>>>>>>>  New function phi_has_two_different_args was introduced to detect 
>>>>>>>>>>>>> such phi.
>>>>>>>>>>>>> 3. Original algorithm for PHI predication used assumption that at
>>>>>>>>>>>>> least one incoming edge for blocks containing PHI is not critical 
>>>>>>>>>>>>> - it
>>>>>>>>>>>>> guarantees that all computations related to predicate of normal 
>>>>>>>>>>>>> edge
>>>>>>>>>>>>> are already inserted above this block and
>>>>>>>>>>>>> code related to PHI predication can be inserted at the beginning 
>>>>>>>>>>>>> of
>>>>>>>>>>>>> block. But this is not true for critical edges for which predicate
>>>>>>>>>>>>> computations are  in the block where code for phi predication 
>>>>>>>>>>>>> must be
>>>>>>>>>>>>> inserted. So new function find_insertion_point is introduced 
>>>>>>>>>>>>> which is
>>>>>>>>>>>>> simply found out the last statement in block defining predicates
>>>>>>>>>>>>> correspondent to all incoming edges and insert phi predication 
>>>>>>>>>>>>> code
>>>>>>>>>>>>> after it (with some minor exceptions).
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately the patch doesn't apply for me - I get
>>>>>>>>>>>>
>>>>>>>>>>>> patch: **** malformed patch at line 505: @@ -1720,6 +2075,8 @@
>>>>>>>>>>>> predicate_all_scalar_phis (struct loop *loop)
>>>>>>>>>>>>
>>>>>>>>>>>> a few remarks nevertheless.  I don't see how we need both
>>>>>>>>>>>> predicate_extended_scalar_phi and predicate_arbitrary_scalar_phi.
>>>>>>>>>>>> Couldn't we simply sort an array of (edge, value) pairs after value
>>>>>>>>>>>> and handle equal values specially in predicate_extended_scalar_phi?
>>>>>>>>>>>> That would even make PHI <a, a, b, c, c> more optimal.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't understand the need for find_insertion_point.  All SSA 
>>>>>>>>>>>> names
>>>>>>>>>>>> required for the predicates are defined upward - and the complex 
>>>>>>>>>>>> CFG
>>>>>>>>>>>> is squashed to a single basic-block, thus the defs will dominate 
>>>>>>>>>>>> the
>>>>>>>>>>>> inserted code if you insert after labels just like for the other 
>>>>>>>>>>>> case.
>>>>>>>>>>>> Or what am I missing?  ("flattening" of the basic-blocks of course 
>>>>>>>>>>>> needs
>>>>>>>>>>>> to happen in dominator order - but I guess that happens already?)
>>>>>>>>>>>>
>>>>>>>>>>>> I'd like the extended PHI handling to be enablable by a flag even
>>>>>>>>>>>> for !force-vectorization - I've seen cases with 3 PHI args multiple
>>>>>>>>>>>> times that would have been nice to vectorize.  I suggest to
>>>>>>>>>>>> add -ftree-loop-if-convert-aggressive for this.  We can do this as
>>>>>>>>>>>> followup, but please rename the local flag_force_vectorize flag
>>>>>>>>>>>> to something less looking like a flag, like simply 'aggressive'.
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise patch 2 looks ok to me.
>>>>>>>>>>>>
>>>>>>>>>>>> Richard.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2014-10-24  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> * tree-if-conv.c (ifcvt_can_use_mask_load_store): Use
>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE instead of loop flag.
>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than 2 predecessors if
>>>>>>>>>>>>> FLAG_FORCE_VECTORIZE is true.
>>>>>>>>>>>>> (if_convertible_bb_p): Delete check that bb has at least one
>>>>>>>>>>>>> non-critical incoming edge.
>>>>>>>>>>>>> (phi_has_two_different_args): New function.
>>>>>>>>>>>>> (is_cond_scalar_reduction): Add argument EXTENDED to choose access
>>>>>>>>>>>>> to phi arguments. Invoke phi_has_two_different_args to get phi
>>>>>>>>>>>>> arguments if EXTENDED is true. Change check that block
>>>>>>>>>>>>> containing reduction statement candidate is predecessor
>>>>>>>>>>>>> of phi-block since phi may have more than two arguments.
>>>>>>>>>>>>> (convert_scalar_cond_reduction): Add argument BEFORE to insert
>>>>>>>>>>>>> statement before/after gsi point.
>>>>>>>>>>>>> (predicate_scalar_phi): Add argument false (which means 
>>>>>>>>>>>>> non-extended
>>>>>>>>>>>>> predication) to call of is_cond_scalar_reduction. Add argument
>>>>>>>>>>>>> true (which correspondent to argument BEFORE) to call of
>>>>>>>>>>>>> convert_scalar_cond_reduction.
>>>>>>>>>>>>> (get_predicate_for_edge): New function.
>>>>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>>>>>>>>> (find_insertion_point): New function.
>>>>>>>>>>>>> (predicate_all_scalar_phis): Add two boolean variables EXTENDED 
>>>>>>>>>>>>> and
>>>>>>>>>>>>> BEFORE. Initialize EXTENDED to true if BB containing phi has more
>>>>>>>>>>>>> than 2 predecessors or both incoming edges are critical. Invoke
>>>>>>>>>>>>> find_phi_replacement_condition and predicate_scalar_phi or
>>>>>>>>>>>>> find_insertion_point and predicate_extended_scalar_phi depending 
>>>>>>>>>>>>> on
>>>>>>>>>>>>> EXTENDED value.
>>>>>>>>>>>>> (insert_gimplified_predicates): Add check that non-predicated 
>>>>>>>>>>>>> block
>>>>>>>>>>>>> may have statements to insert. Insert predicate of BB just after 
>>>>>>>>>>>>> label
>>>>>>>>>>>>> if FLAG_FORCE_VECTORIZE is true.
>>>>>>>>>>>>> (tree_if_conversion): Add initialization of FLAG_FORCE_VECTORIZE 
>>>>>>>>>>>>> which
>>>>>>>>>>>>> is copy of inner or outer loop field force_vectorize.

Reply via email to