> Am 15.04.2023 um 10:30 schrieb Jakub Jelinek via Gcc-patches
> :
>
> Hi!
>
> The following patch is just a dumb improvement, gets rid of 2 unnecessary
> instructions on both the PR's original testcase and on the two reduced ones,
> both on -mcpu=neoverse-v1 and -mavx512f.
>
> The thing is, if we have args_len (args_len >= 2) unique PHI arguments,
> we need only args_len - 1 COND_EXPRs to expand the PHI, because first
> COND_EXPR can merge 2 unique arguments and all the following ones merge
> another unique argument with the previously merged arguments,
> while the code for mysterious reasons was always emitting args_len
> COND_EXPRs, where the first COND_EXPR merged the first and second unique
> arguments, the second COND_EXPR merged the second unique argument with
> result of merging the first and second unique arguments and the rest was
> already expectable, nth COND_EXPR for n > 2 merged the nth unique argument
> with result of merging the previous unique arguments.
> Now, in my understanding, the bb_predicate for bb's predecessor need to
> form a disjunct set which together creates the successor's bb_predicate,
> so I don't see why we'd need to check all the bb_predicates, if we check
> all but one then when all those other ones are false the last bb_predicate
> is necessarily true. Given that the code attempts to sort argument with
> most occurrences (so likely most complex combined predicate) last, I chose
> not to test that last argument's predicate.
> So e.g. on the testcase from comment 47 in the PR:
> void
> foo (int *f, int d, int e)
> {
> for (int i = 0; i < 1024; i++)
>{
> int a = f[i];
> int t;
> if (a < 0)
>t = 1;
> else if (a < e)
>t = 1 - a * d;
> else
>t = 0;
> f[i] = t;
>}
> }
> we used to emit:
> _7 = a_10 < 0;
> _21 = a_10 >= 0;
> _22 = a_10 < e_11(D);
> _23 = _21 & _22;
> _26 = a_10 >= e_11(D);
> _27 = _21 & _26;
> _ifc__42 = _7 ? 1 : t_13;
> _ifc__43 = _23 ? t_13 : _ifc__42;
> t_6 = _27 ? 0 : _ifc__43;
> while the following patch changes it to:
> _7 = a_10 < 0;
> _21 = a_10 >= 0;
> _22 = a_10 < e_11(D);
> _23 = _21 & _22;
> _ifc__42 = _23 ? t_13 : 0;
> t_6 = _7 ? 1 : _ifc__42;
> which I believe should be sufficient for a PHI <1, t_13, 0>.
>
> I've gathered some statistics and on x86_64-linux and i686-linux
> bootstraps/regtests, this code triggers:
> 92 4 4
>112 2 4
>141 3 4
> 4046 3 3
> (where 2nd number is args_len and 3rd argument EDGE_COUNT (bb->preds)
> and first argument count of those from sort | uniq -c | sort -n).
> In all these cases the patch should squeze one extra COND_EXPR and
> its associated predicate (the latter only if it wasn't used elsewhere).
>
> Incrementally, I think we should try to perform some analysis on which
> predicates depend on inverses of other predicates and if possible try
> to sort the arguments better and omit testing unnecessary predicates.
> So essentially for the above testcase deconstruct it back to:
> _7 = a_10 < 0;
> _22 = a_10 < e_11(D);
> _ifc__42 = _22 ? t_13 : 0;
> t_6 = _7 ? 1 : _ifc__42;
> which is like what this patch produces, but with the & a_10 >= 0 part
> removed, because the last predicate is a_10 < 0 and so testing a_10 >= 0
> on what appears on the false branch doesn't make sense.
> But I'm afraid that will take more work than is doable in stage4 right now.
Agreed.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Yes - thanks for spotting this obvious improvement.
Richard
> 2023-04-15 Jakub Jelinek
>
>PR tree-optimization/109154
>* tree-if-conv.cc (predicate_scalar_phi): For complex PHIs, emit just
>args_len - 1 COND_EXPRs rather than args_len. Formatting fix.
>
> --- gcc/tree-if-conv.cc.jj2023-04-12 08:53:58.264496474 +0200
> +++ gcc/tree-if-conv.cc2023-04-14 21:02:42.403826690 +0200
> @@ -2071,7 +2071,7 @@ predicate_scalar_phi (gphi *phi, gimple_
> }
>
> /* Put element with max number of occurences to the end of ARGS. */
> - if (max_ind != -1 && max_ind +1 != (int) args_len)
> + if (max_ind != -1 && max_ind + 1 != (int) args_len)
> std::swap (args[args_len - 1], args[max_ind]);
>
> /* Handle one special case when number of arguments with different values
> @@ -2116,12 +2116,12 @@ predicate_scalar_phi (gphi *phi, gimple_
> vec *indexes;
> tree type = TREE_TYPE (gimple_phi_result (phi));
> tree lhs;
> - arg1 = args[1];
> - for (i = 0; i < args_len; i++)
> + arg1 = args[args_len - 1];
> + for (i = args_len - 1; i > 0; i--)
>{
> - arg0 = args[i];
> - indexes = phi_arg_map.get (args[i]);
> - if (i != args_len - 1)
> + arg0 = args[i - 1];
> + indexes = phi_arg_map.get (args[i - 1]);
> + if (i != 1)
>lhs = make_temp_ssa_name (type, NULL, "_ifc_");
> else
>lhs = res;
>
>Jakub
>