No, this is quite different issue related to safety of load/stores which are on branched paths, i.e. not always executed and may trap. Note that we don't have such issue for HSW which have masked load/stores.
Yuri. 2014-05-05 11:44 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: > On Wed, Apr 30, 2014 at 5:50 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >> Thanks a lot Richard for you review. >> I did all proposed changes, checked that bootstrap and regression >> testing did not show new failures. >> Updated patch is attached. > > As said, this is ok for checkin. > > Thanks, > Richard. > >> Best regards. >> Yuri. >> >> 2014-04-30 16:40 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>> On Tue, Apr 29, 2014 at 4:29 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>> 2014-04-28 16:16 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>> On Thu, Apr 17, 2014 at 3:09 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> Hi All, >>>>>> >>>>>> We implemented enhancement for if-convert phase to recognize the >>>>>> simplest conditional reduction and to transform it vectorizable form, >>>>>> e.g. statement >>>>>> if (A[i] != 0) num+= 1; will be recognized. >>>>>> A new test-case is also provided. >>>>>> >>>>>> Bootstrapping and regression testing did not show any new failures. >>>>> >>>>> Clever. Can you add a testcase with a non-constant but invariant >>>>> reduction value and one with a variable reduction value as well? >>>>> >>>> [Yuri] >>>> I added another testcase to demonstrate ability of new algorithm, i.e. >>>> it transforms if (a[i] != 0) sum += a[i]; >>>> to unconditional form without check on zero. Note also that any checks >>>> on non-reduction operand were deleted. >>>> >>>>> + if (!(is_cond_scalar_reduction (arg_0, &reduc, &op0, &op1) >>>>> + || is_cond_scalar_reduction (arg_1, &reduc, &op0, &op1))) >>>>> >>>>> Actually one of the args should be defined by a PHI node in the >>>>> loop header and the PHI result should be the PHI arg on the >>>>> latch edge, so I'd pass both PHI args to the predicate and do >>>>> the decision on what the reduction op is there (you do that >>>>> anyway). The pattern matching is somewhat awkward >>>>> >>>> [Yuri] >>>> I changed prototype of 'is_cond_scalar_reduction' and now we have >>>> only one call: >>>> + if (!is_cond_scalar_reduction (phi, &reduc, &op0, &op1)) >>>> >>>>> + /* Consider only conditional reduction. */ >>>>> + bb = gimple_bb (stmt); >>>>> + if (!bb_has_predicate (bb)) >>>>> + return false; >>>>> + if (is_true_predicate (bb_predicate (bb))) >>>>> + return false; >>>>> >>>>> should be replaced by matching the PHI structure >>>>> >>>>> loop-header: >>>>> reduc_1 = PHI <..., reduc_2> >>>>> ... >>>>> if (..) >>>>> reduc_3 = ... >>>>> reduc_2 = PHI <reduc_1, reduc_3> >>>>> >>>> [Yuri] >>>> In fact, I re-wrote this function completely as you proposed using >>>> PHI structure matching. >>>> >>>>> + lhs = gimple_assign_lhs (stmt); >>>>> + if (TREE_CODE (lhs) != SSA_NAME) >>>>> + return false; >>>>> >>>>> always true, in fact lhs == arg. >>>>> >>>> [Yuri] >>>> Fixed. >>>> >>>>> + if (SSA_NAME_VAR (lhs) == NULL) >>>>> + return false; >>>>> >>>> [Yuri] >>>> Deleted. >>>>> no need to check that (or later verify SSA_NAME_VAR equivalency), not >>>>> sure why you think you need that. >>>>> >>>>> + if (!single_imm_use (lhs, &use, &use_stmt)) >>>>> + return false; >>>>> + if (gimple_code (use_stmt) != GIMPLE_PHI) >>>>> + return false; >>>>> >>>>> checking has_single_use (arg) is enough. The above is error-prone >>>>> wrt debug statements. >>>>> >>>> [Yuri] Only proposed check is used: >>>> + if (!has_single_use (lhs)) >>>> + return false; >>>> >>>>> + if (reduction_op == PLUS_EXPR && >>>>> + TREE_CODE (r_op2) == SSA_NAME) >>>>> >>>>> && goes to the next line >>>>> >>>> [Yuri] >>>> Fixed. >>>> >>>>> + if (TREE_CODE (r_op2) != INTEGER_CST && TREE_CODE (r_op2) != REAL_CST) >>>>> + return false; >>>>> >>>>> any reason for this check? The vectorizer can cope with >>>>> loop invariant non-constant values as well (at least). >>>>> >>>> [Yuri] >>>> This checks were deleted, i.e. any operand is acceptable. >>>> >>>>> + /* Right operand is constant, check that left operand is equal to lhs. >>>>> */ >>>>> + if (SSA_NAME_VAR (lhs) != SSA_NAME_VAR (r_op1)) >>>>> + return false; >>>>> >>>>> see above - that looks weird. >>>>> >>>> [Yuri] >>>> This code was deleted. >>>>> Note that I think you may introduce undefined overflow in >>>>> unconditionally executing the increment. So you need to >>>>> make sure to re-write the increment in unsigned arithmetic >>>>> (for integral types, that is). >>>> [Yuri] >>>> I did not catch your point: if we have >>>> if (cond) s += val; >>>> it will be transformed to >>>> s += (cond? val: 0) >>>> which looks like completely equivalent to original stmt. >>> >>> Ah indeed. >>> >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> >>>>> >>>>>> Is it OK for trunk? >>>>>> >>>>>> gcc/ChangeLog: >>>>>> 2014-04-17 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> >>>>>> * tree-if-conv.c (is_cond_scalar_reduction): New function. >>>>>> (convert_scalar_cond_reduction): Likewise. >>>>>> (predicate_scalar_phi): Add recognition and transformation >>>>>> of simple conditioanl reduction to be vectorizable. >>>>>> >>>>>> gcc/testsuite/ChangeLog: >>>>>> 2014-04-17 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> >>>>>> * gcc.dg/cond-reduc.c: New test. >>>> >>>> New patch is added which includes also new test to detect >>>> vectorization of conditional reduction with non-invariant operand. All >>>> remarks found by Richard were fixed. >>>> >>>> Bootstrap and regression testing did not show any new failures. >>>> Is it OK for trunk? >>> >>> Ok with minor stylistic changes: >>> >>> + struct loop *loop = (gimple_bb (phi))->loop_father; >>> >>> no () around the gimple_bb call. >>> >>> + else if (r_op1 != PHI_RESULT (header_phi)) >>> + return false; >>> >>> too many spaces after = >>> >>> + c = fold_build_cond_expr (TREE_TYPE (rhs1), >>> + unshare_expr (cond), >>> + swap? zero: op1, >>> + swap? op1: zero); >>> >>> a space missing before ? >>> >>> + gsi_insert_before (gsi, new_assign, GSI_SAME_STMT); >>> + update_stmt (new_assign); >>> >>> gsi_insert_before already calls update_stmt on new_assign, no >>> reason to do it again. >>> >>> + /* Build rhs for unconditional increment/decrement. */ >>> + rhs = build2 (gimple_assign_rhs_code (reduc), TREE_TYPE (rhs1), op0, >>> tmp); >>> >>> always use fold_build2, not build2. >>> >>> + if (!is_cond_scalar_reduction (phi, &reduc, &op0, &op1)) >>> + /* Build new RHS using selected condition and arguments. */ >>> + rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond), >>> + arg_0, arg_1); >>> + else >>> + /* Convert reduction stmt into vectorizable form. */ >>> + rhs = convert_scalar_cond_reduction (reduc, gsi, cond, op0, op1, >>> + true_bb != gimple_bb (reduc)); >>> >>> now that it's a very simple check please use a positive form, thus >>> >>> if (is_cond_scalar_reduction ...) >>> * Convert reduction stmt into vectorizable form. */ >>> .... >>> else >>> >>> Ok with these changes. >>> >>> Thanks, >>> Richard. >>> >>>> gcc/ChangeLog >>>> 2014-04-29 Yuri Rumyantsev <ysrum...@gmail.com> >>>> >>>> * tree-if-conv.c (is_cond_scalar_reduction): New function. >>>> (convert_scalar_cond_reduction): Likewise. >>>> (predicate_scalar_phi): Add recognition and transformation >>>> of simple conditioanl reduction to be vectorizable. >>>> >>>> gcc/testsuite/ChangeLog: >>>> * gcc.dg/cond-reduc-1.c: New test. >>>> * gcc.dg/cond-reduc-2.c: Likewise.