On Wed, Nov 11, 2015 at 1:22 PM, Alan Hayward <alan.hayw...@arm.com> wrote:
> Hi,
> I hoped to post this in time for Monday’s cut-off date, but circumstances
> delayed me until today. I am hoping that, if possible, this patch can
> still go in.
>
>
> This patch builds upon the change for PR65947, and reduces the amount of
> code produced in a vectorized condition reduction where operand 2 of the
> COND_EXPR is an assignment of an increasing integer induction variable that
> won't wrap.
>
>
> For example (assuming all types are ints), this is a match:
>
>   last = 5;
>   for (i = 0; i < N; i++)
>     if (a[i] < min_v)
>       last = i;
>
> Whereas this is not, because the result is based on a memory access:
>
>   last = 5;
>   for (i = 0; i < N; i++)
>     if (a[i] < min_v)
>       last = a[i];
>
> In the integer induction variable case we can just use a MAX reduction and
> skip all the code I added in my vectorized condition reduction patch - the
> additional induction variables in vectorizable_reduction () and the
> additional checks in vect_create_epilog_for_reduction (). From the patch
> diff alone, it's not immediately obvious that those parts will be skipped,
> as there are no code changes in those areas.
>
> The initial value of the induction variable is forced to zero, as any
> other value could affect the result of the induction. At the end of the
> loop, if the result is zero, then we restore the original initial value.
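
For illustration only, the MAX-reduction idea described above can be
sketched in scalar C. This is not code from the patch: the function names
last_match_scalar and last_match_max are hypothetical, and the sketch follows
the description literally (start from zero, take the maximum of the matching
indices, restore the original initial value if the result is zero). Taken
literally, a match at i == 0 cannot be told apart from "no match" in this
sketch, so the inputs are assumed not to match at index 0.

```c
#include <assert.h>

/* Scalar reference: the matching loop from the description.  */
static int
last_match_scalar (const int *a, int n, int min_v, int init)
{
  assert (n >= 0);
  int last = init;
  for (int i = 0; i < n; i++)
    if (a[i] < min_v)
      last = i;
  return last;
}

/* MAX-reduction form (hypothetical sketch): the accumulator starts at
   zero, each iteration contributes either i or zero, and the original
   initial value is restored afterwards if the result is still zero.  */
static int
last_match_max (const int *a, int n, int min_v, int init)
{
  assert (n >= 0);
  int red = 0;
  for (int i = 0; i < n; i++)
    {
      int cand = (a[i] < min_v) ? i : 0;
      red = cand > red ? cand : red;
    }
  return red == 0 ? init : red;
}
```

Because i only increases, the last matching index is also the largest one,
which is why a plain MAX can stand in for the conditional assignment.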

+static bool
+is_integer_induction (gimple *stmt, struct loop *loop)

is_nonwrapping_integer_induction?

+  tree lhs_max = TYPE_MAX_VALUE (TREE_TYPE (gimple_phi_result (stmt)));

don't use TYPE_MAX_VALUE.

+  /* Check that the induction increments.  */
+  if (tree_int_cst_compare (step, size_zero_node) <= 0)
+    return false;

tree_int_cst_sgn (step) == -1

+  /* Check that the max size of the loop will not wrap.  */
+
+  if (! max_loop_iterations (loop, &ni))
+    return false;
+  /* Convert backedges to iterations.  */
+  ni += 1;

just use max_stmt_executions (loop, &ni) which properly checks for overflow
of the +1.

+  max_loop_value = wi::add (wi::to_widest (base),
+                            wi::mul (wi::to_widest (step), ni));
+
+  if (wi::gtu_p (max_loop_value, wi::to_widest (lhs_max)))
+    return false;

you miss a check for the wi::add / wi::mul to overflow.  You can use
extra args to determine this.

Instead of TYPE_MAX_VALUE use wi::max_value (precision, sign).

I wonder if you want to skip all the overflow checks for
TYPE_OVERFLOW_UNDEFINED IV types?

Thanks,
Richard.

>
> Cheers,
> Alan.
>
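
As a rough standalone illustration of the overflow concern raised above
(both the multiply and the add can overflow before the comparison against
the maximum is made), here is a sketch in plain C. It does not use GCC's
internal wide-int API; the helper name induction_fits_int is hypothetical,
int stands in for the IV type, and the overflow detection relies on the
__builtin_mul_overflow / __builtin_add_overflow builtins provided by GCC
and Clang.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: return true if an int induction variable that
   starts at BASE and advances by STEP for NITERS iterations stays
   within INT_MAX.  Every arithmetic step is checked for overflow.  */
static bool
induction_fits_int (int base, int step, long long niters)
{
  long long prod, max_val;

  assert (niters >= 0);

  /* Only increasing inductions are of interest here.  */
  if (step <= 0)
    return false;

  /* step * niters may overflow even in the wider type.  */
  if (__builtin_mul_overflow ((long long) step, niters, &prod))
    return false;

  /* So may base + step * niters.  */
  if (__builtin_add_overflow ((long long) base, prod, &max_val))
    return false;

  return max_val <= INT_MAX;
}
```

For example, induction_fits_int (INT_MAX - 10, 1, 10) holds, while the
same call with 11 iterations does not.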