On 22/11/2021 12:39, Richard Biener wrote:
+ if (first_loop_vinfo->suggested_unroll_factor > 1)
+ {
+ if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "***** Re-trying analysis with first vector mode"
+ " %s for epilogue with partial vectors of"
+ " unrolled first loop.\n",
+ GET_MODE_NAME (vector_modes[0]));
+ mode_i = 0;
and the later done check for bigger VF than main loop - why would
we re-start at 0 rather than at the old mode? Maybe we want to
remember the iterator value we started at when arriving at the
main loop mode? So if we analyzed successfully with mode_i == 2,
then successfully at mode_i == 4 which suggested an unroll of 2,
re-start at the mode_i we continued after the mode_i == 2
successful analysis? To just consider the "simple" case of
AVX vs SSE it IMHO doesn't make much sense to succeed with
AVX V4DF, succeed with SSE V2DF and figure it's better than V4DF AVX
but get a suggestion of 2 times unroll and then re-try AVX V4DF
just to re-compute that yes, it's worse than SSE V2DF? You
are probably thinking of SVE vs ADVSIMD here but do we need to
start at 0? Adding a comment to the code would be nice.
Thanks,
I was indeed thinking of SVE vs Advanced SIMD, where we end up having to
compare different vectorization strategies, each with its own costs.
The hypothetical case (I don't think I've actually come across one) is
where we decide to vectorize the main loop for V8QI and unroll 2x,
yielding a VF of 16, and then want to use a predicated VNx16QI epilogue.
The question, though, is whether it is possible for an Advanced SIMD V8QI
vectorization to beat V16QI while an SVE predicated VNx16QI beats a
VNx8QI for the same loop. It might be good to get Sandiford's opinion on
this.
I do think that initially I was more concerned with skipping a VNx8QI
after selecting a V8QI, but I just checked, and Advanced SIMD modes are
listed before SVE modes for this reason (among others).
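
To make the restart question a bit more concrete, here is a toy C++ sketch.
It is not GCC code: ToyMode, unrolled_vf and modes_to_retry are hypothetical
names, the lane counts for the VNx modes are minimums, and the filter on the
main loop's VF is a simplification of the real epilogue checks. It only shows
which modes a re-analysis would visit for a given restart index.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an entry in vector_modes[]; not GCC's real data type.
struct ToyMode {
  std::string name;
  unsigned min_lanes;  // minimum number of QI elements per vector
};

// VF of the main loop once the suggested unroll factor is applied.
unsigned unrolled_vf(unsigned lanes, unsigned unroll_factor) {
  return lanes * unroll_factor;
}

// Names of the modes an epilogue re-analysis would visit when the scan
// restarts at restart_i, keeping only modes whose minimum lane count does
// not exceed the main loop's VF (an assumed, simplified filter).
std::vector<std::string> modes_to_retry(const std::vector<ToyMode>& modes,
                                        std::size_t restart_i,
                                        unsigned main_vf) {
  std::vector<std::string> visited;
  for (std::size_t i = restart_i; i < modes.size(); ++i)
    if (modes[i].min_lanes <= main_vf)
      visited.push_back(modes[i].name);
  return visited;
}
```

With an assumed mode list of V16QI, V8QI, VNx16QI, VNx8QI (Advanced SIMD
first), a V8QI main loop unrolled 2x has a VF of 16. Restarting at index 0
revisits all four modes, while restarting just after the main loop's mode
(index 2) still reaches VNx16QI and VNx8QI. Whether that ordering holds for
every target is exactly the open question.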
Regards,
Andre