https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #29 from sergey.shalnov at intel dot com ---
Richard,

Thank you for your latest patch. I would like to clarify the use of
multiple_p() in the if() clause.

First of all, I assume that architectures with a fixed HW register size (x86)
should take the if(){} branch, while architectures with an unknown register
size should take the else{} branch. That is why is_constant() is used.

The problem is in the multiple_p() call. In the test case we have
group_size=16 and const_nunits=8, so multiple_p(16, 8) returns 1. Because the
result is negated, with -march=skylake-avx512 (or -march=znver1) we go to the
else{} branch, which is not correct.

I used the following change on top of your patch, and the test works as I
expect.

--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1923,7 +1923,7 @@ vect_analyze_slp_cost_1 (slp_instance instance, slp_tree node,
       unsigned HOST_WIDE_INT const_nunits;
       unsigned nelt_limit;
       if (TYPE_VECTOR_SUBPARTS (vectype).is_constant (&const_nunits)
-         && ! multiple_p (group_size, const_nunits))
+         && multiple_p (group_size, const_nunits))
        {
          num_vects_to_check = ncopies_for_cost * const_nunits / group_size;
          nelt_limit = const_nunits;

The other question is what we should do if group_size is, for example, 17.
In this case (after my change) multiple_p(17, 8) returns 0, so the wrong
else{} branch will be taken.

I would propose to change this particular if() in the following way:

  if (TYPE_VECTOR_SUBPARTS (vectype).is_constant (&const_nunits))
    {
      if (multiple_p (group_size, const_nunits))
        {
          num_vects_to_check = ncopies_for_cost * const_nunits / group_size;
          nelt_limit = const_nunits;
        }
      else
        {
          ...not clear what we should have here...
        }
    }
  else
    {
      num_vects_to_check = 1;
      nelt_limit = group_size;
    }

Or perhaps multiple_p() should not be here at all?

Sergey
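
For reference, the branch selection discussed in this comment can be modelled in standalone C. This is only a sketch under stated assumptions: multiple_p_model() is a hypothetical stand-in for GCC's multiple_p() restricted to plain constants, and branch_patch() / branch_proposed() are illustrative names that do not exist in GCC. They only reproduce the two if() conditions from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's multiple_p () on constant operands:
   true when A is an exact multiple of B.  Not the real poly_int code.  */
static bool
multiple_p_model (unsigned a, unsigned b)
{
  return b != 0 && a % b == 0;
}

enum branch { BRANCH_IF, BRANCH_ELSE };

/* Condition as in the patch under review (negated multiple_p).  */
static enum branch
branch_patch (bool nunits_is_constant, unsigned group_size,
              unsigned const_nunits)
{
  return (nunits_is_constant
          && !multiple_p_model (group_size, const_nunits))
         ? BRANCH_IF : BRANCH_ELSE;
}

/* Condition with the negation removed, as in the change above.  */
static enum branch
branch_proposed (bool nunits_is_constant, unsigned group_size,
                 unsigned const_nunits)
{
  return (nunits_is_constant
          && multiple_p_model (group_size, const_nunits))
         ? BRANCH_IF : BRANCH_ELSE;
}
```

With group_size=16 and const_nunits=8, branch_patch() selects the else{} branch and branch_proposed() selects the if(){} branch. With group_size=17 the two results swap, which is the open case raised above.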