Hi! The following testcase ICEs (on ARM/RISCV with certain options), because niters analysis computes number of latch executions for the loop as (short unsigned int) (a.0_1 + 255) + 1 > 256 ? ~(short unsigned int) (a.0_1 + 255) : 0 where a.0_1 is unsigned char. This is correct, but given that a.0_1 + 255 is done in unsigned char the condition is never true and so it is actually equivalent to 0, but the folders don't know that. The vectorizer sets LOOP_VINFO_NITERSM1 to that expression and does on with computing LOOP_VINFO_NITERS by fold_build2 PLUS_EXPR of that expression unshared and INTEGER_CST one. In that folding we trigger various optimizations, first it is correctly simplified into (short unsigned int) (a.0_1 + 255) + 1 > 256 ? -(short unsigned int) (a.0_1 + 255) : 1 and next using /* (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when X is unsigned, as when X + 1 overflows, X is -1, so -X == 1. */ into (short unsigned int) (a.0_1 + 255) >= 256 ? -(short unsigned int) (a.0_1 + 255) : 1 and for this the first COND_EXPR argument is folded and figured out to be 0 and so while LOOP_VINFO_NITERSM1 is a complex expression (unknown to be equivalent to 0), LOOP_VINFO_NITERS is INTEGER_CST 1. vect_analyze_loop_costing then uses LOOP_VINFO_NITERS_KNOWN_P (which checks if LOOP_VINFO_NITERS is INTEGER_CST which fits into shwi or something like that) and from that assumes that LOOP_VINFO_NITERSM1 will be INTEGER_CST.
The following patch fixes that by adding verification for that too. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-01-06 Jakub Jelinek <ja...@redhat.com> PR tree-optimization/113210 * tree-vect-loop.cc (vect_analyze_loop_costing): If LOOP_VINFO_NITERSM1 is not INTEGER_CST, don't try to use it. * gcc.c-torture/compile/pr113210.c: New test. --- gcc/tree-vect-loop.cc.jj 2024-01-03 11:51:22.787852547 +0100 +++ gcc/tree-vect-loop.cc 2024-01-05 17:12:06.511512557 +0100 @@ -2264,7 +2264,8 @@ vect_analyze_loop_costing (loop_vec_info epilogue we can also decide whether the main loop leaves us with enough iterations, prefering a smaller vector epilog then also possibly used for the case we skip the vector loop. */ - if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) + if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) + && TREE_CODE (LOOP_VINFO_NITERSM1 (loop_vinfo)) == INTEGER_CST) { widest_int scalar_niters = wi::to_widest (LOOP_VINFO_NITERSM1 (loop_vinfo)) + 1; --- gcc/testsuite/gcc.c-torture/compile/pr113210.c.jj 2024-01-05 17:18:29.792257043 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr113210.c 2024-01-05 17:17:57.522699521 +0100 @@ -0,0 +1,13 @@ +/* PR tree-optimization/113210 */ + +unsigned char a, c; +unsigned short b; + +void +foo (void) +{ + c = a + 255; + b = c; + while (++b > 256) + ; +} Jakub