https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114027
Li Pan <pan2.li at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pan2.li at intel dot com

--- Comment #3 from Li Pan <pan2.li at intel dot com> ---
Narrowed the test case down a little compared to the original.

---------------------------------------------------
int b[10][7] = {{},                  // 0
                {},                  // 1
                {},                  // 2
                {},                  // 3
                {},                  // 4
                {},                  // 5
                {0, 0, 0, 0, 0, 1},  // 6
                {2, 3, 4, 5, 6, 7},  // 7
                {8, 8, 8, 8, 8, 8}}; // 8
               // 0  1  2  3  4  5
int c;

int main() {
  int d = 0, a = 0;

  c = 0xFFFFFFFF;

  for (a = 0; a < 5; a++) {
    for (d = 0; d < 6; d++) {
      c ^= -3L;

      if (b[a + 3][d])
        continue;

      c = 0;
    }
  }

  if (c == -3) {
    return 0;
  } else {
    return 1;
  }
}
---------------------------------------------------

Semantically, the loop walks a 5 * 6 matrix (rows 3 to 7, columns 0 to 5 of b).
Upstream currently vectorizes the first 4 * 6 elements and then falls back to
scalar code for the last 6 elements. The vectorized part looks like this:

  vect_array.16 = .MASK_LEN_LOAD_LANES (&MEM <int[10][7]> [(void *)&b + 84B], 32B, { -1, ... }, POLY_INT_CST [4, 4], 0);
  vect__28.17_94 = vect_array.16[0];
  vect__28.18_95 = vect_array.16[1];
  vect__28.19_96 = vect_array.16[2];
  vect__28.20_97 = vect_array.16[3];
  vect__28.21_98 = vect_array.16[4];
  vect__28.22_99 = vect_array.16[5];
  vect_array.16 ={v} {CLOBBER};
  mask__70.24_102 = vect__28.17_94 != { 0, ... };
  vect_prephitmp_76.25_104 = .VCOND_MASK (mask__70.24_102, { -1, ... }, { -3, ... });
  mask__80.26_106 = vect__28.18_95 != { 0, ... };
  vect_c_lsm.27_108 = .VCOND_MASK (mask__80.26_106, vect_prephitmp_76.25_104, { 0, ... });
  mask__51.28_110 = vect__28.19_96 != { 0, ... };
  vect_prephitmp_66.29_112 = .VCOND_MASK (mask__51.28_110, vect_c_lsm.27_108, { -3, ... });
  mask__16.30_114 = vect__28.20_97 != { 0, ... };
  vect_c_lsm.31_116 = .VCOND_MASK (mask__16.30_114, vect_prephitmp_66.29_112, { 0, ... });
  mask__79.32_118 = vect__28.21_98 != { 0, ... };
  vect_prephitmp_56.33_120 = .VCOND_MASK (mask__79.32_118, vect_c_lsm.31_116, { -3, ... });
  mask__25.34_122 = vect__28.22_99 != { 0, ... };
  vect_c_lsm.35_124 = .VCOND_MASK (mask__25.34_122, vect_prephitmp_56.33_120, { 0, ... });
  _126 = .REDUC_MAX (vect_c_lsm.35_124);

The final .REDUC_MAX is kind of a surprise here. It is not easy to see how the
semantics of REDUC_MAX follow from the source code. The value of c actually
depends on the previous iteration. For example, if the b condition is always 0,
c stays 0; if the b condition is always non-zero, c follows a sequence like
[-3, 0, -3, 0, ...].

Not sure if my understanding is correct, will take a look into tree-vect.
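
To make the dependence on the previous iteration concrete, here is a minimal scalar sketch of the loop in the test case, split per row. The helper names process_row and run_rows are hypothetical and only used for illustration; they are not part of the test case or of GCC. The sketch assumes a 32-bit int, so that 0xFFFFFFFF ends up as -1 in c, as it does with GCC.

```c
#include <assert.h>

/* Same data as the reduced test case: only rows 6, 7 and 8 are non-zero. */
const int b[10][7] = {
    [6] = {0, 0, 0, 0, 0, 1},
    [7] = {2, 3, 4, 5, 6, 7},
    [8] = {8, 8, 8, 8, 8, 8},
};

/* Hypothetical helper: run the inner d-loop for one row of b, starting
   from the incoming value c_in, and return the value of c afterwards. */
int process_row(int c_in, int row)
{
    assert(row >= 0 && row < 10);

    int c = c_in;
    for (int d = 0; d < 6; d++) {
        c ^= -3L;
        if (b[row][d])
            continue;
        c = 0;
    }
    return c;
}

/* Hypothetical helper: chain the first n_rows outer iterations
   (a = 0 .. n_rows - 1), feeding each row's result into the next. */
int run_rows(int n_rows)
{
    int c = -1; /* assumption: 0xFFFFFFFF converted to a 32-bit int */
    for (int a = 0; a < n_rows; a++)
        c = process_row(c, a + 3);
    return c;
}
```

With this sketch, run_rows(5) gives -3, matching what main() expects. Row 7 has six non-zero elements, so it applies the XOR an even number of times and simply passes its input through: process_row(-3, 7) is -3 while process_row(0, 7) is 0. That is the sense in which the result for a row cannot be computed without knowing what the previous row left in c.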