https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114027

Li Pan <pan2.li at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pan2.li at intel dot com

--- Comment #3 from Li Pan <pan2.li at intel dot com> ---
Narrowed down a little compared to the original test case.

---------------------------------------------------
int b[10][7] = {{}, // 0
                {}, // 1
                {}, // 2
                {}, // 3
                {}, // 4
                {}, // 5
                {0, 0, 0, 0, 0, 1}, // 6
                {2, 3, 4, 5, 6, 7}, // 7
                {8, 8, 8, 8, 8, 8}};// 8
               //0  1  2  3  4  5
int c;

int main() {
  int d = 0, a = 0;
  c = 0xFFFFFFFF;

  for (a = 0; a < 5; a++) {
    for (d = 0; d < 6; d++) {
      c ^= -3L;

      if (b[a + 3][d])
        continue;

      c = 0;
    }
  }

  if (c == -3) {
    return 0;
  } else {
    return 1;
  }
}
---------------------------------------------------

The semantics of the loop act on a 5 * 6 matrix. Upstream currently vectorizes
the first 4 * 6 elements and then goes scalar for the last 6 elements. The
vectorized part may look like below.

  vect_array.16 = .MASK_LEN_LOAD_LANES (&MEM <int[10][7]> [(void *)&b + 84B], 32B, { -1, ... }, POLY_INT_CST [4, 4], 0);
  vect__28.17_94 = vect_array.16[0];
  vect__28.18_95 = vect_array.16[1];
  vect__28.19_96 = vect_array.16[2];
  vect__28.20_97 = vect_array.16[3];
  vect__28.21_98 = vect_array.16[4];
  vect__28.22_99 = vect_array.16[5];
  vect_array.16 ={v} {CLOBBER};
  mask__70.24_102 = vect__28.17_94 != { 0, ... };
  vect_prephitmp_76.25_104 = .VCOND_MASK (mask__70.24_102, { -1, ... }, { -3, ... });
  mask__80.26_106 = vect__28.18_95 != { 0, ... };
  vect_c_lsm.27_108 = .VCOND_MASK (mask__80.26_106, vect_prephitmp_76.25_104, { 0, ... });
  mask__51.28_110 = vect__28.19_96 != { 0, ... };
  vect_prephitmp_66.29_112 = .VCOND_MASK (mask__51.28_110, vect_c_lsm.27_108, { -3, ... });
  mask__16.30_114 = vect__28.20_97 != { 0, ... };
  vect_c_lsm.31_116 = .VCOND_MASK (mask__16.30_114, vect_prephitmp_66.29_112, { 0, ... });
  mask__79.32_118 = vect__28.21_98 != { 0, ... };
  vect_prephitmp_56.33_120 = .VCOND_MASK (mask__79.32_118, vect_c_lsm.31_116, { -3, ... });
  mask__25.34_122 = vect__28.22_99 != { 0, ... };
  vect_c_lsm.35_124 = .VCOND_MASK (mask__25.34_122, vect_prephitmp_56.33_120, { 0, ... });
  _126 = .REDUC_MAX (vect_c_lsm.35_124);

The final .REDUC_MAX looks like kind of a surprise here. It is not easy to see
how the semantics of REDUC_MAX follow from the source code, because c actually
depends on the previous iteration.

For example, as long as the b condition is 0, c stays 0. Once the b condition
is nonzero, c follows a sequence similar to [-3, 0, -3, 0, ...].

I am not sure if my understanding is correct; I will take a look into tree-vect.
