https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110015

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
test.c:85:23: note:   vect_is_simple_use: operand max_38 = PHI <max_5(16), max_40(43)>, type of def: unknown
test.c:85:23: missed:   Unsupported pattern.
test.c:62:24: missed:   not vectorized: unsupported use in stmt.
test.c:85:23: missed:  unexpected pattern.
test.c:85:23: note:  ***** Analysis  failed with vector mode V8SI
test.c:85:23: note:  ***** The result for vector mode V32QI would be the same
test.c:85:23: missed: couldn't vectorize loop
test.c:65:13: note: vectorized 0 loops in function.
Removing basic block 5
;; basic block 5, loop depth 2
;;  pred:       16
;;              43
# max_38 = PHI <max_5(16), max_40(43)>
# i_42 = PHI <i_29(16), 0(43)>
# datap_44 = PHI <datap_30(16), datap_46(43)>
tmp_24 = *datap_44;
_35 = tmp_24 < 0;
_56 = (unsigned int) tmp_24;
_51 = -_56;
_1 = (int) _51;
_25 = MAX_EXPR <_1, max_38>;
_31 = _1 | -2147483648;
iftmp.0_27 = (unsigned int) _31;
.MASK_STORE (datap_44, 8B, _35, iftmp.0_27);
_26 = MAX_EXPR <tmp_24, max_38>;
max_5 = _35 ? _25 : _26;
i_29 = i_42 + 1;
datap_30 = datap_44 + 4;
if (w_22 > i_29)
  goto <bb 16>; [89.00%]
else
  goto <bb 9>; [11.00%]
;;  succ:       16

So here we have a MAX_EXPR reduction, but there are two MAX_EXPRs (_25 and
_26, selected by _35) which could be merged into a single
MAX_EXPR <max_38, ABS_EXPR <tmp_24>>.
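
To illustrate why the merge looks valid, here is a minimal standalone sketch
(not code from the testcase; the function names max_two_arms, max_merged and
count_mismatches are made up for this comment). It compares the two-arm shape
from the dump with the single-MAX shape on a few sample values. INT32_MIN is
left out on purpose, since ABS_EXPR of the most negative value is not
representable.

```c
#include <stdint.h>
#include <stddef.h>

/* Shape seen in the dump: one MAX per arm, selected by tmp < 0. */
static int32_t max_two_arms(int32_t max, int32_t tmp)
{
    if (tmp < 0) {
        /* Negate through unsigned, like _56 / _51 / _1 in the dump. */
        int32_t neg = (int32_t)(0u - (uint32_t)tmp);
        return neg > max ? neg : max;
    }
    return tmp > max ? tmp : max;
}

/* Merged shape: a single MAX over the absolute value of tmp. */
static int32_t max_merged(int32_t max, int32_t tmp)
{
    int32_t abs_tmp = tmp < 0 ? (int32_t)(0u - (uint32_t)tmp) : tmp;
    return abs_tmp > max ? abs_tmp : max;
}

/* Number of (max, tmp) sample pairs on which the two shapes disagree.
   INT32_MIN is excluded from the samples (assumption: it never occurs). */
static int count_mismatches(void)
{
    static const int32_t samples[] = {
        0, 1, -1, 7, -7, 1000, -1000, INT32_MAX, -INT32_MAX
    };
    const size_t n = sizeof(samples) / sizeof(samples[0]);
    int mismatches = 0;

    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if (max_two_arms(samples[i], samples[j])
                != max_merged(samples[i], samples[j]))
                ++mismatches;
    return mismatches;
}
```

Both functions should agree on all of these samples, which is the property
the merged MAX_EXPR relies on.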

If I manually change the loop as below, it can be vectorized.

    for (j = 0; j < t1->h; ++j) {
        const OPJ_UINT32 w = t1->w;
        for (i = 0; i < w; ++i, ++datap) {
            OPJ_INT32 tmp = *datap;
            if (tmp < 0)
              {
                OPJ_UINT32 tmp_unsigned;
                tmp_unsigned = opj_to_smr(tmp);
                memcpy(datap, &tmp_unsigned, sizeof(OPJ_INT32));
                tmp = -tmp;
              }
            max = opj_int_max(max, tmp);
        }
    }

Maybe it's related to phiopt?
