https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66002
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---

High up in the profile are functions train() and dot_product(), also
ContextMap::mix1 and Mixer::p. But

void train(short *t, short *w, int n, int err)
{
  n=(n+7)&-8;
  for (int i=0; i<n; ++i) {
    int wt=w[i]+((t[i]*err*2>>16)+1>>1);
    if (wt<-32768) wt=-32768;
    if (wt>32767) wt=32767;
    w[i]=wt;
  }
}

seems to be the hottest function.

t.c:4:5: note: not vectorized: relevant stmt not supported: prephitmp_61 = _53 <= 65535 ? pretmp_60 : -32768;
t.c:4:5: note: bad operation or unsupported loop bound.
t.c:1:6: note: vectorized 0 loops in function.

  <bb 5>:
  # i_33 = PHI <0(4), i_28(7)>
  _9 = (long unsigned int) i_33;
  _10 = _9 * 2;
  _12 = w_11(D) + _10;
  _13 = *_12;
  _14 = (int) _13;
  _16 = t_15(D) + _10;
  _17 = *_16;
  _18 = (int) _17;
  _20 = _18 * err_19(D);
  _21 = _20 * 2;
  _22 = _21 >> 16;
  _23 = _22 + 1;
  _24 = _23 >> 1;
  wt_25 = _14 + _24;
  pretmp_60 = (short int) wt_25;
  _31 = (unsigned int) wt_25;
  _53 = _31 + 32768;
  prephitmp_61 = _53 <= 65535 ? pretmp_60 : -32768;
  _32 = _53 <= 65535;
  _52 = wt_25 < -32768;
  _51 = _32 | _52;
  prephitmp_59 = _51 ? prephitmp_61 : 32767;
  *_12 = prephitmp_59;
  i_28 = i_33 + 1;
  if (n_7 > i_28)
    goto <bb 7>;
  else
    goto <bb 6>;

  <bb 7>:
  goto <bb 5>;
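
For readers following the dump: the two signed comparisons in train() show up there as a single unsigned range test (_53 = (unsigned int) wt_25 + 32768, compared against 65535) feeding two selects. The following is only a sketch in plain C of why the two forms compute the same clamp. The function names clamp_two_compares, clamp_range_test and check_range are made up for illustration and come from neither the bug report nor GCC, and the sketch says nothing about whether either form gets vectorized.

```c
#include <assert.h>

/* The clamp as written in train(): two signed comparisons. */
static int clamp_two_compares(int wt)
{
  if (wt < -32768) wt = -32768;
  if (wt > 32767) wt = 32767;
  return wt;
}

/* The clamp as it appears in the dump. Comments name the matching SSA
   names; the C variable names are hypothetical.  */
static int clamp_range_test(int wt)
{
  unsigned int biased = (unsigned int) wt + 32768u;  /* _53 */
  int in_range = biased <= 65535u;                   /* _32 */
  int below = wt < -32768;                           /* _52 */
  /* prephitmp_61: the truncated value is only used when it fits in a short. */
  int first = in_range ? (short) wt : -32768;
  /* prephitmp_59: in range or below keeps 'first', otherwise saturate high. */
  return (in_range | below) ? first : 32767;
}

/* Assert that both forms agree on every value in [lo, hi].
   hi must be smaller than INT_MAX so the loop counter does not overflow. */
static void check_range(int lo, int hi)
{
  for (int wt = lo; wt <= hi; ++wt)
    assert(clamp_two_compares(wt) == clamp_range_test(wt));
}
```

Calling check_range(-200000, 200000) walks both sides of the short range without tripping the assertion. Adding 32768 in unsigned arithmetic maps [-32768, 32767] onto [0, 65535], so one unsigned comparison covers both bounds.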