https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> Confirmed.  Pattern recog recognizes the widening multiplication but not a
> highpart multiplication.  That's currently missing.

Please note that the following testcase that multiplies short -> int:

--cut here--
#define N 2

unsigned short ur[N], ua[N], ub[N];

void mulh (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}

void mulh_slp (void)
{
  ur[0] = ((unsigned int) ua[0] * ub[0]) >> 16;
  ur[1] = ((unsigned int) ua[1] * ub[1]) >> 16;
}
--cut here--

vectorizes with -O2 -fno-vec-cost-model via .MULH:

  vect__15.6_1 = MEM <vector(2) short unsigned int> [(short unsigned int *)&ua];
  vect__17.9_3 = MEM <vector(2) short unsigned int> [(short unsigned int *)&ub];
  vect_patt_34.10_5 = .MULH (vect__15.6_1, vect__17.9_3);
  MEM <vector(2) short unsigned int> [(short unsigned int *)&ur] = vect_patt_34.10_5;

and generates expected:

        movd    ua(%rip), %xmm0
        movd    ub(%rip), %xmm1
        pmulhuw %xmm1, %xmm0
        movd    %xmm0, ur(%rip)

in both cases.
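
For anyone who wants to sanity-check the highpart semantics independently of the vectorizer, a minimal sketch follows. It repeats the loop form of the testcase above and compares each lane against a scalar reference. The helpers mulhu16_ref and check_mulh are hypothetical names introduced only for this illustration (they are not part of the testcase or of GCC), and the sketch assumes a 16-bit unsigned short and a 32-bit unsigned int, as on x86-64.

```c
#include <assert.h>
#include <stdint.h>

#define N 2

unsigned short ur[N], ua[N], ub[N];

/* Loop form of the testcase: keep the high 16 bits of each 16x16 product.  */
void mulh (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}

/* Hypothetical scalar reference: form the full 32-bit unsigned product and
   return its upper half, which is what pmulhuw computes per 16-bit lane.  */
static uint16_t mulhu16_ref (uint16_t a, uint16_t b)
{
  uint32_t wide = (uint32_t) a * (uint32_t) b;
  return (uint16_t) (wide >> 16);
}

/* Hypothetical checker: load both lanes, run mulh, and assert that every
   lane matches the scalar reference.  */
void check_mulh (unsigned short a0, unsigned short b0,
                 unsigned short a1, unsigned short b1)
{
  ua[0] = a0; ub[0] = b0;
  ua[1] = a1; ub[1] = b1;

  mulh ();

  assert (ur[0] == mulhu16_ref (a0, b0));
  assert (ur[1] == mulhu16_ref (a1, b1));
}
```

Calling check_mulh (0xffff, 0xffff, 0x8000, 2) exercises the largest product (0xffff * 0xffff = 0xfffe0001, high half 0xfffe) and a product that just carries into the high half (0x8000 * 2 = 0x10000, high half 1). The result should be the same whether or not the loop is vectorized.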