------- Comment #1 from ubizjak at gmail dot com 2006-11-09 15:36 -------
> The testcases that failed (on assembler error) are two of tests that require
> "vect_widen_mult_hi_to_si":
> testsuite/gcc.dg/vect/vect-reduc-dot-s16a.c
> testsuite/gcc.dg/vect/vect-widen-mult-s16.c
> testsuite/gcc.dg/vect/vect-widen-mult-sum.c
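
For readers without the testsuite at hand, the kind of loop these tests exercise can be sketched roughly as below. This is an illustrative reconstruction, not the testsuite source: the function name `widen_mult_sum` and its parameters (`off`, `scale`) are assumptions. The point is that two 16-bit operands are widened to 32 bits before the multiply, which is what "vect_widen_mult_hi_to_si" refers to.

```c
#include <stddef.h>

/* Hypothetical helper, not the actual testsuite code: accumulates the
   widened products of in[i] and in[i + off], each shifted right by
   `scale` bits. The caller must guarantee in[] has n + off elements. */
int widen_mult_sum(const short *in, size_t off, int scale, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Both 16-bit operands are converted to int before multiplying,
           so the product is computed in 32 bits (HI -> SI widening). */
        sum += ((int) in[i] * (int) in[i + off]) >> scale;
    }
    return sum;
}
```

Compiling a loop of this shape with -O2 -msse2 -ftree-vectorize is one way to check whether the widening-multiply patterns are picked up on a given target.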
But these files can be successfully vectorized using the current version (gcc version 4.3.0 20061109) on i686:

gcc -O2 -msse2 -ftree-vectorize -fdump-tree-vect-all vect-widen-mult-sum.c

vect-widen-mult-sum.c:16: note: LOOP VECTORIZED.
vect-widen-mult-sum.c:12: note: vectorized 1 loops in function.

.L8:
        movdqu  (%eax), %xmm4
        movdqu  (%ecx,%eax), %xmm2
        movdqa  %xmm4, %xmm1
        movdqa  %xmm2, %xmm0
        pxor    %xmm6, %xmm6
        pxor    %xmm5, %xmm5
        pcmpgtw %xmm2, %xmm6
        pcmpgtw %xmm4, %xmm5
        punpcklwd       %xmm6, %xmm0
        punpcklwd       %xmm5, %xmm1
        movdqa  %xmm0, %xmm3
        psrldq  $4, %xmm0
        pmuludq %xmm1, %xmm3
        psrldq  $4, %xmm1
        punpckhwd       %xmm6, %xmm2
        pmuludq %xmm1, %xmm0
        punpckhwd       %xmm5, %xmm4
        pshufd  $8, %xmm3, %xmm3
        pshufd  $8, %xmm0, %xmm0
        punpckldq       %xmm0, %xmm3
        movd    -20(%ebp), %xmm0
        psrad   %xmm0, %xmm3
        movdqa  %xmm2, %xmm0
        psrldq  $4, %xmm2
        pmuludq %xmm4, %xmm0
        psrldq  $4, %xmm4
        pmuludq %xmm4, %xmm2
        addl    $1, %edx
        paddd   %xmm7, %xmm3
        pshufd  $8, %xmm2, %xmm2
        pshufd  $8, %xmm0, %xmm7
        addl    $16, %eax
        punpckldq       %xmm2, %xmm7
        movd    -20(%ebp), %xmm0
        cmpl    %edx, %ebx
        psrad   %xmm0, %xmm7
        paddd   %xmm3, %xmm7
        ja      .L8
        movdqa  %xmm7, %xmm1
        movl    -16(%ebp), %esi
        psrldq  $8, %xmm1
        paddd   %xmm7, %xmm1
        cmpl    20(%ebp), %esi
        movdqa  %xmm1, %xmm0
        psrldq  $4, %xmm0
        paddd   %xmm1, %xmm0
        movd    %xmm0, -24(%ebp)
        movd    %xmm0, %edi
        je      .L4

> The missing insns (that should be merged from autovect-branch and debugged):
> vec_widen_umult_hi_v8hi
> vec_widen_umult_lo_v8hi

These patterns _are_ present in gcc version 4.3.0 20061109 (experimental) in sse.md.

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29777