The vec_widen_smult_{hi,lo}_v4si functions are incorrect: they generate the pmuludq instruction, which does a 32x32->64 unsigned multiply, where a signed multiply is required. For example, multiplying -13 * 15 gives 64424509245 with the current code, when it should give -195.
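The discrepancy can be reproduced in scalar C without any SSE intrinsics. This is only a sketch of the arithmetic, not the code GCC emits; the helper names widen_umult and widen_smult are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: models one lane of an unsigned 32x32->64 multiply
   (the arithmetic pmuludq performs). The signed inputs are reinterpreted
   as unsigned 32-bit values, so -13 becomes 4294967283. */
uint64_t widen_umult(int32_t a, int32_t b)
{
    return (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
}

/* Hypothetical helper: models the signed 32x32->64 multiply that
   vec_widen_smult_{hi,lo}_v4si is expected to produce. The inputs are
   sign-extended to 64 bits before multiplying. */
int64_t widen_smult(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}
```

With these definitions, widen_umult(-13, 15) evaluates to 64424509245 (4294967283 * 15), while widen_smult(-13, 15) evaluates to -195, matching the values reported above.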
The SSE5 instructions pmacsdqh and pmacsdql could perform this signed multiply, but nothing in the standard SSE2 instruction set can.

--
           Summary: vec_widen_smult_{hi,lo}_v4si generates pmuludq instruction
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: gnu at the-meissners dot org
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36224