On targets that support vector 8-bit/16-bit shift instructions, such as x86_64/i386 with the -msse2 option or powerpc with the -maltivec option, GCC generates suboptimal code for variable shifts. Rather than emit the native instruction, the compiler converts the vector to a V4SI vector, performs the shift, and then converts the result back to V16QI/V8HI mode. I speculate that this happens because the usual binary operator rules are applied to bring both operands to the same type. Shifts and rotates are different in that the right-hand operand is of int type, so it does not need to match the type of the left-hand operand.
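A minimal sketch of the kind of loop that should show the behavior, assuming the problem is reached through the loop vectorizer (the component on this report is tree-optimization). The array names, the size N, and the function name are hypothetical and are not taken from the original test case.

```c
#define N 1024

/* Hypothetical test data: 16-bit elements, so a vectorized loop
   would ideally operate in V8HI mode.  */
short src[N];
short dst[N];

/* The shift count is a loop-invariant scalar int.  C promotes
   src[i] to int before shifting, and the result is narrowed back
   to short on the store.  */
void shift_left_all(int count)
{
    for (int i = 0; i < N; i++)
        dst[i] = (short)(src[i] << count);
}
```

Compiling with something like `gcc -O3 -msse2 -S` on x86_64, or `gcc -O3 -maltivec -S` on powerpc64, and reading the assembly for the loop body should show whether a native halfword shift was used or whether the data was widened to 32-bit elements first. Changing `short` to `char` would exercise the V16QI case on targets that have a byte shift.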
--
           Summary: Vector short/char shifts generate sub-optimal code
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: meissner at linux dot vnet dot ibm dot com
 GCC build triplet: x86_64-unknown-linux-gnu, powerpc64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu, powerpc64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu, powerpc64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073