http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48077
--- Comment #5 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2011-03-11 21:27:25 UTC ---
BTW, I mis-entered the optimization level before. The code was generated at -O2 when the mulhw was expanded into shifts/adds under the default P6 tuning. At -O3 and up, the mulhw is left intact. This is all explained by the default tuning model in place during testing: Power6 had a slower integer multiply than the other machine models, so its cost model makes a shift/add expansion look cheaper.
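
For readers unfamiliar with the two code shapes mentioned above, here is a minimal C sketch. It is not the test case from this bug. The function names are hypothetical and the constant 10 is chosen only for illustration. The first function computes the value a PowerPC mulhw produces (the high 32 bits of a signed 32x32 product). The second shows, in general terms, the kind of shift/add sequence a compiler may substitute for a multiply by a constant when the tuning model rates multiplies as expensive.

```c
#include <stdint.h>

/* Hypothetical helper: high 32 bits of the signed 64-bit product,
   i.e. the value mulhw computes. Right-shifting a negative value is
   implementation-defined in C; GCC performs an arithmetic shift. */
int32_t mul_high_word(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Hypothetical illustration: x * 10 written as shifts and adds,
   using 10 = 8 + 2. */
uint32_t times10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}
```

Compiling a function like this with -O2 under different -mtune= values (for example power6 versus power7) and comparing the assembly is one way to see which form the cost model picks. The actual output depends on the GCC version.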