https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #16 from Hongtao.liu <crazylht at gmail dot com> ---
I notice

0x5561dc0 _36 * 2 1 times scalar_stmt costs 16 in body
0x5561dc0 _38 * 2 1 times scalar_stmt costs 16 in body
0x5562df0 _36 * 2 1 times vector_stmt costs 16 in body
0x5562df0 _38 * 2 1 times vector_stmt costs 16 in body

ix86_multiplication_cost is called for the cost estimation here, but in pass_expand, synth_mult will transform the multiplication into a shift. So the shift cost should be used in this case, not the mult cost.
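
To illustrate the point about `_36 * 2`, here is a minimal sketch of the equivalence the expander can exploit. The function names are hypothetical and this is not GCC code; it only shows, on unsigned 32-bit values, that a multiply by 2 gives the same result as a left shift by 1 (or an add of the value to itself), which is why costing the statement as a full multiply may overestimate it.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of GCC. */

/* The statement as the vectorizer cost model sees it: x * 2. */
uint32_t mul_by_2(uint32_t x)
{
    return x * 2u;
}

/* The shift form a multiply by a power of two can be rewritten into. */
uint32_t shl_by_1(uint32_t x)
{
    return x << 1;
}

/* An equivalent add form, also cheaper than a general multiply. */
uint32_t add_self(uint32_t x)
{
    return x + x;
}
```

Unsigned arithmetic is used so that wraparound is well defined and all three forms agree for every input.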