https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494
--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 28 Sep 2021, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494
>
> --- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
> After supporting v4hi reduce, gimple seems not optimal to convert v8qi to
> v8hi.
>
>   vector(4) short int vect__21.36;
>   vector(4) unsigned short vect__2.31;
>   int16_t stmp_r_17.17;
>   vector(8) short int vect__16.15;
>   int16_t D.2229[8];
>   vector(8) short int _50;
>   vector(8) short int _51;
>   vector(8) short int _52;
>   vector(8) short int _53;
>   vector(8) short int _54;
>   vector(8) short int _55;
>
>   <bb 2> [local count: 189214783]:
>   vect__2.31_97 = [vec_unpack_lo_expr] a_90(D);
>   vect__2.31_98 = [vec_unpack_hi_expr] a_90(D);
>   vect__21.36_105 = VIEW_CONVERT_EXPR<vector(4) short int>(vect__2.31_97);
>   vect__21.36_106 = VIEW_CONVERT_EXPR<vector(4) short int>(vect__2.31_98);
>   MEM <vector(4) short int> [(short int *)&D.2229] = vect__21.36_105;
>   MEM <vector(4) short int> [(short int *)&D.2229 + 8B] = vect__21.36_106;

so the above could possibly use a V8QI -> V8HI conversion, the loop
vectorizer isn't good at producing those though.  And of course the
appropriate conversion optab has to exist.

>   vect__16.15_47 = MEM <vector(8) short int> [(short int *)&D.2229];

Here's lack of "CSE" - I do have patches somewhere to turn this into

  vect__16.15_47 = { vect__21.36_105, vect__21.36_106 };

but I'm not sure that's going to be profitable (well, the code as-is
will get a STLF hit).  There's also store-merging that could instead
merge the stores similarly (but then there's no CSE after store-merging
so the load would remain).
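
For illustration only, the three shapes discussed above can be approximated at the source level with GCC's generic vector extensions. This is a rough sketch, not the testcase from the PR and not what the vectorizer emits: the typedefs and the function names (widen_via_memory, widen_concat, widen_direct, self_check) are made up here, the input is assumed to be unsigned char elements (matching the unsigned short unpack result in the dump), and __builtin_convertvector needs GCC 9 or later (or Clang).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local typedefs named after the machine modes in the discussion;
   the names themselves are only illustrative. */
typedef uint8_t v8qi __attribute__((vector_size(8)));
typedef int16_t v4hi __attribute__((vector_size(8)));
typedef int16_t v8hi __attribute__((vector_size(16)));

/* Shape of the dump as-is: widen the low and high halves separately,
   store both V4HI halves to a stack array, then reload the array as
   a single V8HI (the load that would hit store-to-load forwarding). */
static v8hi widen_via_memory(v8qi a)
{
    int16_t tmp[8];
    v4hi lo = { a[0], a[1], a[2], a[3] };
    v4hi hi = { a[4], a[5], a[6], a[7] };
    v8hi r;
    memcpy(&tmp[0], &lo, sizeof lo);
    memcpy(&tmp[4], &hi, sizeof hi);
    memcpy(&r, tmp, sizeof r);
    return r;
}

/* Shape after the "CSE": build the V8HI directly from the two halves,
   analogous to the { lo, hi } constructor, with no round trip
   through memory. */
static v8hi widen_concat(v8qi a)
{
    v4hi lo = { a[0], a[1], a[2], a[3] };
    v4hi hi = { a[4], a[5], a[6], a[7] };
    v8hi r = { lo[0], lo[1], lo[2], lo[3], hi[0], hi[1], hi[2], hi[3] };
    return r;
}

/* Single V8QI -> V8HI conversion: each unsigned 8-bit element is
   zero-extended to 16 bits. */
static v8hi widen_direct(v8qi a)
{
    return __builtin_convertvector(a, v8hi);
}

/* Checks that all three variants agree element by element. */
int self_check(void)
{
    v8qi a = { 0, 1, 127, 128, 200, 255, 7, 42 };
    v8hi m = widen_via_memory(a);
    v8hi c = widen_concat(a);
    v8hi d = widen_direct(a);
    for (int i = 0; i < 8; i++) {
        assert(m[i] == (int16_t)a[i]);
        assert(c[i] == m[i]);
        assert(d[i] == m[i]);
    }
    return 1;
}
```

All three functions compute the same value; they differ only in how the result is assembled, which is the point of the comment above. Whether the compiler actually expands the direct form to a single instruction still depends on the target providing the corresponding conversion pattern.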