https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-03-07
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

(insn 17 16 18 3 (set (reg:V8HI 109 [ vect__3.8 ])
        (vec_concat:V8HI (truncate:V4HI (reg:V4SI 105 [ vect__2.7 ]))
            (truncate:V4HI (reg:V4SI 107 [ vect__2.7 ])))) "t9.c":9:16 1942 {vec_pack_trunc_v4si}
     (expr_list:REG_DEAD (reg:V4SI 107 [ vect__2.7 ])
        (expr_list:REG_DEAD (reg:V4SI 105 [ vect__2.7 ])
            (nil))))
(insn 18 17 19 3 (set (mem:V8HI (post_inc:DI (reg:DI 92 [ ivtmp.16 ])) [2 MEM <vector(8) short unsigned int> [(short unsigned int *)_7]+0 S16 A128])
        (reg:V8HI 109 [ vect__3.8 ])) "t9.c":9:16 1161 {*aarch64_simd_movv8hi}
     (expr_list:REG_DEAD (reg:V8HI 109 [ vect__3.8 ])
        (expr_list:REG_INC (reg:DI 92 [ ivtmp.16 ])
            (nil))))

Part of the problem is the above. So this might need to be done at the gimple
level such that we don't do the vec_concat in the first place ....

That is if we had the RTL for:
        ushr    v1.4s, v1.4s, 3
        ushr    v0.4s, v0.4s, 3
        xtn     v2.4h, v1.4s
        xtn     v3.8h, v0.4s
        str     d3, d2, [x1], 16

I think combine would have done its job.
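
For context, here is a minimal sketch of the kind of loop that could produce the RTL above. The actual t9.c is not shown in this comment, so this is an assumption. The element types are inferred from the RTL modes (V4SI truncated to V4HI, stored as short unsigned int), and the shift amount of 3 is taken from the ushr instructions. The function name shift_narrow is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reproducer, not the reporter's t9.c.
   Each 32-bit element is shifted right and then narrowed to 16 bits.
   When vectorized, the narrowing step becomes the
   vec_concat of two truncates (vec_pack_trunc_v4si) seen in insn 17. */
void shift_narrow(uint16_t *restrict dst, const uint32_t *restrict src,
                  size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(src[i] >> 3);  /* shift, then truncate */
}
```

Compiling something like this for aarch64 with vectorization enabled (for example at -O3) and dumping the RTL should allow comparison with the insns quoted above, though the exact output will depend on the compiler version.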