[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Andrew Pinski ---
It was fixed in GCC 12 by one of the following commits:
r12-7142-g83d7e720cd1d07
r12-7141-gbce43c0493f65d
r12-7140-g4057266ce5afc1
r12-7138-gaeef5c57f161ad
[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

--- Comment #3 from Andrew Pinski ---
We do produce shrn2 but not shrn now:
.L2:
        ldp     q0, q1, [x0]
        add     x0, x0, 32
        ushr    v0.4s, v0.4s, 3
        xtn     v0.4h, v0.4s
        shrn2   v0.8h, v1.4s, 3
        str     q0, [x1], 16
        cmp     x0, x2
        bne     .L2
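
For readers following the listing above, here is a minimal per-lane scalar sketch of why the remaining ushr + xtn pair is a candidate for a single shrn. The helper names (ushr_then_xtn, shrn_lane, shrn_pair) are hypothetical and chosen only for illustration; they are not GCC or ACLE identifiers, and the model ignores everything except the lane arithmetic.

```c
#include <stdint.h>

/* Hypothetical model of one lane of "ushr v.4s, v.4s, #shift"
   followed by "xtn v.4h, v.4s". */
static uint16_t ushr_then_xtn(uint32_t x, unsigned shift)
{
    uint32_t shifted = x >> shift;  /* ushr: logical right shift in 32 bits */
    return (uint16_t)shifted;       /* xtn: keep the low 16 bits */
}

/* Hypothetical model of one lane of "shrn": shift and narrow in one step. */
static uint16_t shrn_lane(uint32_t x, unsigned shift)
{
    return (uint16_t)(x >> shift);
}

/* Model of a shrn + shrn2 pair: shrn fills lanes 0..3 from the first
   input vector, shrn2 fills lanes 4..7 from the second. */
static void shrn_pair(uint16_t out[8], const uint32_t lo[4],
                      const uint32_t hi[4], unsigned shift)
{
    for (int i = 0; i < 4; i++) {
        out[i] = shrn_lane(lo[i], shift);
        out[i + 4] = shrn_lane(hi[i], shift);
    }
}
```

Both per-lane helpers compute the same value for any input, which is the equivalence the missing shrn would exploit for the low half.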
[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-03-07
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski ---
Confirmed.
(insn 17 16 18 3 (set (reg:V8HI 109 [ vect__3.8 ])
        (vec_concat:V8HI (truncate:V4HI (reg:V4SI 105 [ vect__2.7 ]))
            (truncate:V4HI (reg:V4SI 107 [ vect__2.7 ])))) "t9.c":9:16 1942 {vec_pack_trunc_v4si}
     (expr_list:REG_DEAD (reg:V4SI 107 [ vect__2.7 ])
        (expr_list:REG_DEAD (reg:V4SI 105 [ vect__2.7 ])
            (nil))))
(insn 18 17 19 3 (set (mem:V8HI (post_inc:DI (reg:DI 92 [ ivtmp.16 ])) [2 MEM [(short unsigned int *)_7]+0 S16 A128])
        (reg:V8HI 109 [ vect__3.8 ])) "t9.c":9:16 1161 {*aarch64_simd_movv8hi}
     (expr_list:REG_DEAD (reg:V8HI 109 [ vect__3.8 ])
        (expr_list:REG_INC (reg:DI 92 [ ivtmp.16 ])
            (nil))))

Part of the problem is the above. So this might need to be done at the gimple
level, such that we don't do the vec_concat in the first place.

That is, if we had the RTL for:
        ushr    v1.4s, v1.4s, 3
        ushr    v0.4s, v0.4s, 3
        xtn     v2.4h, v1.4s
        xtn     v3.8h, v0.4s
        str     d3, d2, [x1], 16

I think combine would have done its job.
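
The testcase t9.c itself is not quoted in this thread. A loop of roughly the following shape would be consistent with the RTL above (V4SI inputs truncated and packed into a V8HI store through a short unsigned int pointer) and with the shift count of 3 in the assembly; the function name narrow_shift, its signature, and the use of restrict are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reproducer: shift each 32-bit element right by 3 and
   narrow it to 16 bits. The name and signature are assumed, not taken
   from the bug report. */
void narrow_shift(uint16_t *restrict dst, const uint32_t *restrict src,
                  size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(src[i] >> 3);  /* truncation keeps the low 16 bits */
}
```

Compiling a loop like this for aarch64 with vectorization enabled and inspecting the assembly is one way to check whether a given GCC version emits shrn/shrn2 or the ushr + xtn sequence discussed here.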
[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947

--- Comment #1 from Richard Biener ---
would need such concept, like a named pattern and a vector pattern recognizing it.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations