https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113114
--- Comment #6 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Hmm, it's worth noting that the ILP32 case is a bit different, though, in that
we have:

(rr) call debug (insn->rtl ())
(insn 16 21 19 3 (parallel [
            (set (reg:DF 62 v30)
                (unspec:DF [
                        (mem:V2x8QI (reg/v/f:DI 0 x0 [orig:123 a ] [123]) [0 +0 S16 A64])
                    ] UNSPEC_LDP_FST))
            (set (reg:DF 63 v31)
                (unspec:DF [
                        (mem:V2x8QI (reg/v/f:DI 0 x0 [orig:123 a ] [123]) [0 +0 S16 A64])
                    ] UNSPEC_LDP_SND))
        ]) 88 {*load_pair_8}
     (nil))
(rr) call debug (trailing_add->rtl ())
(insn 20 18 41 3 (set (reg:SI 0 x0 [orig:118 ivtmp.22 ] [118])
        (plus:SI (reg:SI 0 x0 [orig:123 a ] [123])
            (const_int 8 [0x8]))) 119 {*addsi3_aarch64}
     (nil))

i.e. x0 appears as DImode in the load pair addresses but the trailing update is
done in SImode, which means we end up not matching when forming the final
pattern.

I don't think either case is particularly interesting, so I'm leaning towards
just bailing out if recog fails in the pass (in which case both of these just
become missed-optimizations).
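
To illustrate the bail-out idea in isolation, here is a minimal toy sketch. It is not GCC code: Mode, RegUse, PairInsn, AddInsn, WritebackPair, toy_recog and try_fold_trailing_add are all hypothetical names invented for this example, and toy_recog is only a stand-in that rejects a combined pattern when the base register is used in different modes. It assumes the fold is attempted first and simply abandoned when recognition fails, leaving both insns as they were.

```cpp
#include <optional>

// Hypothetical, simplified stand-ins for RTL concepts (not GCC's real types).
enum class Mode { SI, DI };

struct RegUse {
    unsigned regno;  // hard register number, e.g. 0 for x0
    Mode mode;       // mode in which the register is accessed
};

// Load pair whose address is based on a single register.
struct PairInsn {
    RegUse base;
};

// Trailing update of the form: dest = src + offset.
struct AddInsn {
    RegUse dest;
    RegUse src;
    long offset;
};

// Result of folding the trailing add into the pair as a writeback.
struct WritebackPair {
    RegUse base;
    long offset;
};

// Toy stand-in for pattern recognition: accept the combined form only if
// the same register is used, in the same mode, in the pair address and on
// both sides of the add.
bool toy_recog(const PairInsn &pair, const AddInsn &add) {
    return pair.base.regno == add.dest.regno
        && add.dest.regno == add.src.regno
        && pair.base.mode == add.dest.mode
        && add.dest.mode == add.src.mode;
}

// Try to fold the trailing add into the pair. If recognition fails, bail
// out by returning nullopt so the caller leaves both insns untouched.
std::optional<WritebackPair> try_fold_trailing_add(const PairInsn &pair,
                                                   const AddInsn &add) {
    if (!toy_recog(pair, add))
        return std::nullopt;  // missed optimization, not an error
    return WritebackPair{pair.base, add.offset};
}
```

With a DImode base in the pair and an SImode add, as in the dumps above, this toy model returns no result rather than failing hard, which is the behaviour the bail-out is meant to give.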