https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677
--- Comment #21 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Alex Coplan from comment #13)
> The problem seems to be this code in aarch64_process_components:
>
>   while (regno != last_regno)
>     {
>       bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno);
>       machine_mode mode = aarch64_reg_save_mode (regno);
>
>       rtx reg = gen_rtx_REG (mode, regno);
>       poly_int64 offset = frame.reg_offset[regno];
>       if (frame_pointer_needed)
>         offset -= frame.bytes_below_hard_fp;
>
>       rtx addr = plus_constant (Pmode, ptr_reg, offset);
>       rtx mem = gen_frame_mem (mode, addr);
>
> which emits a TFmode mem with offset 512, which is out of range for TFmode
> (so we later ICE with an unrecognisable insn). Presumably this just needs
> tweaking to emit a new base anchor in the case of large offsets like this.
> It looks like the code in aarch64_save_callee_saves already does this.

We shouldn't emit new anchor registers here, since unlike in the prologue,
we don't have any guarantee that certain registers are free.
aarch64_get_separate_components is supposed to vet shrink-wrappable
offsets, but in this case the offset looks valid, since:

    str     q22, [sp, #512]

is a valid instruction.  Perhaps the constraints are too narrow?
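
To make the range question concrete, here is a minimal sketch of the AArch64 immediate-offset ranges involved. The helper names are hypothetical and are not GCC functions. The ranges themselves come from the instruction encodings: the unsigned-offset form of LDR/STR takes a 12-bit immediate scaled by the access size, and LDP/STP take a signed 7-bit immediate scaled by the size of one register. Whether the TFmode address check is really limited by the 64-bit register pair form is an assumption here and is not confirmed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): true if OFFSET fits the
   unsigned 12-bit immediate, scaled by SIZE bytes, that the
   "unsigned offset" form of LDR/STR accepts.  */
static bool
fits_unsigned_12bit_scaled (int64_t offset, int64_t size)
{
  return offset >= 0 && offset % size == 0 && offset / size <= 4095;
}

/* Hypothetical helper (not a GCC function): true if OFFSET fits the
   signed 7-bit immediate, scaled by SIZE bytes, that LDP/STP accept.  */
static bool
fits_signed_7bit_scaled (int64_t offset, int64_t size)
{
  return offset % size == 0 && offset / size >= -64 && offset / size <= 63;
}
```

With these, fits_unsigned_12bit_scaled (512, 16) holds, which matches the valid str q22, [sp, #512] above, and fits_signed_7bit_scaled (512, 16) also holds for a Q-register pair. fits_signed_7bit_scaled (512, 8) fails, because the X-register pair form tops out at 504. If the TFmode check applies that narrower limit (an assumption), that would explain why 512 is rejected even though the single Q-register store can encode it.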