https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356
--- Comment #5 from Qing Zhao <qing.zhao at oracle dot com> ---
The following code in config/aarch64/aarch64.c causes this behavior:

14143 static bool
14144 aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
14145                                         unsigned int align,
14146                                         enum by_pieces_operation op,
14147                                         bool speed_p)
14148 {
14149   /* STORE_BY_PIECES can be used when copying a constant string, but
14150      in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
14151      For now we always fail this and let the move_by_pieces code copy
14152      the string from read-only memory. */
14153   if (op == STORE_BY_PIECES)
14154     return false;

After deleting lines 14153 and 14154 and using the resulting compiler to
build the test case, I got:

f:
        mov     w1, 26952
        movk    w1, 0x21, lsl 16
        str     w1, [x0]
        ret

which looks like exactly what we want.
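
For readers checking the generated code by hand: the test case itself is not quoted in this comment, but the immediate built by the mov/movk pair can be decoded with plain arithmetic. The sketch below is only an illustration, not GCC code; the helper names (build_immediate, store_le32, self_check) are hypothetical, and it assumes the little-endian byte order that AArch64 normally uses. It emulates the three instructions and shows that the stored word is the four bytes 'H', 'i', '!', '\0', which suggests a short constant string is being written with a single 32-bit store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulate "mov w1, 26952" followed by
   "movk w1, 0x21, lsl 16".  MOVK replaces bits 16-31 and keeps
   the low 16 bits unchanged.  */
static uint32_t build_immediate(void)
{
    uint32_t w1 = 26952u;                        /* 26952 == 0x6948 */
    w1 = (w1 & 0x0000FFFFu) | (0x21u << 16);     /* insert 0x21 at bit 16 */
    return w1;
}

/* Hypothetical helper: emulate "str w1, [x0]" assuming a
   little-endian target, independent of the host byte order.  */
static void store_le32(unsigned char *dst, uint32_t value)
{
    for (size_t i = 0; i < 4; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Check that the stored word is the NUL-terminated string "Hi!".  */
static void self_check(void)
{
    unsigned char buf[4];

    store_le32(buf, build_immediate());
    assert(memcmp(buf, "Hi!", 4) == 0);
}
```

Calling self_check() from a small driver passes silently, since 0x00216948 stored little-endian is 48 69 21 00.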