http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53558
             Bug #: 53558
           Summary: Register allocator doesn't tie return value register to
                    output variable.
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ram...@gcc.gnu.org

Given the testcase below, GCC generates extra moves to return value registers
on ARM v7-a linux-gnueabi. Disabling the smax insn pattern or the "generic"
3-operand conditional move doesn't really help. This also appears to be a
problem on other architectures, given that the register allocator fails to
merge the copy of the result value into the result.

The *arm_smax_insn pattern looks like the following. Debugging the register
allocator showed that params 0, 1 and 2 are in r1, r2 and r0. However, there
is probably an easier fit with r0, r2 and r0, given that r0 and r2 die in this
instruction as well. I might be missing something here.

(define_insn "*arm_smax_insn"
  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
        (smax:SI (match_operand:SI 1 "s_register_operand" "%0,?r")
                 (match_operand:SI 2 "arm_rhs_operand" "rI,rI")))
   (clobber (reg:CC CC_REGNUM))]
  "TARGET_ARM"
  "@
   cmp\\t%1, %2\;movlt\\t%0, %2
   cmp\\t%1, %2\;movge\\t%0, %1\;movlt\\t%0, %2"
  [(set_attr "conds" "clob")
   (set_attr "length" "8,12")]
)

typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;
typedef unsigned long uintptr_t;
typedef unsigned short uint16_t;

void f2(char *d, char const *s, int flags)
{
  uint32_t tmp0, tmp1;

  if (flags & 1)
    tmp0 = *s++;
  if (flags & 2)
    {
      uint16_t *ss = (void *)s;
      tmp1 = *ss++;
      s = (void *)ss;
    }
  if (flags & 1)
    *d++ = tmp0;
  if (flags & 2)
    {
      uint16_t *dd = (void *)d;
      *dd++ = tmp1;
      d = (void *)dd;
    }
}
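
For readers less familiar with the machine description, the two alternatives of
the *arm_smax_insn pattern quoted above can be modelled in plain C. This is only
an illustrative sketch and not part of the reported testcase: the function names
(smax_tied, smax_untied, check_alternatives_agree) are hypothetical and do not
exist in GCC. It shows why the tied alternative (operand 0 sharing a register
with operand 1) is the cheaper one, at 8 bytes instead of 12.

```c
#include <assert.h>

/* Model of alternative 0: operand 0 is tied to operand 1 (constraint "%0").
   Emitted code: cmp %1, %2 ; movlt %0, %2   (length 8)  */
static int smax_tied(int op1, int op2)
{
    int op0 = op1;      /* tied: the result register already holds op1 */
    if (op1 < op2)      /* cmp + movlt */
        op0 = op2;
    return op0;
}

/* Model of alternative 1: operand 0 is a separate register (constraint "?r").
   Emitted code: cmp %1, %2 ; movge %0, %1 ; movlt %0, %2   (length 12)  */
static int smax_untied(int op1, int op2)
{
    int op0;
    if (op1 >= op2)     /* cmp + movge */
        op0 = op1;
    else                /* movlt */
        op0 = op2;
    return op0;
}

/* Hypothetical self-check: both alternatives compute the same signed max. */
static void check_alternatives_agree(void)
{
    static const int samples[] = { -7, -2, 0, 3, 4, 5 };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(smax_tied(samples[i], samples[j])
                   == smax_untied(samples[i], samples[j]));
}
```

Both models return the same value; the difference is that the untied form needs
an extra conditional move to copy operand 1 into the result, which is the kind
of copy the report expects the register allocator to avoid where it can.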