After fixing PR54009 (again), I thought I'd take a look at why reload is generating the following correct but poor code:

	stw 10,8(1)
	stw 11,12(1)
	...
	lfd 0,8(1)
	stfd 0,x+32764@l(9)

rather than

	addi 9,x+32764@l(9)
	...
	stw 10,0(9)
	stw 11,4(9)

This code sequence is from

(set (mem/c:DF (lo_sum:SI (reg/f:SI 9)
            (const:SI (plus:SI (symbol_ref:SI ("x"))
                    (const_int 32764)))))
    (reg:DF 10) ...gcc.target/powerpc/pr54009.c:42 363 {*movdf_hardfloat32})

In tracing through reload, I see a score of 8 for the m<-d alternative, and 9 for Y<-r.  In both cases we have one "loser" operand for a score of 6 ("d" in the first case, "Y" in the second), plus a score of 2 from

	  /* We prefer to reload pseudos over reloading other things,
	     since such reloads may be able to be eliminated later.
	     If we are reloading a SCRATCH, we won't be generating any
	     insns, just using a register, so it is also preferred.
	     So bump REJECT in other cases.  Don't do this in the
	     case where we are forcing a constant into memory and
	     it will then win since we don't want to have a different
	     alternative match then.  */
	  if (! (REG_P (operand)
		 && REGNO (operand) >= FIRST_PSEUDO_REGISTER)
	      && GET_CODE (operand) != SCRATCH
	      && ! (const_to_mem && constmemok))
	    reject += 2;

The Y<-r alternative gets one extra from

	  /* Input reloads can be inherited more often than output
	     reloads can be removed, so penalize output reloads.  */
	  if (operand_type[i] != RELOAD_FOR_INPUT
	      && GET_CODE (operand) != SCRATCH)
	    reject++;

The problem of course is that the input reload is quite expensive, involving a copy to memory.  So, how about teaching reload about this as follows?  I picked 6 for the reject value to make it equivalent to a '?' in the constraint, but that may be too large.  Any value of 2 or greater works for the testcase.

Bootstrapped and regression tested powerpc64-linux, but not yet spec tested.

2013-02-07  Alan Modra  <amo...@gmail.com>

	* reload.c (find_reloads): Disparage reg alternatives needing
	secondary memory to reload.

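For anyone wanting to check the arithmetic, here is a small standalone C sketch of the scoring described above.  It is not code from reload.c: the struct, its field names, model_score and the two sample alternatives are all hypothetical, and the model assumes only what the text states, namely 6 per loser operand, plus 2 for a non-pseudo operand, plus 1 for an output reload, plus the proposed penalty when a hard register needs secondary memory.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of one alternative with a single
   "loser" operand.  Field names are illustrative, not from reload.c.  */
struct alt_model
{
  bool operand_is_pseudo;       /* REG_P and REGNO >= FIRST_PSEUDO_REGISTER */
  bool is_output_reload;        /* operand_type != RELOAD_FOR_INPUT */
  bool hard_reg_needs_secmem;   /* hard reg for which SECONDARY_MEMORY_NEEDED
                                   would be true */
};

/* m<-d: the source (reg:DF 10) is a hard register that must be
   reloaded into an FPR, going through memory.  */
static const struct alt_model m_from_d = { false, false, true };

/* Y<-r: the destination address is reloaded; this is an output reload
   of a non-register operand, so no secondary memory penalty applies.  */
static const struct alt_model y_from_r = { false, true, false };

/* Score as described in the text: 6 for the one loser, plus the
   reject adjustments.  SECMEM_PENALTY is the value under discussion
   (0 models the unpatched behaviour).  */
static int
model_score (const struct alt_model *a, int secmem_penalty)
{
  int losers = 1;
  int reject = 0;

  if (a->hard_reg_needs_secmem)
    reject += secmem_penalty;
  if (!a->operand_is_pseudo)
    reject += 2;
  if (a->is_output_reload)
    reject++;

  return losers * 6 + reject;
}
```

With a penalty of 0 this gives 8 for m<-d and 9 for Y<-r, matching the trace.  With a penalty of 2 the m<-d score becomes 10, so Y<-r has the lower score, which agrees with "any value of 2 or greater works".
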
Index: gcc/reload.c
===================================================================
--- gcc/reload.c	(revision 195707)
+++ gcc/reload.c	(working copy)
@@ -3633,11 +3633,21 @@
 		      == NO_REGS)
 		    reject = 600;
 
-		  if (operand_type[i] == RELOAD_FOR_OUTPUT
-		      && (targetm.preferred_output_reload_class (operand,
-							this_alternative[i])
-			  == NO_REGS))
+		  else if (operand_type[i] == RELOAD_FOR_OUTPUT
+			   && (targetm.preferred_output_reload_class
+			       (operand, this_alternative[i])
+			       == NO_REGS))
 		    reject = 600;
+
+#ifdef SECONDARY_MEMORY_NEEDED
+		  else if (REG_P (operand)
+			   && REGNO (operand) < FIRST_PSEUDO_REGISTER
+			   && (SECONDARY_MEMORY_NEEDED
+			       ((enum reg_class) this_alternative[i],
+				REGNO_REG_CLASS (REGNO (operand)),
+				operand_mode[i])))
+		    reject += 6;
+#endif
 		}
 
 	      /* We prefer to reload pseudos over reloading other things,

-- 
Alan Modra
Australia Development Lab, IBM