Re: [PATCH] PowerPC merge TD/TF moves
On Thu, Mar 07, 2013 at 08:45:10PM -0500, David Edelsohn wrote:
> On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote:
> > This patch, like the previous 2 patches, combines the decimal and
> > binary floating point moves, this time for 128-bit floating point.
> >
> > While working on this patch, I discovered that the previous patch
> > left out the code that enables the wg constraint, which lets
> > -mcpu=power6x use the direct move instructions. I added that code in
> > this patch, and also created a test to make sure that direct moves
> > continue to be generated in the future.
> >
> > I also added the reload helper for DDmode to rs6000_vector_reload,
> > which was missed in the last patch. The omission was harmless, since
> > it is only used with an undocumented debug switch.
> >
> > Hopefully sometime in the future, I will allow scalar floating point
> > to be loaded into the upper 32 VSX registers, which are overlaid on
> > the Altivec registers.
> >
> > Like the previous 2 patches, I've bootstrapped this and run make
> > check with no regressions. Is it OK to apply when GCC 4.9 opens up?
> >
> > I have one more insn combination patch to post, which combines movdi
> > on systems with normal floating point and with the power6 direct
> > move instructions.
>
> Mike,
>
> Which of these sets of patches adjusts and updates
> rs6000_register_move_cost for -mfpgpr and for VSRs and FPRs sharing
> the same register file?

None of these patches adjust register_move_cost.

-- 
Michael Meissner, IBM
Now: M/S 2757, 5 Technology Place Drive, Westford, MA 01886-3141, USA
March 20: M/S 2506R, 550 King Street, Littleton, MA 01460, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [PATCH] PowerPC merge TD/TF moves
On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote:
> This patch, like the previous 2 patches, combines the decimal and
> binary floating point moves, this time for 128-bit floating point.
>
> While working on this patch, I discovered that the previous patch left
> out the code that enables the wg constraint, which lets -mcpu=power6x
> use the direct move instructions. I added that code in this patch, and
> also created a test to make sure that direct moves continue to be
> generated in the future.
>
> I also added the reload helper for DDmode to rs6000_vector_reload,
> which was missed in the last patch. The omission was harmless, since
> it is only used with an undocumented debug switch.
>
> Hopefully sometime in the future, I will allow scalar floating point
> to be loaded into the upper 32 VSX registers, which are overlaid on
> the Altivec registers.
>
> Like the previous 2 patches, I've bootstrapped this and run make check
> with no regressions. Is it OK to apply when GCC 4.9 opens up?
>
> I have one more insn combination patch to post, which combines movdi
> on systems with normal floating point and with the power6 direct move
> instructions.

Mike,

Which of these sets of patches adjusts and updates
rs6000_register_move_cost for -mfpgpr and for VSRs and FPRs sharing the
same register file?

Thanks, David
Re: [PATCH] PowerPC merge TD/TF moves
On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote:
> This patch, like the previous 2 patches, combines the decimal and
> binary floating point moves, this time for 128-bit floating point.
>
> While working on this patch, I discovered that the previous patch left
> out the code that enables the wg constraint, which lets -mcpu=power6x
> use the direct move instructions. I added that code in this patch, and
> also created a test to make sure that direct moves continue to be
> generated in the future.
>
> I also added the reload helper for DDmode to rs6000_vector_reload,
> which was missed in the last patch. The omission was harmless, since
> it is only used with an undocumented debug switch.
>
> Hopefully sometime in the future, I will allow scalar floating point
> to be loaded into the upper 32 VSX registers, which are overlaid on
> the Altivec registers.
>
> Like the previous 2 patches, I've bootstrapped this and run make check
> with no regressions. Is it OK to apply when GCC 4.9 opens up?
>
> I have one more insn combination patch to post, which combines movdi
> on systems with normal floating point and with the power6 direct move
> instructions.
>
> [gcc]
> 2013-01-30  Michael Meissner  meiss...@linux.vnet.ibm.com
>
>	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg
>	constraint if -mdebug=reg.
>	(rs6000_init_hard_regno_mode_ok): Enable wg constraint if
>	-mfpgpr.  Enable using dd reload support if needed.
>
>	* config/rs6000/dfp.md (movtd): Delete, combine with 128-bit
>	binary and decimal floating point moves in rs6000.md.
>	(movtd_internal): Likewise.
>
>	* config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and
>	decimal floating point moves.
>	(movtf): Likewise.
>	(movtf_internal): Likewise.
>	(mov<mode>_internal, TDmode/TFmode): Likewise.
>	(movtf_softfloat): Likewise.
>	(mov<mode>_softfloat, TDmode/TFmode): Likewise.
>
> [gcc/testsuite]
> 2013-01-30  Michael Meissner  meiss...@linux.vnet.ibm.com
>
>	* gcc.target/powerpc/mmfpgpr.c: New test.

This patch is okay after the 4.9 tree opens.

Again, please confirm that it works on pre-POWER7 systems.

Thanks, David
[PATCH] PowerPC merge TD/TF moves
This patch, like the previous 2 patches, combines the decimal and binary
floating point moves, this time for 128-bit floating point.

While working on this patch, I discovered that the previous patch left
out the code that enables the wg constraint, which lets -mcpu=power6x
use the direct move instructions. I added that code in this patch, and
also created a test to make sure that direct moves continue to be
generated in the future.

I also added the reload helper for DDmode to rs6000_vector_reload, which
was missed in the last patch. The omission was harmless, since it is
only used with an undocumented debug switch.

Hopefully sometime in the future, I will allow scalar floating point to
be loaded into the upper 32 VSX registers, which are overlaid on the
Altivec registers.

Like the previous 2 patches, I've bootstrapped this and run make check
with no regressions. Is it OK to apply when GCC 4.9 opens up?

I have one more insn combination patch to post, which combines movdi on
systems with normal floating point and with the power6 direct move
instructions.

[gcc]
2013-01-30  Michael Meissner  meiss...@linux.vnet.ibm.com

	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg
	constraint if -mdebug=reg.
	(rs6000_init_hard_regno_mode_ok): Enable wg constraint if
	-mfpgpr.  Enable using dd reload support if needed.

	* config/rs6000/dfp.md (movtd): Delete, combine with 128-bit
	binary and decimal floating point moves in rs6000.md.
	(movtd_internal): Likewise.

	* config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and
	decimal floating point moves.
	(movtf): Likewise.
	(movtf_internal): Likewise.
	(mov<mode>_internal, TDmode/TFmode): Likewise.
	(movtf_softfloat): Likewise.
	(mov<mode>_softfloat, TDmode/TFmode): Likewise.

[gcc/testsuite]
2013-01-30  Michael Meissner  meiss...@linux.vnet.ibm.com

	* gcc.target/powerpc/mmfpgpr.c: New test.
-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 195586)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1737,6 +1737,7 @@ rs6000_debug_reg_global (void)
 	   "wa reg_class = %s\n"
 	   "wd reg_class = %s\n"
 	   "wf reg_class = %s\n"
+	   "wg reg_class = %s\n"
 	   "wl reg_class = %s\n"
 	   "ws reg_class = %s\n"
 	   "wx reg_class = %s\n"
@@ -1748,6 +1749,7 @@ rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wa]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wd]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
@@ -2120,6 +2122,9 @@ rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_ALTIVEC)
     rs6000_constraints[RS6000_CONSTRAINT_v] = ALTIVEC_REGS;
 
+  if (TARGET_MFPGPR)
+    rs6000_constraints[RS6000_CONSTRAINT_wg] = FLOAT_REGS;
+
   if (TARGET_LFIWAX)
     rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;
 
@@ -2150,6 +2155,8 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	    {
 	      rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_di_store;
 	      rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_di_load;
+	      rs6000_vector_reload[DDmode][0] = CODE_FOR_reload_dd_di_store;
+	      rs6000_vector_reload[DDmode][1] = CODE_FOR_reload_dd_di_load;
 	    }
 	}
       else
@@ -2170,6 +2177,8 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	    {
 	      rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_si_store;
 	      rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_si_load;
+	      rs6000_vector_reload[DDmode][0] = CODE_FOR_reload_dd_si_store;
+	      rs6000_vector_reload[DDmode][1] = CODE_FOR_reload_dd_si_load;
 	    }
 	}
     }
Index: gcc/config/rs6000/dfp.md
===================================================================
--- gcc/config/rs6000/dfp.md	(revision 195590)
+++ gcc/config/rs6000/dfp.md	(working copy)
@@ -144,27 +144,6 @@ (define_insn "*nabstd2_fpr"
   "fnabs %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "movtd"
-  [(set (match_operand:TD 0 "general_operand" "")
-	(match_operand:TD 1 "any_operand" ""))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "{ rs6000_emit_move (operands[0], operands[1], TDmode); DONE; }")
-
-; It's important to list the Y->r and r->Y moves before r->r because
-; otherwise reload, given m->r, will try to pick r->r and reload it,
-; which doesn't make progress.
-(define_insn_and_split