Hi, I recently looked into a sequence like

  vzero %v0
  vlr %v2, %v0
  vlr %v3, %v0

Ideally we would like to use vzero for all of these sets in order to not
create dependencies.

For some instances of this problem I found the offending snippet to be
the postreload cse pass.  If there is a non-hard reg whose value is
equivalent to an existing hard reg, it will replace the non-hard reg.
The costs are only compared if the respective operand satisfies
CONST_INT_P, otherwise we always replace.

The comment before says:

  /* See if REGNO fits this alternative, and set it up as the
     replacement register if we don't have one for this
     alternative yet and the operand being replaced is not
     a cheap CONST_INT.  */

Now, in my case we have a CONST_VECTOR consisting of CONST_INTs (zeros).
This is obviously not a CONST_INT, therefore the substitution takes place,
resulting in a "vlr" instead of a "vzero".

Would it not make sense to always compare costs here?  Some backends have
instructions for loading vector constants and there could also be
backends able to load floating point constants directly.

For my snippet, getting rid of the CONST_INT check suffices because the
costs are similar and no replacement happens.  Was this originally a
shortcut for performance reasons?  I thought we were no longer checking
that many alternatives at this point, and only locally.

Any comments or ideas?
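
To make the difference concrete, here is a minimal standalone sketch of the
two conditions.  It is not GCC code: the enum, the function names and the
plain integer costs are hypothetical stand-ins for the RTX operand kind and
the two set_src_cost results, and it only models the cost part of the check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kind of operand being replaced.  */
enum operand_kind { OP_CONST_INT, OP_CONST_VECTOR, OP_OTHER };

/* Current condition: costs are only consulted for CONST_INT operands,
   anything else is always replaced by the register.  */
bool
replace_current (enum operand_kind kind, int operand_cost, int reg_cost)
{
  return kind != OP_CONST_INT || operand_cost > reg_cost;
}

/* Proposed condition: always compare costs, regardless of operand kind.
   The register is only used if the operand is strictly more expensive.  */
bool
replace_proposed (enum operand_kind kind, int operand_cost, int reg_cost)
{
  (void) kind;
  return operand_cost > reg_cost;
}
```

With equal (made-up) costs for the zero vector and the register, the current
condition still replaces the CONST_VECTOR, while the proposed one leaves it
alone.  CONST_INT operands behave the same under both.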

Regards
 Robin

--

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 41f61d326482..934439733d52 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -558,13 +558,12 @@ reload_cse_simplify_operands (rtx_insn *insn, rtx testreg)
                   if (op_alt_regno[i][j] == -1
                       && TEST_BIT (preferred, j)
                       && reg_fits_class_p (testreg, rclass, 0, mode)
-                      && (!CONST_INT_P (recog_data.operand[i])
-                          || (set_src_cost (recog_data.operand[i], mode,
-                                            optimize_bb_for_speed_p
-                                             (BLOCK_FOR_INSN (insn)))
-                              > set_src_cost (testreg, mode,
-                                              optimize_bb_for_speed_p
-                                               (BLOCK_FOR_INSN (insn))))))
+                      && (set_src_cost (recog_data.operand[i], mode,
+                                        optimize_bb_for_speed_p
+                                         (BLOCK_FOR_INSN (insn)))
+                          > set_src_cost (testreg, mode,
+                                          optimize_bb_for_speed_p
+                                           (BLOCK_FOR_INSN (insn)))))
                     {
                       alternative_nregs[j]++;
                       op_alt_regno[i][j] = regno;