On Thu, May 04, 2023 at 01:54:46PM +0800, liuhongt wrote:
> r14-172-g0368d169492017 use NO_REGS instead of GENERAL_REGS in memory cost
> calculation when preferred register class is unkown.
> +      /* Costs for NO_REGS are used in cost calculation on the
> +        1st pass when the preferred register classes are not
> +        known yet.  In this case we take the best scenario.  */
> 
> It regressed gcc.target/powerpc/dform-3.c which has inline asm explicitly
> put a vector mode into a general register, then create an extra move.
> RA doesn't allocate GENERAL_REGS for it because the backend pattern
> explicitly disparage the alternative (<??r>, r), (??r, Y) which moves
> from GENERAL_REGS/MEM to GENERAL_REGS.

And that is correct and wanted.

> Normally the extra move can be eliminated by pass_reload

Which is completely beside the point: reload is not in the business of
making good code.  Instead, it should make reasonable code when the
good code we attempted to make did not work out.  Think of it like a
last resort register allocation fixup.

> The patch adds a peephole2 to eliminate the extra move.

Nope.  Not okay.  This is not what peepholes are for *at all*, and this
path leads to insanity and millionfold maintenance cost.

Peepholes are *only*, I repeat *only*, for situations where we did a
*correct* cost estimation but due to some target detail we want to
fine-tune the generated code a bit.

If you want a peephole you most likely have a fundamental cost function
problem elsewhere.  Fix *that*, that is actually useful, and won't get
you into hotter water.

> Ok for trunk?

Not okay, no.  Please fix the actual bug?  Revert the previous patch,
for example :-(

> +;; Peephole to catch memory loads to VSX_REG and then moves to GENERAL_REGS.
> +(define_peephole2
> +  [(set (match_operand:VSX_M 0 "vsx_register_operand")
> +     (match_operand:VSX_M 1 "memory_operand"))
> +   (set (match_operand:VSX_M 2 "int_reg_operand")
> +     (match_dup 0))]
> +  "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
> +  && peep2_reg_dead_p (2, operands[0])"
> +  [(set (match_dup 2) (match_dup 1))])

The condition does not make sense, even assuming the peephole does (it
does not).  Why would you care if the compiler is allowed to generate
64-bit insns here?

The formatting is messed up as well.


Segher

Reply via email to