On 2/27/23 11:08 AM, Segher Boessenkool wrote:
Hi!

On Mon, Feb 27, 2023 at 09:11:37AM -0600, Pat Haugen wrote:
The define_insns for the modulo operation currently force the target register
to a distinct reg in preparation for a possible future peephole combining
div/mod. But this can lead to cases of a needless copy being inserted. Fixed
with the following patch.

Have you verified those peepholes still match?

Yes, I verified the peepholes still match and transform the sequence.


Do those peepholes actually improve performance?  On new CPUs?  The code
here says
;; On machines with modulo support, do a combined div/mod the old fashioned
;; method, since the multiply/subtract is faster than doing the mod instruction
;; after a divide.
but that really should not be true: we can do the div and mod in
parallel (except in SMT4 perhaps, which we never schedule for anyway),
so that should always be strictly faster.

Since the modulo insns were introduced in Power9, we're just talking Power9/Power10. On paper, I would agree that separate div/mod could be slightly faster to get the mod result. But if you throw in another independent div or mod in the insn stream, then doing the peephole should be a clear win, since that 3rd insn can execute in parallel with the initial divide as opposed to waiting for one of the first div/mod to clear the exclusive stage of the pipe.
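
For reference, the "old fashioned" combined div/mod that the peephole produces can be sketched in C as a divide followed by a multiply/subtract. This is only an illustration of the arithmetic, not the actual RTL pattern; the helper name divmod_mulsub is made up for this example, and it assumes b is nonzero and the division does not overflow.

```c
#include <assert.h>

/* Hypothetical helper (not from the patch): compute quotient and
   remainder with a single divide, deriving the remainder from the
   quotient by multiply/subtract instead of a separate mod insn.
   Assumes b != 0 and that a / b does not overflow.  */
static void
divmod_mulsub (long a, long b, long *quot, long *rem)
{
  long q = a / b;      /* the one divide */
  *quot = q;
  *rem = a - q * b;    /* multiply/subtract replaces the mod insn */
}
```

With C's truncating division this gives the same remainder as a % b, so only one divide has to occupy the divider and an independent third div or mod is free to issue alongside it.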

--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mod-no_copy.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-*  } } } */

All files in gcc.target/powerpc/ test for this already.  Just leave off
the target clause here?

+/* { dg-require-effective-target powerpc_p9modulo_ok } */

Leave out this line, because ...

+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */

... the -mcpu= forces it to true always.

Will update.

-Pat


+/* Verify r3 is used as source and target, no copy inserted. */

+/* { dg-final { scan-assembler-not {\mmr\M} } } */

That is probably good enough, yeah, since the test results in only a
handful of insns.


Segher
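
With the review comments above applied, the testcase header would look roughly like the sketch below. The dg directives are the ones quoted in the thread with the target clause and the powerpc_p9modulo_ok line dropped; the function body is not shown in the thread, so test_mod here is only an assumed stand-in that takes its first argument in r3 and returns the result in r3.

```c
/* { dg-do compile } */
/* { dg-options "-mdejagnu-cpu=power9 -O2" } */

/* Verify r3 is used as source and target, no copy inserted. */

/* Assumed body, for illustration only: the real test's functions
   are not quoted in the thread.  */
long
test_mod (long a, long b)
{
  return a % b;
}

/* { dg-final { scan-assembler-not {\mmr\M} } } */
```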
