[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

meissner at gcc dot gnu.org via Gcc-bugs Thu, 26 Aug 2021 11:46:56 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059


Michael Meissner <meissner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org

--- Comment #19 from Michael Meissner <meissner at gcc dot gnu.org> ---
The main power8 fusion that GCC does is combining:

    addis       rtmp,r0,symbol@hi(r2)
    ld/lbz/lwz  rx,symbol@lo(rtmp)

into:

    addis       rx,symbol@hi(r2)
    ld/lbz/lwz  rx,symbol@lo(rx)

This fusion is listed as one of the fusion types in the power10 documents.  The
fusion type is wideimmediate.  Note, when you are compiling for -mcpu=power10,
this fusion case doesn't often get used because we use PC-relative loads.  But
the machine does support it.

In addition, it combines a load to a traditional floating point register
followed by a move to a traditional Altivec register.  Similarly, it will
combine a move from a traditional Altivec register to a traditional floating
point register followed by a store:

    lfd   fy,32(rx)        xxlor fy,vsrx
    xxlor vsrz,fy,fy       stfd  fy,32(rz)

into:

    li   rtmp,32           li    rtmp,32
    lxdx vsrz,2,rtmp       stxdx vsrx,rz,rtmp

Now on power9 and power10, this sequence is not generated because we have the
lxsd and stxsd instructions (and plxsd/pstxsd in power10).

So I suspect we may want to move the p8 load fusion case support to fusion.md,
and do it for power10 as well.  Aaron Sawdey may have other thoughts, since he
has been working on the power10 fusion support, and knows more about what is
actually implemented in current hardware.

Then for inlining, we may want to exclude p8_fusion and p10_fusion in the
comparison in rs6000_can_inline_p, since these are optimizations that don't
affect the instructions generated.
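
As a rough illustration of that idea (not the actual rs6000_can_inline_p
code), the sketch below assumes the inline check is a subset comparison of
caller and callee ISA flags, and masks out fusion bits before comparing.  The
SKETCH_* names and bit values are hypothetical and do not correspond to the
real rs6000 option masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only; these are NOT the
   real rs6000 option masks.  */
#define SKETCH_MASK_VSX        (UINT64_C(1) << 0)  /* changes the ISA used */
#define SKETCH_MASK_P8_FUSION  (UINT64_C(1) << 1)  /* optimization only */
#define SKETCH_MASK_P10_FUSION (UINT64_C(1) << 2)  /* optimization only */

/* Flags assumed not to affect which instructions may be generated.  */
#define SKETCH_FUSION_FLAGS \
  (SKETCH_MASK_P8_FUSION | SKETCH_MASK_P10_FUSION)

/* Return true if a callee built with CALLEE_FLAGS may be inlined into a
   caller built with CALLER_FLAGS.  After dropping the fusion bits, every
   remaining flag the callee uses must also be enabled in the caller.  */
bool
sketch_can_inline_p (uint64_t caller_flags, uint64_t callee_flags)
{
  uint64_t caller_isa = caller_flags & ~SKETCH_FUSION_FLAGS;
  uint64_t callee_isa = callee_flags & ~SKETCH_FUSION_FLAGS;

  return (caller_isa & callee_isa) == callee_isa;
}
```

With this masking, a callee that has only p8 fusion enabled can be inlined
into a caller that has only p10 fusion enabled, while a callee that needs a
real ISA feature the caller lacks is still rejected.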

Note, there was so-called power9 fusion code that was originally in the power9
spec, but was not implemented in the hardware.  I removed support for it in
November 2018.
