https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107667

            Bug ID: 107667
           Summary: IPA: Speculatively reuse existing specializations
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christophm30 at gmail dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

This is a feature request, not a bug report.

GCC already does an excellent job of specializing functions in ipa-cp. Such
specialized functions often execute much faster because the propagated
constants enable further optimizations (e.g. vectorization).

Different call sites can get different specializations, and some call sites
might not get specialized at all. For these non-specialized call sites, one
strategy can still be applied: add guards that test whether an existing
specialization can be reused and, if so, call it.

Such an optimization would have to be built into ipa-cp, where it would collect
all specialized functions together with the constants propagated into them. At
the end of the propagation stage, the call graph would be changed to add
speculative edges to the specialized functions, with guards that test whether
the actual arguments match those constants.

To demonstrate the effect, let's consider the following program fragment:

  func_1()
    myfunc(1)
  func_2()
    myfunc(2)
  func_i(i)
    myfunc(i)

In this case the transformation would do the following:

  func_1()
    myfunc.constprop.1() // myfunc() with arg0 == 1
  func_2()
    myfunc.constprop.2() // myfunc() with arg0 == 2
  func_i(i)
    if (i == 1)
      myfunc.constprop.1() // myfunc() with arg0 == 1
    else if (i == 2)
      myfunc.constprop.2() // myfunc() with arg0 == 2
    else
      myfunc(i)
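
The transformation above can be sketched in plain C. This is only an
illustration of the intended semantics, not output produced by GCC: the body of
`myfunc` is made up, and the names `myfunc_constprop_1` and `myfunc_constprop_2`
are hand-written stand-ins for the clones ipa-cp would create (real clone names
such as `myfunc.constprop.1` are not valid C identifiers).

```c
/* Generic function; the loop bound is only known at run time. */
static int myfunc(int n)
{
    int sum = 0;
    for (int k = 0; k < n; k++)
        sum += k * k + n;
    return sum;
}

/* Stand-in for the clone with arg0 == 1: the loop folds to a constant. */
static int myfunc_constprop_1(void)
{
    return 1;
}

/* Stand-in for the clone with arg0 == 2: the loop folds to a constant. */
static int myfunc_constprop_2(void)
{
    return 5;
}

/* Non-specialized call site after the proposed transformation:
   guards compare the actual argument against the known constants. */
int func_i(int i)
{
    if (i == 1)
        return myfunc_constprop_1();
    else if (i == 2)
        return myfunc_constprop_2();
    else
        return myfunc(i);   /* fallback: the original, generic function */
}
```

The guards cost one compare per known constant, so the sketch only pays off
when the specialized bodies are noticeably cheaper than the generic one.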

Similar to `-fdevirtualize-speculatively`, such an optimization could be gated
by a flag (e.g. `-fipa-guarded-specialization`).

One example where this optimization would trigger is x264 (also part of
CPU2017), where the function pointer `get_ref` is assigned a single time during
startup and then called multiple times, either with constant arguments (8 or
16) or with "unknown" arguments that actually match those constants at runtime.
In combination with PR ipa/107666 (which converts the function pointer calls
into guarded direct calls), this allows propagating the constants into
`pixel_avg`, where vectorization becomes possible (subject to the limitations
documented in PR 106352).
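
A rough C sketch of how the two guards could compose is shown below. It is
heavily simplified and makes several assumptions: the kernel signature, the
helper names `pixel_avg_w8` and `pixel_avg_w16`, the pointer variable `avg_ptr`
and the caller `call_avg` are all hypothetical and do not reflect the actual
x264 sources or any code GCC generates.

```c
#include <stdint.h>

/* Hypothetical, simplified averaging kernel (not x264's real signature). */
static void pixel_avg(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                      int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((a[x] + b[x] + 1) >> 1);
}

/* Stand-ins for specialized clones: a constant trip count is what makes
   the loops attractive for vectorization. */
static void pixel_avg_w8(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int x = 0; x < 8; x++)
        dst[x] = (uint8_t)((a[x] + b[x] + 1) >> 1);
}

static void pixel_avg_w16(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int x = 0; x < 16; x++)
        dst[x] = (uint8_t)((a[x] + b[x] + 1) >> 1);
}

typedef void (*avg_fn)(uint8_t *, const uint8_t *, const uint8_t *, int);

/* Assigned once at startup in this sketch. */
static avg_fn avg_ptr = pixel_avg;

/* Call site with an "unknown" width after both transformations. */
void call_avg(uint8_t *dst, const uint8_t *a, const uint8_t *b, int width)
{
    if (avg_ptr == pixel_avg) {          /* guard on the function pointer */
        if (width == 8)                  /* guards on the known constants */
            pixel_avg_w8(dst, a, b);
        else if (width == 16)
            pixel_avg_w16(dst, a, b);
        else
            pixel_avg(dst, a, b, width);
    } else {
        avg_ptr(dst, a, b, width);       /* fallback: plain indirect call */
    }
}
```

The outer guard corresponds to the guarded direct call, and the inner guards
correspond to the reuse of existing specializations requested here.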