With the patch from http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00668.html,
IVopts begins to create IVs for expressions like &a0[i][j][0].  This may cause
regressions in stack usage and code size (and possibly speed).  Test case:

/* ---8<--- */
enum {N=123};
int a0[N][N][N], a1[N][N][N], a2[N][N][N], a3[N][N][N],
    a4[N][N][N], a5[N][N][N], a6[N][N][N], a7[N][N][N];

int foo() {
  int i, j, k, s = 0;
  for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
      for (k = 0; k < N; k++) {
        s += a0[i][j][k]; s += a1[i][j][k]; s += a2[i][j][k]; s += a3[i][j][k];
        s += a4[i][j][k]; s += a5[i][j][k]; s += a6[i][j][k]; s += a7[i][j][k];
      }
  return s;
}
/* ---8<--- */

Without the patch, IVopts produces one IV for the j loop and 8 IVs for the k
loop.  With the patch, IVopts additionally produces 8 IVs for the j loop (each
with an increment of 123*4 bytes), 4 of which live on the stack (on x86-64, -O2).

The creation of IVs that live on the stack is likely due to inexact register
pressure estimation in IVopts.

However, it would be nice if IVopts could notice that it is cheaper to take the
final value of an inner-loop IV (e.g. &a0[i][j][k], which after the k loop
already points at the next row) than to increment a separate IV holding
&a0[i][j][0] by 123*4.  This would decrease register pressure and allow perfect
code to be generated for the test case.
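
To illustrate the difference, here is a hand-written sketch of the two
strategies at source level.  It is not what IVopts emits; it only models the
idea.  The names flat, fill, sum_separate_row_iv, sum_reuse_final_value and
self_check are made up for this example, a single flat array stands in for a0
(flat[(i*N + j)*N + k] plays the role of a0[i][j][k]), and N is reduced to 7.

```c
#include <assert.h>

enum { N = 7 };                 /* assumption: small N; the test case uses 123 */
static int flat[N * N * N];     /* hypothetical stand-in for a0[N][N][N] */

/* Fill the array with distinct values so wrong addressing changes the sum. */
static void fill(void) {
  for (int n = 0; n < N * N * N; n++)
    flat[n] = n + 1;
}

/* Strategy A: a separate j-loop IV holds the row start (&a0[i][j][0])
   and is stepped by N elements.  It stays live across the whole k loop. */
static long sum_separate_row_iv(void) {
  long s = 0;
  for (int i = 0; i < N; i++) {
    int *row = flat + i * N * N;          /* j-loop IV */
    for (int j = 0; j < N; j++) {
      int *end = row + N;
      for (int *p = row; p != end; p++)   /* k-loop IV */
        s += *p;
      row += N;                           /* increment of N * sizeof(int) bytes */
    }
  }
  return s;
}

/* Strategy B: no j-loop IV.  The final value of the k-loop IV is already
   the start of the next row, so it is simply carried over. */
static long sum_reuse_final_value(void) {
  long s = 0;
  for (int i = 0; i < N; i++) {
    int *p = flat + i * N * N;
    for (int j = 0; j < N; j++) {
      int *end = p + N;
      for (; p != end; p++)
        s += *p;
      /* here p points at the element that models a0[i][j+1][0] */
    }
  }
  return s;
}

/* Both strategies must compute the same sum. */
static void self_check(void) {
  fill();
  assert(sum_separate_row_iv() == sum_reuse_final_value());
}
```

In strategy B only one pointer per array is live inside the j loop, whereas
strategy A keeps two (row and p), which with 8 arrays is where the extra
register pressure comes from.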


-- 
           Summary: Teaching SCEV about ADDR_EXPR causes regression
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43174
