[Bug target/20748] New: -fprefetch-loop-arrays increases run time considerably

uros at kss-loka dot si Mon, 04 Apr 2005 07:27:13 -0700

I was playing with -fprefetch-loop-arrays on pentium4, trying to get some
speed-up with simple operations on arrays. Consider this small testcase:


#include <stdio.h>

#define NELEM 10000000
#define NITER 1000

int buf[NELEM];

int main() {
  int i,j;
  int sum = 0;
  double ssum = 0.0;

  for (i = 0; i < NELEM; i++)
    buf[i] = i;

  for (j = 0; j < NITER; j++) {
    for (i = 0; i < NELEM; i++)
      sum += buf[i];
    ssum += sum;
  }

  printf ("%f\n", ssum);

  return 0;
}

gcc -O2 -march=pentium4:

time ./a.out
3347504896.000000

real    0m18.114s
user    0m17.910s
sys     0m0.072s

Using -fprefetch-loop-arrays, the run time increases drastically:
gcc -O2 -march=pentium4 -fprefetch-loop-arrays

time ./a.out
3347504896.000000

real    0m27.678s
user    0m27.611s
sys     0m0.051s

That is, the run time rises from 18.1s to 27.7s, a slowdown of more than 50%
with -fprefetch-loop-arrays on pentium4.
The inner loop looks like:
.L5:
        prefetcht0      384(%eax)
        addl    (%eax), %edx
        addl    $4, %eax
        cmpl    %eax, %ecx
        jne     .L5

Without -fprefetch-loop-arrays, the code for the inner loop is the same (without
the prefetch insn, of course). Is everything OK with prefetches on P4?
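
For illustration only, here is a C-level sketch of the two prefetching
patterns at issue. It rests on assumptions. The 64-byte cache line size is
assumed, not measured. The function names are hypothetical. The bounds check
before each prefetch is added here to keep the address expression inside the
array; the generated loop above has no such check. The first function mirrors
the loop above by prefetching 384 bytes ahead on every element. The second
issues one prefetch per assumed cache line. Whether the second form recovers
the lost time on pentium4 is not established by this report.

```c
#include <stddef.h>

/* Assumed line size; not taken from the report. */
#define CACHE_LINE_BYTES 64
#define INTS_PER_LINE (CACHE_LINE_BYTES / sizeof(int))
/* 384(%eax) in the generated loop is 384 bytes ahead of the load. */
#define PREFETCH_AHEAD_INTS (384 / sizeof(int))

/* Mirrors the generated loop: one prefetch for every element read.
   Unsigned arithmetic is used so that wraparound is well defined. */
unsigned sum_prefetch_every(const int *a, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD_INTS < n)   /* stay inside the array */
            __builtin_prefetch(&a[i + PREFETCH_AHEAD_INTS], 0, 3);
        sum += (unsigned)a[i];
    }
    return sum;
}

/* Hypothetical alternative: one prefetch per assumed cache line. */
unsigned sum_prefetch_per_line(const int *a, size_t n)
{
    unsigned sum = 0;
    size_t i = 0;
    for (; i + INTS_PER_LINE <= n; i += INTS_PER_LINE) {
        if (i + PREFETCH_AHEAD_INTS < n)   /* stay inside the array */
            __builtin_prefetch(&a[i + PREFETCH_AHEAD_INTS], 0, 3);
        for (size_t k = 0; k < INTS_PER_LINE; k++)
            sum += (unsigned)a[i + k];
    }
    for (; i < n; i++)                     /* leftover elements */
        sum += (unsigned)a[i];
    return sum;
}
```

Both functions return the same sum for the same input. With 4-byte ints and
the assumed 64-byte line, the first one issues 16 prefetches for each line
that the second one prefetches once.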

-- 
           Summary: -fprefetch-loop-arrays increases run time considerably
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uros at kss-loka dot si
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20748
