Hello,

The program at the end of this mail is 10% slower when compiled with GCC 
3.x (both 3.0.4 and 3.1.1 tested, using -O6, on a PPC604e/180MHz) than 
when compiled with 2.95. The reasons are:

a) several superfluous "fmr" instructions are generated by gcc3 (which 
aren't generated by gcc 2.95):

         lfs 0,200(11)
         lfs 1,-200(11)
         lfs 10,4(11)
         lfs 8,-4(11)
         fadds 9,0,1
         fadds 7,9,10
         fadds 6,7,8
 >>        fmr 13,6
         fmul 5,13,11
         frsp 4,5
         stfs 4,4(6)
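
To make it easier to map the listing above onto the source, here is a sketch of the inner-loop statement with its implicit conversions written out. The function and variable names are made up for illustration and do not appear in the test program. The per-line instruction mapping is my reading of the listing, and the idea that the "fmr" is left over from the float-to-double widening (which needs no instruction on PowerPC, since single-precision values are already held in double format in the registers) is only a guess on my part.

```c
/* Sketch only: helper names are hypothetical, not part of laplace.c. */

/* The statement as written in the test program. 0.25 is a double
 * constant, so the float sum is widened to double before the multiply. */
float stencil_original(float up, float down, float right, float left)
{
    return 0.25 * (up + down + right + left);
}

/* The same computation with every conversion spelled out. */
float stencil_explicit(float up, float down, float right, float left)
{
    float  sum     = up + down + right + left; /* the three fadds        */
    double widened = (double)sum;              /* widening; presumably
                                                  where the fmr appears  */
    double product = 0.25 * widened;           /* fmul                   */
    return (float)product;                     /* frsp, then stfs        */
}
```

Both functions should return the same value for any input, so the register copy does no useful work.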


b) the static prediction bits for the loop branches are wrong (the 
branches are predicted "not taken"). Changing this doesn't really improve 
performance on my test machine, but it should still be fixed, of course 
(I've only verified this with gcc 3.0.4, since I don't currently have 
access to 3.1.1):

.L46:
         li 5,1
         li 4,50
.L45:
         li 8,8
         slwi 6,4,2
         mtctr 8
         add 7,6,0
         addi 6,7,4
.L65:
         lfs 10,200(6)

...

         cmpwi 0,5,48
         ble- 0,.L45
         addi 3,3,1
         cmpwi 0,3,2999
         ble- 0,.L46


From the assembler dialect, you can see these tests were performed under 
Linux. I don't have access to a Mac OS X machine with gcc 3.x (only with 
2.95.2; I'm still on 10.1.5 and I don't want to install the beta tools 
since I'll soon upgrade to 10.2), so it's possible that neither of these 
problems shows up in the Apple version of gcc3.


Jonas


-------------------------------------
/*
************************************************************************
*  laplace.c:  Solution of Laplace equation with finite differences    *
*                                                                      *
*  taken from: "Projects in Computational Physics" by Landau and Paez  *
*              copyrighted by John Wiley and Sons, New York            *
*                                                                      *
*  written by: students in PH465/565, Computational Physics,           *
*              at Oregon State University                              *
*              code copyrighted by RH Landau                           *
*  supported by: US National Science Foundation, Northwest Alliance    *
*                for Computational Science and Engineering (NACSE),    *
*                US Department of Energy                               *
*                                                                      *
*  UNIX (DEC OSF, IBM AIX): cc laplace.c                               *
*                                                                      *
*  comment: Output data is saved in 3D grid format used by gnuplot     *
************************************************************************
*/
#include <stdio.h>

#define max 50                         /* number of grid points */

main()
{
    float x, p[max][max];
    int i, j, iter, y;
#if 0
    FILE *output;                       /* save data in laplace.dat */
    output = fopen("laplace.dat","w");
#endif

    for(i=0; i<max; i++)                 /* clear the array  */
    {
       for (j=0; j<max; j++) p[i][j] = 0;
    }

    for(i=0; i<max; i++) p[i][0] = 100.0;        /* p[i][0] = 100 V */          

    for(iter=0; iter<3000; iter++)               /* iterations */
    {
       for(i=1; i<(max-1); i++)                  /* x-direction */
       {
          for(j=1; j<(max-1); j++)               /* y-direction */
          {
             p[i][j] = 0.25*(p[i+1][j]+p[i-1][j]+p[i][j+1]+p[i][j-1]);
          }
       }
    }

#if 0
    printf("Start writing data ...\n");
    for (i=0; i<max ; i++)         /* write data gnuplot 3D format */
    {
       for (j=0; j<max; j++)
       {
          fprintf(output, "%f\n",p[i][j]);
       }
       fprintf(output, "\n");     /* empty line for gnuplot */
    }
    printf("data stored in laplace.dat\n");
    fclose(output);
#endif
}

-------------------------------------
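
For anyone who wants to reproduce the timing comparison, the kernel could be wrapped in a small harness along these lines. This is only a sketch: the names GRID, init_grid, relax and timed_run are my own and not part of the program above, and using clock() (which reports CPU time, at whatever resolution the C library provides) is an assumption about how to measure, not necessarily how the 10% figure was obtained.

```c
#include <time.h>

#define GRID 50   /* same grid size as "max" in the program above */

/* Clear the grid and set the p[i][0] = 100 V boundary. */
void init_grid(float p[GRID][GRID])
{
    int i, j;

    for (i = 0; i < GRID; i++)
        for (j = 0; j < GRID; j++)
            p[i][j] = 0;
    for (i = 0; i < GRID; i++)
        p[i][0] = 100.0;
}

/* Run the relaxation sweep from the program above for "iters" iterations. */
void relax(float p[GRID][GRID], int iters)
{
    int i, j, iter;

    for (iter = 0; iter < iters; iter++)
        for (i = 1; i < GRID - 1; i++)
            for (j = 1; j < GRID - 1; j++)
                p[i][j] = 0.25 * (p[i+1][j] + p[i-1][j]
                                  + p[i][j+1] + p[i][j-1]);
}

/* Return the CPU seconds spent in relax(). One grid value is written to
 * *probe so that the compiler cannot discard the computation. */
double timed_run(int iters, float *probe)
{
    static float p[GRID][GRID];
    clock_t start, stop;

    init_grid(p);
    start = clock();
    relax(p, iters);
    stop = clock();
    if (probe)
        *probe = p[1][1];
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling timed_run(3000, &value) from a main() built once with each compiler, using the same flags, gives the two numbers to compare. On a fast machine the iteration count may need to be raised to get a measurable time.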
