On Wed, Aug 31, 2005 at 11:15:51PM -0700, Godfrey DiGiorgi wrote:
> 
> The difference is that the 30-50% gains with the automated tools
> might mean more value for large bodies of code, where the number of
> code sections that can be optimized to the maximum extent by hand
> tends to be a lot smaller. As with all optimization strategies, one
> has to pick and choose very carefully not only what to optimize but
> how to do the optimization to obtain the maximum benefit and
> performance per development dollar.

Another point to consider is that those table-driven automated tools
can easily be modified when a new model of the CPU is released, with
a slightly different set of costs, while hand-optimizing can often
be a matter of going back to square one.
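
To make that concrete, here's a toy sketch in C of the table-driven
idea (the CPU model names and cycle costs below are invented for
illustration, not any real compiler's tables): retuning for a new
model means adding a row to the cost table, not rewriting the
optimizer.

#include <stdio.h>

enum cpu_model { CPU_A, CPU_B };        /* hypothetical models      */
enum op { OP_MUL, OP_SHIFT_ADD };       /* two ways to multiply     */

/* costs[model][op] = cycle count.  A new CPU model is one new row. */
static const int costs[2][2] = {
    [CPU_A] = { [OP_MUL] = 4, [OP_SHIFT_ADD] = 2 },
    [CPU_B] = { [OP_MUL] = 1, [OP_SHIFT_ADD] = 2 },
};

static enum op pick(enum cpu_model m)   /* choose the cheaper form  */
{
    return costs[m][OP_MUL] <= costs[m][OP_SHIFT_ADD]
               ? OP_MUL : OP_SHIFT_ADD;
}

int main(void)
{
    printf("CPU_A: %s\n", pick(CPU_A) == OP_MUL ? "mul" : "shift+add");
    printf("CPU_B: %s\n", pick(CPU_B) == OP_MUL ? "mul" : "shift+add");
    return 0;
}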

My understanding (gleaned from a buddy of mine who's a senior
engineer in the Intel compiler group) is that nowadays a really
good optimising compiler can get within a factor of two or three
of the best that hand coding can do, and in most cases efficient
use of CPU cycles is irrelevant anyway; data access time dominates.
But for certain cases (such as applying filter kernels to a large
image) really good cycle-optimized hand-written code can realize
that 2-3x gain over the best a current compiler can do.  If that's
where a program spends most of its time, the perceived benefit is
enormous.
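
For concreteness, the kind of inner loop meant here looks something
like the plain-C 3x3 convolution below (the function name and
signature are mine, just for illustration).  This is exactly the
code a hand-coder would rewrite in SIMD or assembly to claim that
2-3x; a compiler does respectably on it, but rarely as well as a
human who keeps the pixel window in registers.

#include <stddef.h>

/* Apply a 3x3 kernel k (weights summing to div) to a w-by-h
   grayscale image, skipping the one-pixel border. */
void convolve3x3(const unsigned char *src, unsigned char *dst,
                 size_t w, size_t h, const int k[3][3], int div)
{
    for (size_t y = 1; y + 1 < h; y++) {
        for (size_t x = 1; x + 1 < w; x++) {
            int acc = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    acc += k[dy + 1][dx + 1]
                         * src[(y + dy) * w + (x + dx)];
            acc /= div;                 /* normalize by kernel sum  */
            dst[y * w + x] = (unsigned char)
                (acc < 0 ? 0 : acc > 255 ? 255 : acc);
        }
    }
}

On a multi-megapixel image that innermost multiply-accumulate runs
tens of millions of times per frame, which is why a 2-3x win there
is so visible to the user.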

For more linear programs, nowadays, it's seldom worth the effort
of hand coding; a really good coder has to work hard just to match
what a good globally-optimising, auto-parallelising compiler can
do, and in any case the difference between taking 125 and 150
microseconds just isn't noticeable.
