Don wrote:
In hand-coded asm, instruction scheduling still gives more than half of
the benefit it used to. But it has become ten times more difficult.
You have to use Agner Fog's manuals, not Intel's or AMD's.
For example:
(1) a common bottleneck on all Intel processors is that you can only
read from three registers per cycle, but you can also read from any
register that has been modified in the last three cycles.
(2) it's important to break dependency chains.
On the BigInt code, instruction scheduling gave a speedup of ~40%.
Wow. I didn't know that. Do any compilers currently schedule this stuff?
Any chance you want to take a look at cgsched.c? I had great success using the
same algorithm for the quite different Pentium and P6 scheduling minutiae.
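
For anyone following along, here is a rough sketch of what Don's point (2), breaking dependency chains, can look like. It is written in C rather than asm to keep it short, and it is not taken from the BigInt code or from cgsched.c. The function names sum_chain and sum_split are made up for this illustration. In the first loop every add waits on the result of the previous one. The second splits the work across four independent accumulators so the CPU has four shorter chains to overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: a single accumulator forms one long
   dependency chain, since each add needs the previous sum. */
uint64_t sum_chain(const uint32_t *a, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical example: four accumulators form four independent
   chains, so adds from different chains do not wait on each other. */
uint64_t sum_split(const uint32_t *a, size_t n)
{
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main loop: four elements per iteration, one per accumulator. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Tail: fewer than four elements left over. */
    for (; i < n; i++)
        s0 += a[i];

    /* Combine the partial sums pairwise. */
    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same sum. Whether the split version is actually faster depends on the processor and on what the compiler already does with the simple loop, so treat it as something to measure, not a guaranteed win.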