Re: tooling quality and some random rant

Don Mon, 14 Feb 2011 17:56:00 -0800

Walter Bright wrote:

retard wrote:
 > There are no arch specific optimizations for PIII, Pentium 4, Pentium D, Core, Core 2, Core i7, Core i7 2600K, and similar kinds of products from AMD.

The optimal instruction sequences varied dramatically on those earlier processors, but not so much at all on the later ones. Reading the latest Intel/AMD instruction set references doesn't even provide that information anymore.

In particular, instruction scheduling no longer seems to matter, except for the Intel Atom, which benefits very much from Pentium style instruction scheduling. Ironically, dmc++ is the only available current compiler which supports that.

In hand-coded asm, instruction scheduling still gives more than half of the benefit that it used to. But it's become ten times more difficult. You have to use Agner Fog's manuals, not Intel/AMD's.


For example:

(1) a common bottleneck on all Intel processors is that you can only read from three registers per cycle, but you can also read from any register which has been modified in the last three cycles.

(2) it's important to break dependency chains.

On the BigInt code, instruction scheduling gave a speedup of ~40%.
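
To illustrate point (2) at the C level, here is a minimal sketch. It is not taken from the BigInt code, and the function names are made up for this example. The first loop has a single accumulator, so each add has to wait for the previous one. The second splits the work across four accumulators, giving the CPU four independent chains that it can overlap. An optimizing compiler may already make this transformation for a simple integer sum, so treat it as a picture of the idea rather than a guaranteed speedup.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add depends on the result of the previous add. */
uint64_t sum_one_chain(const uint32_t *a, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators: four independent dependency chains. */
uint64_t sum_four_chains(const uint32_t *a, size_t n)
{
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same value for any input, since unsigned integer addition can be reordered freely. With floating point the reordering would change rounding, which is one reason compilers are reluctant to do it on their own there.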

But still, cache effects are more important than instruction scheduling in 99% of cases.
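
A standard way to see a cache effect, sketched here in C with hypothetical function names, is to sum a row-major matrix in two different loop orders. Both functions execute the same adds on the same data. The first walks memory sequentially. The second jumps `cols` elements on every step, so for a matrix much larger than the cache it tends to touch a new cache line on almost every access. How big the gap is depends on the matrix size and the machine.

```c
#include <stddef.h>

/* m is a rows x cols matrix stored row by row: element (r, c) is m[r * cols + c]. */

/* Inner loop over columns: consecutive addresses, cache friendly. */
long long sum_row_order(const int *m, size_t rows, size_t cols)
{
    long long s = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            s += m[r * cols + c];
    return s;
}

/* Inner loop over rows: each access is cols elements away from the last one. */
long long sum_column_order(const int *m, size_t rows, size_t cols)
{
    long long s = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            s += m[r * cols + c];
    return s;
}
```

No amount of instruction scheduling inside the loop body changes which of these two access patterns the memory system sees.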

 > No mention of auto-vectorization

dmc doesn't do auto-vectorization. I agree that's an issue.

 > or whole program

I looked into that, there's not a lot of oil in that well.

 > and instruction level optimizations the very latest GCC and LLVM are now slowly adopting.

Huh? Every compiler in existence has done, and always has done, instruction level optimizations.

Note: a lot of modern compilers expend tremendous effort optimizing access to global variables (often screwing up multithreaded code in the process). I've always viewed this as a crock, since modern programming style eschews globals as much as possible.
