> Could you provide a reference to these benchmarks, and the compiler settings
> they used? Or better yet, even run them? I don't really use fortran myself,
> but link some of my C++ code with Fortran libraries that other folks have
> done. But I could do with a speedup of factor 2 ...
See the following web site:

    http://nils.wustl.edu/schiotz/lapack-linux.html

(this is a partial quote)

PERFORMANCE:

I have compared the speed of the level 3 BLAS routines, using a
benchmark program I found on Netlib. I ran it using Dave Weber's
f2c+gcc compiled library, and also using g77 compiled static and
dynamic libraries. Here are the results, obtained on a 75MHz Pentium
PC from Gateway2000.

                    ------- Speed in Megaflops -------
Routine             f2c+gcc        g77          g77
                    (shared)     (shared)     (static)
=================================================================
DSYMM                 2.9          5.3          5.6
DSYRK                 2.7          5.5          5.6
DSYR2K                3.2          6.0          6.0
DTRMM                 2.6          4.6          5.1
DTRSM                 2.4          4.1          4.5
-----------------------------------------------------------------
Average               2.76         5.10         5.36
-----------------------------------------------------------------
rel. to f2c+gcc       100%         185%         194%
rel. to g77(sh)        54%         100%         105%
rel. to g77(st)        51%          95%         100%
=================================================================

So I see almost a factor of two increase in performance! I also see
the expected 5% penalty of the shared libraries, coming from losing
one register.
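
For anyone who wants to check the arithmetic in the quoted table, here is a small Python sketch that recomputes the averages and the relative percentages from the per-routine Megaflops figures. The speeds are copied from the table. The names (speeds, average, relative_percent) are my own and only illustrative; this is not the Netlib benchmark program, which I have not reproduced here.

```python
# Speeds in Megaflops, copied from the quoted table
# (order: DSYMM, DSYRK, DSYR2K, DTRMM, DTRSM).
speeds = {
    "f2c+gcc (shared)": [2.9, 2.7, 3.2, 2.6, 2.4],
    "g77 (shared)": [5.3, 5.5, 6.0, 4.6, 4.1],
    "g77 (static)": [5.6, 5.6, 6.0, 5.1, 4.5],
}


def average(values):
    """Arithmetic mean of a list of speeds."""
    return sum(values) / len(values)


# One average per library build (hypothetical helper dict).
averages = {name: average(values) for name, values in speeds.items()}


def relative_percent(name, baseline):
    """Average speed of `name` as a whole-number percentage of `baseline`."""
    return round(100 * averages[name] / averages[baseline])


for name, avg in averages.items():
    print(f"{name:18s} average {avg:.2f} Mflops")

for baseline in averages:
    row = "  ".join(f"{relative_percent(n, baseline):3d}%" for n in averages)
    print(f"rel. to {baseline:18s} {row}")
```

The "factor of two" claim corresponds to relative_percent("g77 (static)", "f2c+gcc (shared)"), and the shared-library penalty to relative_percent("g77 (shared)", "g77 (static)").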