Martin Maechler <[EMAIL PROTECTED]> writes:

> ## gives
> ##                     ATLAS   GOTO    std
> ## boot-Ex             73.38  73.71  73.62
> ## nlme-Ex             31.92  34.18  31.91
> ## mgcv-Ex             29.20  31.69  29.35
> ## MASS-Ex             21.54  20.49  20.29
> ## stats-Ex            17.80  17.69  17.91
> ## lattice-Ex          11.38  11.37  11.05
> ## methods-Ex           6.87   6.53   6.58
> ## base-Ex              5.48   5.28   5.26
> ## graphics-Ex          4.71   4.73   4.70
> ## tools-Ex             3.86   3.66   3.82
> ## cluster-Ex           3.78   3.74   3.65
> ## utils-Ex             2.73   2.60   2.60
> ## p-r-random-tests     2.60   2.58   2.55
> ## survival-Ex          2.48   2.49   2.30
> ## ...
> ## .........

OK, I got around to checking this on the Opteron240 system and got just
about the same + 50%, which is to be expected given the relative CPU
speeds:

                     ATLAS    GOTO     std
boot-Ex             107.63  115.68  105.55
nlme-Ex              55.00   55.28   48.73
mgcv-Ex              36.45   43.02   40.14
MASS-Ex              34.02   35.14   30.81
stats-Ex             27.44   28.12   27.76
lattice-Ex           18.16   19.06   19.05
methods-Ex            9.94    9.86   10.53
base-Ex               8.56    8.70    8.56
graphics-Ex           7.66    7.72    7.43
cluster-Ex            5.69    5.81    5.47
tools-Ex              4.76    4.57    4.81
utils-Ex              4.44    4.37    5.77
demos2                3.88    3.82    3.63
demos                 3.71    3.73    3.46
survival-Ex           3.66    3.76    3.61
p-r-random-tests      3.47    3.50    3.47
...

(The system was supposedly idle, but KDE was running on the console so
maybe not quite... Also, the odd cron job may have passed by.)

So, basically the threaded and optimized BLAS's are NOPs for these
suites of standard tasks. The real teeth are not shown until you do
get to tasks which need hardcore numerics. The runs below are Plain,
ATLAS, Goto, in that order.

Invert random 3000x3000 matrix

[EMAIL PROTECTED]:~/r-devel> for i in BUILD* ; do (cd $i ; time echo 'set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))'|bin/R --vanilla -q) ; done
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1] 251.90   1.14 253.08   0.00   0.00
>

real    4m20.967s
user    4m19.431s
sys     0m1.537s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1]  3.86  1.10 27.24  0.00  0.00
>

real    0m35.633s
user    0m53.442s
sys     0m1.711s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1] 30.06  1.15 31.76  0.00  0.00
>

real    0m39.804s
user    0m42.220s
sys     0m1.621s

(Notice how system.time gets the CPU usage wrong in the threaded
cases, worst so for ATLAS. Presumably, it is only counting one process,
and in the ATLAS case one that is mostly idle.)

So for matrix inversion, ATLAS seems to be a little faster than Goto
(at the expense of a higher CPU utilization, mind you: the Goto
version appears to be running nearly single-threaded).
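
The user-versus-elapsed comparison above can be repeated at a smaller scale from within R. What follows is only a rough sketch under stated assumptions: time_op() is a hypothetical helper that is not part of the original runs, n = 200 is an arbitrary size chosen so that it finishes quickly, and the component names "user.self", "sys.self" and "elapsed" are those of the proc_time object in current R versions (the five unnamed numbers in the output above are from an older R). On current systems the CPU time reported by system.time may well include all threads of the process, so the undercounting noted above might not reproduce.

```r
## Hypothetical helper (not from the original post): time f(m) on a random
## n x n matrix and return the CPU and wall-clock figures separately.
time_op <- function(f, n = 200, seed = 1234) {
  set.seed(seed)
  m <- matrix(rnorm(n * n), n)
  t <- system.time(res <- f(m))
  ## unclass() turns the proc_time object into a plain named numeric vector
  timing <- unclass(t)[c("user.self", "sys.self", "elapsed")]
  list(timing = timing, input = m, result = res)
}

inv <- time_op(solve)
print(inv$timing)

## CPU seconds consumed per wall-clock second. A value well above 1 hints
## that several threads were counted; a value well below 1 hints that the
## work was done somewhere system.time did not see (or that the run was
## too short to measure).
cpu_per_wall <- unname((inv$timing["user.self"] + inv$timing["sys.self"]) /
                       inv$timing["elapsed"])
print(cpu_per_wall)

## Sanity check that the inverse is numerically reasonable
max_err <- max(abs(inv$input %*% inv$result - diag(nrow(inv$input))))
print(max_err)
```

The same helper accepts any function of one matrix, so time_op(function(m) m %*% m) would time a product instead. For numbers comparable to the ones in this mail, n would have to be raised to 3000 and the wall-clock figure cross-checked with the shell's time, as in the loop above.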

For matrix multiply, we have Goto as the fastest:

[EMAIL PROTECTED]:~/r-devel> for i in BUILD* ; do (cd $i ; time echo 'set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)'|bin/R --vanilla -q) ; done
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1] 230.20   0.10 230.36   0.00   0.00
>

real    3m58.639s
user    3m57.857s
sys     0m0.455s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1]  0.34  0.01 16.49  0.00  0.00
>

real    0m25.253s
user    0m38.809s
sys     0m0.535s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1] 12.94  0.08 13.06  0.00  0.00
>

real    0m21.629s
user    0m32.223s
sys     0m0.464s

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel