Dear Doug:

Thanks for pointing out "system.time". I considered using it but didn't, because it doesn't work under S-Plus 6.2. I could write my own, but ...
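
(For what it's worth, a minimal sketch of what "my own" might look like, built only on "proc.time", which both R and S-Plus 6.2 provide. The name "my.system.time" is made up, and the sketch relies on S's lazy argument evaluation, so "expr" is not evaluated until the line inside the body touches it:)

my.system.time <- function(expr) {
    start <- proc.time()   # process times before evaluating "expr"
    expr                   # lazy evaluation: the computation runs here
    proc.time() - start    # time attributable to evaluating "expr"
}
## e.g., my.system.time(A1 <- A %*% B)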

Regarding Gabor Grothendieck's suggestion to use the Sherman-Morrison-Woodbury formula: this can also be used in recursive computations, and often is in Kalman filtering and other applications where BA is of reduced dimensionality.
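
(To illustrate with a toy sketch, using made-up dimensions rather than the 500 x 500 script below: if BA is k x n with k much smaller than n, and C. is cheap to invert, here taken diagonal, the Woodbury identity reduces the n x n solve to a k x k one:)

## Woodbury:  solve(C. + crossprod(BA))  equals
##   Cinv - Cinv %*% t(BA) %*%
##     solve(diag(k) + BA %*% Cinv %*% t(BA)) %*% BA %*% Cinv
set.seed(2)
n <- 500; k <- 20
BA <- matrix(rnorm(k*n), k, n)   # "reduced dimensionality": k << n
d  <- runif(n, 1, 2)             # C. diagonal, so its inverse is cheap
Cinv <- diag(1/d)
W <- solve(diag(k) + BA %*% Cinv %*% t(BA))      # only a k x k system
inv.w <- Cinv - Cinv %*% t(BA) %*% W %*% BA %*% Cinv
all.equal(inv.w, solve(diag(d) + crossprod(BA))) # TRUE, up to rounding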

     Best Wishes,
     spencer graves

Douglas Bates wrote:

Spencer Graves <[EMAIL PROTECTED]> writes:

One can also use "crossprod" AND use "solve" to actually solve the
system of linear equations, rather than just getting the inverse and
then multiplying it by BA %*% D.  However, the difference seems very
small:

Thanks for pointing that out, Spencer.  I was about to do the same.

> set.seed(1)

> n <- 500
> A <- array(rnorm(n^2), dim=c(n,n))
> B <- array(rnorm(n^2), dim=c(n,n))
> C. <- array(rnorm(n^2), dim=c(n,n))
> D <- array(rnorm(n^2), dim=c(n,n))
>
> BA <- B%*%A
>
> start.time <- proc.time()
> A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D
> proc.time()-start.time
[1] 4.75 0.03 5.13 NA NA
>
> start.time <- proc.time()
> A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D))
> proc.time()-start.time
[1] 4.19 0.01 4.49 NA NA



A minor point on the methodology: you can do this in one step as

system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))

Also, in R the second and subsequent timings tend to be a bit faster
than the first.  I think this is due to heap storage being allocated
the first time that large chunks of memory are used and not needing to
be allocated for subsequent uses.

> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.78 0.09 0.87 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.71 0.05 0.76 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.79 0.08 0.87 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.72 0.04 0.76 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.52 0.07 0.59 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.53 0.06 0.59 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.56 0.03 0.59 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.54 0.05 0.59 0.00 0.00

> all.equal(A1, A2)
[1] TRUE

This was in R 1.8.1 under Windows 2000 on an IBM ThinkPad T30
with a Mobile Intel Pentium 4-M, 1.8 GHz, and 1 GB of RAM.  The same
script under S-Plus 6.2 produced the following timings:
[1] 3.325 0.121 3.815 0.000 0.000



This is using R-devel (to be 1.9.0) on a 2.0 GHz Pentium-4 desktop
computer running Linux with Goto's BLAS.  I'm not sure exactly which
of the changes from your system is responsible for the much faster
execution, but it is definitely not all due to the processor speed.
My guess is that most of the gain comes from the optimized BLAS:
Goto's BLAS are a big win on a Pentium-4 under Linux.  (Thanks to
Brian Ripley for modifying the configure script for R to accept
--with-blas=-lgoto.)

Corresponding timings on an Athlon XP 2500+ (1.83 GHz) running Linux
with ATLAS are

> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 1.29 0.04 1.34 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.88 0.06 0.95 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.79 0.05 0.85 0.00 0.00
> system.time(A1 <- A%*%solve(t(BA)%*%BA+C.)%*%BA%*%D)
[1] 0.82 0.04 0.87 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.61 0.06 0.67 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.66 0.02 0.69 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.51 0.10 0.61 0.00 0.00
> system.time(A2 <- A%*%solve(crossprod(BA)+C., crossprod(t(BA), D)))
[1] 0.59 0.10 0.71 0.00 0.00

There you can see the faster execution of the second and subsequent
timings.

I completely agree with you that using crossprod and the non-inverse
form of solve, where appropriate, helps.  However, one of the best
optimizations for numerical linear algebra calculations is the use of
an optimized BLAS.  (I will avoid going into the Linux vs. Windows
comparisons :-)
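
(To distill the first point in a toy example, with made-up names M and
b, and separate from any BLAS effect: solve(M, b) factors M once and
back-solves for b, while solve(M) %*% b first computes the full
inverse, an n-column solve, and then multiplies:)

M <- crossprod(matrix(rnorm(400^2), 400, 400)) + diag(400)
b <- matrix(rnorm(400*3), 400, 3)
x1 <- solve(M) %*% b   # explicit inverse, then multiply
x2 <- solve(M, b)      # direct solve: fewer flops, better accuracy
all.equal(x1, x2)      # TRUE, up to rounding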


