Actually, didn't the original implementation have a couple of bugs? - A[1:n, k] makes a copy, so I'm not sure you were actually swapping elements in the original A - If A[i,j] < 0, you're storing a negative value in themax, making it easy for the next nonnegative value to beat it. You presumably want to store abs(A[i,j]).
See attached (which is also faster, on 0.4). --Tim On Wednesday, March 25, 2015 11:37:07 AM you wrote: > If you want it to look nice and are running on 0.4, just switching to > > slice(A, 1:n, k) ↔ slice(A, 1:n, λ) > > should also get you a performance boost (especially for large matrices). > Obviously you could do even better by devectorizing, but it wouldn't be as > pretty. > > Off-topic, but your use of unicode for this is very elegant, and eye-opening > for me. > > Best, > --Tim > > On Wednesday, March 25, 2015 09:24:09 AM Matt Bauman wrote: > > The swap could be done without temporaries, but I assume you're also > > trying > > to match the look of the pseudocode? > > > > On Wednesday, March 25, 2015 at 11:22:41 AM UTC-4, Jiahao Chen wrote: > > > Here is some code I wrote for completely pivoted LU factorizations. > > > Can you make it even faster? > > > > > > Anyone who can demonstrate verifiable speedups (or find bugs relative > > > to the textbook description) while sticking to pure Julia code wins an > > > acknowledgment in an upcoming paper I'm writing about Julia, and a > > > small token of my appreciation with no cash value. :) > > > > > > Reference: G. H. Golub and C. F. Van Loan, Matrix Computations 4/e, > > > Algorithm 3.4.3, p. 132. > > > > > > Thanks, > > > > > > Jiahao Chen > > > Staff Research Scientist > > > MIT Computer Science and Artificial Intelligence Laboratory
x ↔ y = for i=1:length(x) #Define swap function x[i], y[i] = y[i], x[i] end function idxmaxabs(A, r) r1 = r[1] μ, λ = r1, r1 themax = abs(A[r1, r1]) @inbounds for j in r, i in r a = abs(A[i,j]) if a > themax μ, λ, themax = i, j, A[i, j] end end return μ, λ end function lucompletepiv!(A) n=size(A, 1) rowpiv=zeros(Int, n-1) colpiv=zeros(Int, n-1) for k=1:n-1 μ, λ = idxmaxabs(A, k:n) rowpiv[k] = μ slice(A, k, 1:n) ↔ slice(A, μ, 1:n) colpiv[k] = λ slice(A, 1:n, k) ↔ slice(A, 1:n, λ) if A[k,k] ≠ 0 ρ = k+1:n scale!(1/A[k,k], sub(A, ρ, k)) @inbounds for j in ρ Akj = A[k, j] @simd for i in ρ A[i, j] -= A[i, k] * Akj end end end end return (A, rowpiv, colpiv) end