Actually, didn't the original implementation have a couple of bugs?

- A[1:n, k] makes a copy, so I'm not sure you were actually swapping elements 
in the original A
- If A[i,j] < 0, you're storing a negative value in themax, making it easy for 
the next nonnegative value to beat it. You presumably want to store 
abs(A[i,j]).

See attached (which is also faster, on 0.4).

--Tim

On Wednesday, March 25, 2015 11:37:07 AM you wrote:
> If you want it to look nice and are running on 0.4, just switching to
> 
> slice(A, 1:n, k) ↔ slice(A, 1:n, λ)
> 
> should also get you a performance boost (especially for large matrices).
> Obviously you could do even better by devectorizing, but it wouldn't be as
> pretty.
> 
> Off-topic, but your use of unicode for this is very elegant, and eye-opening
> for me.
> 
> Best,
> --Tim
> 
> On Wednesday, March 25, 2015 09:24:09 AM Matt Bauman wrote:
> > The swap could be done without temporaries, but I assume you're also
> > trying
> > to match the look of the pseudocode?
> > 
> > On Wednesday, March 25, 2015 at 11:22:41 AM UTC-4, Jiahao Chen wrote:
> > > Here is some code I wrote for completely pivoted LU factorizations.
> > > Can you make it even faster?
> > > 
> > > Anyone who can demonstrate verifiable speedups (or find bugs relative
> > > to the textbook description) while sticking to pure Julia code wins an
> > > acknowledgment in an upcoming paper I'm writing about Julia, and a
> > > small token of my appreciation with no cash value. :)
> > > 
> > > Reference: G. H. Golub and C. F. Van Loan, Matrix Computations 4/e,
> > > Algorithm 3.4.3, p. 132.
> > > 
> > > Thanks,
> > > 
> > > Jiahao Chen
> > > Staff Research Scientist
> > > MIT Computer Science and Artificial Intelligence Laboratory
x ↔ y = for i=1:length(x) #Define swap function
  x[i], y[i] = y[i], x[i]
end

function idxmaxabs(A, r)
    r1 = r[1]
    μ, λ = r1, r1
    themax = abs(A[r1, r1])
    @inbounds for j in r, i in r
        a = abs(A[i,j])
        if a > themax
            μ, λ, themax = i, j, A[i, j]
        end
    end
    return μ, λ
end

function lucompletepiv!(A)
  n=size(A, 1)
  rowpiv=zeros(Int, n-1)
  colpiv=zeros(Int, n-1)
  for k=1:n-1
    μ, λ = idxmaxabs(A, k:n)
    rowpiv[k] = μ
    slice(A, k, 1:n) ↔ slice(A, μ, 1:n)
    colpiv[k] = λ
    slice(A, 1:n, k) ↔ slice(A, 1:n, λ)
    if A[k,k] ≠ 0
      ρ = k+1:n
      scale!(1/A[k,k], sub(A, ρ, k))
      @inbounds for j in ρ
          Akj = A[k, j]
          @simd for i in ρ
              A[i, j] -= A[i, k] * Akj
          end
      end
    end
  end
  return (A, rowpiv, colpiv)
end

Reply via email to