On 2010-08-30, at 10:19 PM, Dan Elliott wrote:

> You don't think this will choke on a large (e.g. 10K x 10K) covariance 
> matrix?  

That depends. Is it very close to being rank deficient?That would be my main 
concern. NumPy/LAPACK will have no trouble Cholesky-decomposing a matrix this 
big, presuming the matrix is well-behaved/full rank. Depending on your hardware 
it will be slow, but the Cholesky decomposition will be faster (I think usually 
by about a factor of 2) than other methods that do not assume 
positive-definiteness.

At any rate, inverting the matrix explicitly will take longer and be quite a 
bit worse in terms of the eventual accuracy of your result.

> Given what you know about how it computes the log determinant and how the 
> Cholesky decomposition, do you suppose I would be better off using eigen-
> decomposition to do this since I will also get the determinant from the sum 
> of 
> the logs of the eigenvalues?

You could do it this way, though I am not certain of the stability concerns 
(and I'm in the middle of a move so my copy of Golub and Van Loan is packed up 
somewhere). I know that the SVD is used in numpy.leastsq, so it can't be that 
bad. I've never used covariance matrices quite so big that computing the 
determinant was a significant cost.

One thing to note is that you should definitely not be solving the system for 
every single RHS vector separately. scipy.linalg.solve supports matrix RHSes, 
and will only do the matrix factorization (be it LU or Cholesky) once, 
requiring only the O(n^2) forward/backsubstitution to be done repeatedly. This 
will result in significant savings:

In [5]: X = randn(10000,20000)

In [6]: Y = dot(X, X.T)

In [7]: timeit scipy.linalg.solve(Y, randn(10000), sym_pos=True)
10 loops, best of 3: 78.6 s per loop

In [8]: timeit scipy.linalg.solve(Y, randn(10000, 50), sym_pos=True)
10 loops, best of 3: 81.6 s per loop

David
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to