Hi, Following on from the occasional discussion on the list, can I propose a small matrix_rank function for inclusion in numpy/linalg?
I suggest it because it seems rather a basic need for linear algebra, and it's very small and simple... I've appended an implementation with some doctests in the hope that it will be acceptable, Robert - I hope you don't mind me quoting you in the notes. Thanks a lot, Matthew def matrix_rank(M, tol=None): ''' Return rank of matrix using SVD method Rank of the array is the number of SVD singular values of the array that are greater than `tol`. Parameters ---------- M : array-like array of <=2 dimensions tol : {None, float} threshold below which SVD values are considered zero. If `tol` is None, and `S` is an array with singular values for `M`, and `eps` is the epsilon value for datatype of `S`, then `tol` set to ``S.max() * eps``. Examples -------- >>> matrix_rank(np.eye(4)) # Full rank matrix 4 >>> matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix 4 >>> matrix_rank(np.zeros((4,4))) # All zeros - zero rank 0 >>> matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0 1 >>> matrix_rank(np.zeros((4,))) 0 >>> matrix_rank([1]) # accepts array-like 1 Notes ----- Golub and van Loan define "numerical rank deficiency" as using tol=eps*S[0] (note that S[0] is the maximum singular value and thus the 2-norm of the matrix). There really is not one definition, much like there isn't a single definition of the norm of a matrix. For example, if your data come from uncertain measurements with uncertainties greater than floating point epsilon, choosing a tolerance of about the uncertainty is probably a better idea (the tolerance may be absolute if the uncertainties are absolute rather than relative, even). When floating point roundoff is your concern, then "numerical rank deficiency" is a better concept, but exactly what the relevant measure of the tolerance is depends on the operations you intend to do with your matrix. [RK, numpy mailing list] References ---------- Matrix Computations by Golub and van Loan ''' M = np.asarray(M) if M.ndim > 2: raise TypeError('array should have 2 or fewer dimensions') if M.ndim < 2: return int(not np.all(M==0)) S = npl.svd(M, compute_uv=False) if tol is None: tol = S.max() * np.finfo(S.dtype).eps return np.sum(S > tol) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion