On 12/15/2009 11:12 AM, josef.p...@gmail.com wrote: > On Tue, Dec 15, 2009 at 12:01 PM, Matthew Brett<matthew.br...@gmail.com> > wrote: > >> Hi, >> >> Following on from the occasional discussion on the list, can I propose >> a small matrix_rank function for inclusion in numpy/linalg? >> >> I suggest it because it seems rather a basic need for linear algebra, >> and it's very small and simple... >> >> I've appended an implementation with some doctests in the hope that it >> will be acceptable, >> >> Robert - I hope you don't mind me quoting you in the notes. >> >> Thanks a lot, >> >> Matthew >> >> >> def matrix_rank(M, tol=None): >> ''' Return rank of matrix using SVD method >> >> Rank of the array is the number of SVD singular values of the >> array that are greater than `tol`. >> >> Parameters >> ---------- >> M : array-like >> array of<=2 dimensions >> tol : {None, float} >> threshold below which SVD values are considered zero. If `tol` >> is None, and `S` is an array with singular values for `M`, and >> `eps` is the epsilon value for datatype of `S`, then `tol` set >> to ``S.max() * eps``. >> >> Examples >> -------- >> >>> matrix_rank(np.eye(4)) # Full rank matrix >> 4 >> >>> matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix >> 4 >> >>> matrix_rank(np.zeros((4,4))) # All zeros - zero rank >> 0 >> >>> matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0 >> 1 >> >>> matrix_rank(np.zeros((4,))) >> 0 >> >>> matrix_rank([1]) # accepts array-like >> 1 >> >> Notes >> ----- >> Golub and van Loan define "numerical rank deficiency" as using >> tol=eps*S[0] (note that S[0] is the maximum singular value and thus >> the 2-norm of the matrix). There really is not one definition, much >> like there isn't a single definition of the norm of a matrix. For >> example, if your data come from uncertain measurements with >> uncertainties greater than floating point epsilon, choosing a >> tolerance of about the uncertainty is probably a better idea (the >> tolerance may be absolute if the uncertainties are absolute rather >> than relative, even). When floating point roundoff is your concern, >> then "numerical rank deficiency" is a better concept, but exactly >> what the relevant measure of the tolerance is depends on the >> operations you intend to do with your matrix. [RK, numpy mailing >> list] >> >> References >> ---------- >> Matrix Computations by Golub and van Loan >> ''' >> M = np.asarray(M) >> if M.ndim> 2: >> raise TypeError('array should have 2 or fewer dimensions') >> if M.ndim< 2: >> return int(not np.all(M==0)) >> S = npl.svd(M, compute_uv=False) >> if tol is None: >> tol = S.max() * np.finfo(S.dtype).eps >> return np.sum(S> tol) >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > This was missing from numpy compared to matlab and gauss. > > If we put it in linalg next to np.linalg.cond, then we could shorten > the name to `rank`, since the meaning of np.linalg.rank should be > pretty obvious. > > Josef > _______________________________________________ > +1 for the function but we can not shorten the name because of existing numpy.rank() function.
Bruce _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion