Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-22 Thread Matthew Brett
Hi,

 I'm happy to write the doctests as tests.   My feeling is there is no
 objection to this function at the moment, so it would be reasonable,
 unless I hear otherwise, to commit to SVN.

Committed - with tests in tests_linalg.py - in revision 8029

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread David Warde-Farley
Hi Gael,

On 16-Dec-09, at 2:16 PM, Gael Varoquaux wrote:

 I was under the impression that we should
 direct users who have linalg problems to scipy, as it can do much  
 more.


I agree about pushing users in that direction, but I think that's  
mostly a consequence of all the wrapped Fortran routines that already  
exist there. If this thing can be implemented in a short Python  
function I don't see the harm in having it in NumPy.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread Fernando Perez
On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold jsseab...@gmail.com wrote:
 Presumably the doctests should be turned into
 actual tests (noting Robert's comment) to make it more likely that it
 gets in

Just curious: is there a policy against pure doctests in numpy? I've
always found that doctests and 'real tests' serve complementary
purposes and are both useful:

- small, clear tests that make for *illustrative* examples for the end
user should be left in the docstring, and picked up by the test suite
as normal doctests.

- tests with a lot more logic that get cumbersome to write as doctests
can go into 'normal' tests into the test suite.

- There's also a valid third category: for cases where it's convenient
to write the test interactively but one doesn't want the example in
the main docstring, putting a function in the test suite that simply
has the doctest as a docstring works (it may require a little
decorator, I don't recall).

I'm just wondering if there's a policy of requiring that all tests
become non-doctests...

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread Charles R Harris
On Fri, Dec 18, 2009 at 8:21 PM, Fernando Perez fperez@gmail.comwrote:

 On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold jsseab...@gmail.com
 wrote:
  Presumably the doctests should be turned into
  actual tests (noting Robert's comment) to make it more likely that it
  gets in

 Just curious: is there a policy against pure doctests in numpy? I've
 always found that doctests and 'real tests' serve complementary
 purposes and are both useful:

 - small, clear tests that make for *illustrative* examples for the end
 user should be left in the docstring, and picked up by the test suite
 as normal doctests.

 - tests with a lot more logic that get cumbersome to write as doctests
 can go into 'normal' tests into the test suite.

 - There's also a valid third category: for cases where it's convenient
 to write the test interactively but one doesn't want the example in
 the main docstring, putting a function in the test suite that simply
 has the doctest as a docstring works (it may require a little
 decorator, I don't recall).

 I'm just wondering if there's a policy of requiring that all tests
 become non-doctests...


I don't think there is a policy, but there is a growing tradition to put
serious tests in a test suite and avoid making them doctests. Functional
tests need to be easy to read, maintain, and document, and doctests don't
really fit that description. Your policy for examples in docstrings looks
like a good one, though.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread josef . pktd
On Fri, Dec 18, 2009 at 10:21 PM, Fernando Perez fperez@gmail.com wrote:
 On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold jsseab...@gmail.com wrote:
 Presumably the doctests should be turned into
 actual tests (noting Robert's comment) to make it more likely that it
 gets in

 Just curious: is there a policy against pure doctests in numpy? I've
 always found that doctests and 'real tests' serve complementary
 purposes and are both useful:

 - small, clear tests that make for *illustrative* examples for the end
 user should be left in the docstring, and picked up by the test suite
 as normal doctests.

 - tests with a lot more logic that get cumbersome to write as doctests
 can go into 'normal' tests into the test suite.

 - There's also a valid third category: for cases where it's convenient
 to write the test interactively but one doesn't want the example in
 the main docstring, putting a function in the test suite that simply
 has the doctest as a docstring works (it may require a little
 decorator, I don't recall).

 I'm just wondering if there's a policy of requiring that all tests
 become non-doctests...

doctests have cross platform rendering/printing/formatting problems

1e-01 versus 1e-001  there was a test failure recently with, I think, cheby
also nans render differently, even scalar nan versus nans in arrays.

I don't know about differences across versions of python, numpy, ...
but doctests are very fragile for numbers because they also test the formatting.

Also, the precision is not as easy to control on a test-by-test basis
than with assert_almost_equal or similar and easier to adjust when the
implementation changes (e.g. in stats).

I think, doctests are faster to write but more work to maintain.

But I don't know of any official policy,
http://projects.scipy.org/numpy/wiki/TestingGuidelines#doctests
explains how to do them but no recommendation whether to use them or not.

Josef



 Cheers,

 f
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread Robert Kern
On Fri, Dec 18, 2009 at 21:21, Fernando Perez fperez@gmail.com wrote:
 On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold jsseab...@gmail.com wrote:
 Presumably the doctests should be turned into
 actual tests (noting Robert's comment) to make it more likely that it
 gets in

 Just curious: is there a policy against pure doctests in numpy? I've
 always found that doctests and 'real tests' serve complementary
 purposes and are both useful:

 - small, clear tests that make for *illustrative* examples for the end
 user should be left in the docstring, and picked up by the test suite
 as normal doctests.

 - tests with a lot more logic that get cumbersome to write as doctests
 can go into 'normal' tests into the test suite.

 - There's also a valid third category: for cases where it's convenient
 to write the test interactively but one doesn't want the example in
 the main docstring, putting a function in the test suite that simply
 has the doctest as a docstring works (it may require a little
 decorator, I don't recall).

 I'm just wondering if there's a policy of requiring that all tests
 become non-doctests...

My policy and rationale, which I believe is reflected in the docstring
standard, is that examples in the docstrings should put pedagogical
concerns above all others. In my experience, a properly robust doctest
sacrifices the readability, clarity, and terseness of a good example.
Thus, not all examples run as doctests, so docstrings are not added to
numpy's test suite. For example, for floating point functions, one
should use allclose(), etc. to test the results against the gold
standard. However, using that in the example makes it less clear what
is going on. Compare:

In [2]: np.sin(np.linspace(0, 2*np.pi, 10))
Out[2]:
array([  0.e+00,   6.42787610e-01,   9.84807753e-01,
 8.66025404e-01,   3.42020143e-01,  -3.42020143e-01,
-8.66025404e-01,  -9.84807753e-01,  -6.42787610e-01,
-2.44929360e-16])

In [4]: np.allclose(np.sin(np.linspace(0, 2*np.pi, 10)), array([0.0,
.642787610, .984807753, .866025404, .342020143, -.342020143,
-.866025404, -.984807753, -.642787610, 0.0]))
Out[4]: True

I certainly don't mind properly written doctests outside of the real
docstrings (e.g. in particular files under test/ that just contain
doctests); they're a handy way to write certain tests although they
have well known and annoying limits for numerical work. I just don't
want documentation examples to be doctests.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-18 Thread Fernando Perez
On Fri, Dec 18, 2009 at 8:10 PM, Robert Kern robert.k...@gmail.com wrote:
 My policy and rationale, which I believe is reflected in the docstring
 standard, is that examples in the docstrings should put pedagogical
 concerns above all others. In my experience, a properly robust doctest
 sacrifices the readability, clarity, and terseness of a good example.
 Thus, not all examples run as doctests, so docstrings are not added to
 numpy's test suite. For example, for floating point functions, one
 should use allclose(), etc. to test the results against the gold
 standard.

I think we mostly agree, up to a point: I also emphasized pedagogical
value, but I think it would be better in the long run if we could also
run all examples as part of the test suite.  This would act as a
protection against examples going stale due to changes in the
underlying code.  A false example is worse than no example at all.

I wonder if this couldn't be addressed by simply having a suitable set
of printing options wrapped up in a utility that doctests could all
use (4 digits only, setting certain flags, linewidth, etc).  This
could help with most of the problems with robustness and maintenance,
while allowing us to ensure that we can always guarantee that examples
actually do work.  Just a thought.

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-16 Thread Skipper Seabold
On Tue, Dec 15, 2009 at 3:16 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Is it reasonable to summarize that, to avoid confusion, we keep
 'matrix_rank' as the name?

 I've edited as Robert suggested, attempting to adopt a more suitable
 tone in the docstring...

 Thanks a lot,

 Matthew



What comes next when someone offers up a useful function like this?
We are using an earlier version of in statsmodels and wouldn't mind
seeing this in numpy.  Presumably the doctests should be turned into
actual tests (noting Robert's comment) to make it more likely that it
gets in and an enhancement ticket should be filed?

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-16 Thread Matthew Brett
Hi,

 Is it reasonable to summarize that, to avoid confusion, we keep
 'matrix_rank' as the name?

 I've edited as Robert suggested, attempting to adopt a more suitable
 tone in the docstring...

 What comes next when someone offers up a useful function like this?
 We are using an earlier version of in statsmodels and wouldn't mind
 seeing this in numpy.  Presumably the doctests should be turned into
 actual tests (noting Robert's comment) to make it more likely that it
 gets in and an enhancement ticket should be filed?

I'm happy to write the doctests as tests.   My feeling is there is no
objection to this function at the moment, so it would be reasonable,
unless I hear otherwise, to commit to SVN.

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-16 Thread Skipper Seabold
On Wed, Dec 16, 2009 at 2:13 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Is it reasonable to summarize that, to avoid confusion, we keep
 'matrix_rank' as the name?

 I've edited as Robert suggested, attempting to adopt a more suitable
 tone in the docstring...

 What comes next when someone offers up a useful function like this?
 We are using an earlier version of in statsmodels and wouldn't mind
 seeing this in numpy.  Presumably the doctests should be turned into
 actual tests (noting Robert's comment) to make it more likely that it
 gets in and an enhancement ticket should be filed?

 I'm happy to write the doctests as tests.   My feeling is there is no
 objection to this function at the moment, so it would be reasonable,
 unless I hear otherwise, to commit to SVN.


Sounds good.  I didn't know you had commit privileges, and I didn't
want to see this get lost in the shuffle.

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-16 Thread Gael Varoquaux
On Wed, Dec 16, 2009 at 02:13:08PM -0500, Matthew Brett wrote:
 I'm happy to write the doctests as tests.   My feeling is there is no
 objection to this function at the moment, so it would be reasonable,
 unless I hear otherwise, to commit to SVN.

I have one small comment: I am really happy to see you working on this
function. It will be very useful, and its great to have it in a
generaly-available package. However, I wonder whether it belongs to
numpy.linalg, or scipy.linalg. I was under the impression that we should
direct users who have linalg problems to scipy, as it can do much more.

My 2 cents,

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-16 Thread Matthew Brett
Hi,

On Wed, Dec 16, 2009 at 2:16 PM, Gael Varoquaux
gael.varoqu...@normalesup.org wrote:
 On Wed, Dec 16, 2009 at 02:13:08PM -0500, Matthew Brett wrote:
 I'm happy to write the doctests as tests.   My feeling is there is no
 objection to this function at the moment, so it would be reasonable,
 unless I hear otherwise, to commit to SVN.

 I have one small comment: I am really happy to see you working on this
 function. It will be very useful, and its great to have it in a
 generaly-available package. However, I wonder whether it belongs to
 numpy.linalg, or scipy.linalg. I was under the impression that we should
 direct users who have linalg problems to scipy, as it can do much more.

It's another option, and one I thought of, but in this case, the use
is so ubiquitous in linear algebra, and the function so small, that it
would seem a pity to require installing scipy to get it.

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Matthew Brett
Hi,

Following on from the occasional discussion on the list, can I propose
a small matrix_rank function for inclusion in numpy/linalg?

I suggest it because it seems rather a basic need for linear algebra,
and it's very small and simple...

I've appended an implementation with some doctests in the hope that it
will be acceptable,

Robert - I hope you don't mind me quoting you in the notes.

Thanks a lot,

Matthew


def matrix_rank(M, tol=None):
''' Return rank of matrix using SVD method

Rank of the array is the number of SVD singular values of the
array that are greater than `tol`.

Parameters
--
M : array-like
array of =2 dimensions
tol : {None, float}
 threshold below which SVD values are considered zero. If `tol`
 is None, and `S` is an array with singular values for `M`, and
 `eps` is the epsilon value for datatype of `S`, then `tol` set
 to ``S.max() * eps``.

Examples

 matrix_rank(np.eye(4)) # Full rank matrix
4
 matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix
4
 matrix_rank(np.zeros((4,4))) # All zeros - zero rank
0
 matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
1
 matrix_rank(np.zeros((4,)))
0
 matrix_rank([1]) # accepts array-like
1

Notes
-
Golub and van Loan define numerical rank deficiency as using
tol=eps*S[0] (note that S[0] is the maximum singular value and thus
the 2-norm of the matrix). There really is not one definition, much
like there isn't a single definition of the norm of a matrix. For
example, if your data come from uncertain measurements with
uncertainties greater than floating point epsilon, choosing a
tolerance of about the uncertainty is probably a better idea (the
tolerance may be absolute if the uncertainties are absolute rather
than relative, even). When floating point roundoff is your concern,
then numerical rank deficiency is a better concept, but exactly
what the relevant measure of the tolerance is depends on the
operations you intend to do with your matrix. [RK, numpy mailing
list]

References
--
Matrix Computations by Golub and van Loan
'''
M = np.asarray(M)
if M.ndim  2:
raise TypeError('array should have 2 or fewer dimensions')
if M.ndim  2:
return int(not np.all(M==0))
S = npl.svd(M, compute_uv=False)
if tol is None:
tol = S.max() * np.finfo(S.dtype).eps
return np.sum(S  tol)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread josef . pktd
On Tue, Dec 15, 2009 at 12:01 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Following on from the occasional discussion on the list, can I propose
 a small matrix_rank function for inclusion in numpy/linalg?

 I suggest it because it seems rather a basic need for linear algebra,
 and it's very small and simple...

 I've appended an implementation with some doctests in the hope that it
 will be acceptable,

 Robert - I hope you don't mind me quoting you in the notes.

 Thanks a lot,

 Matthew


 def matrix_rank(M, tol=None):
    ''' Return rank of matrix using SVD method

    Rank of the array is the number of SVD singular values of the
    array that are greater than `tol`.

    Parameters
    --
    M : array-like
        array of =2 dimensions
    tol : {None, float}
         threshold below which SVD values are considered zero. If `tol`
         is None, and `S` is an array with singular values for `M`, and
         `eps` is the epsilon value for datatype of `S`, then `tol` set
         to ``S.max() * eps``.

    Examples
    
     matrix_rank(np.eye(4)) # Full rank matrix
    4
     matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix
    4
     matrix_rank(np.zeros((4,4))) # All zeros - zero rank
    0
     matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
    1
     matrix_rank(np.zeros((4,)))
    0
     matrix_rank([1]) # accepts array-like
    1

    Notes
    -
    Golub and van Loan define numerical rank deficiency as using
    tol=eps*S[0] (note that S[0] is the maximum singular value and thus
    the 2-norm of the matrix). There really is not one definition, much
    like there isn't a single definition of the norm of a matrix. For
    example, if your data come from uncertain measurements with
    uncertainties greater than floating point epsilon, choosing a
    tolerance of about the uncertainty is probably a better idea (the
    tolerance may be absolute if the uncertainties are absolute rather
    than relative, even). When floating point roundoff is your concern,
    then numerical rank deficiency is a better concept, but exactly
    what the relevant measure of the tolerance is depends on the
    operations you intend to do with your matrix. [RK, numpy mailing
    list]

    References
    --
    Matrix Computations by Golub and van Loan
    '''
    M = np.asarray(M)
    if M.ndim  2:
        raise TypeError('array should have 2 or fewer dimensions')
    if M.ndim  2:
        return int(not np.all(M==0))
    S = npl.svd(M, compute_uv=False)
    if tol is None:
        tol = S.max() * np.finfo(S.dtype).eps
    return np.sum(S  tol)
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


This was missing from numpy compared to matlab and gauss.

If we put it in linalg next to np.linalg.cond,  then we could shorten
the name to `rank`, since the meaning of np.linalg.rank should be
pretty obvious.

Josef
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Bruce Southey
On 12/15/2009 11:12 AM, josef.p...@gmail.com wrote:
 On Tue, Dec 15, 2009 at 12:01 PM, Matthew Brettmatthew.br...@gmail.com  
 wrote:

 Hi,

 Following on from the occasional discussion on the list, can I propose
 a small matrix_rank function for inclusion in numpy/linalg?

 I suggest it because it seems rather a basic need for linear algebra,
 and it's very small and simple...

 I've appended an implementation with some doctests in the hope that it
 will be acceptable,

 Robert - I hope you don't mind me quoting you in the notes.

 Thanks a lot,

 Matthew


 def matrix_rank(M, tol=None):
 ''' Return rank of matrix using SVD method

 Rank of the array is the number of SVD singular values of the
 array that are greater than `tol`.

 Parameters
 --
 M : array-like
 array of=2 dimensions
 tol : {None, float}
  threshold below which SVD values are considered zero. If `tol`
  is None, and `S` is an array with singular values for `M`, and
  `eps` is the epsilon value for datatype of `S`, then `tol` set
  to ``S.max() * eps``.

 Examples
 
   matrix_rank(np.eye(4)) # Full rank matrix
 4
   matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix
 4
   matrix_rank(np.zeros((4,4))) # All zeros - zero rank
 0
   matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
 1
   matrix_rank(np.zeros((4,)))
 0
   matrix_rank([1]) # accepts array-like
 1

 Notes
 -
 Golub and van Loan define numerical rank deficiency as using
 tol=eps*S[0] (note that S[0] is the maximum singular value and thus
 the 2-norm of the matrix). There really is not one definition, much
 like there isn't a single definition of the norm of a matrix. For
 example, if your data come from uncertain measurements with
 uncertainties greater than floating point epsilon, choosing a
 tolerance of about the uncertainty is probably a better idea (the
 tolerance may be absolute if the uncertainties are absolute rather
 than relative, even). When floating point roundoff is your concern,
 then numerical rank deficiency is a better concept, but exactly
 what the relevant measure of the tolerance is depends on the
 operations you intend to do with your matrix. [RK, numpy mailing
 list]

 References
 --
 Matrix Computations by Golub and van Loan
 '''
 M = np.asarray(M)
 if M.ndim  2:
 raise TypeError('array should have 2 or fewer dimensions')
 if M.ndim  2:
 return int(not np.all(M==0))
 S = npl.svd(M, compute_uv=False)
 if tol is None:
 tol = S.max() * np.finfo(S.dtype).eps
 return np.sum(S  tol)
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

  
 This was missing from numpy compared to matlab and gauss.

 If we put it in linalg next to np.linalg.cond,  then we could shorten
 the name to `rank`, since the meaning of np.linalg.rank should be
 pretty obvious.

 Josef
 ___

+1 for the function but we can not shorten the name because of existing 
numpy.rank() function.


Bruce
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Matthew Brett
Hi,

 +1 for the function but we can not shorten the name because of existing
 numpy.rank() function.

I don't feel strongly about the name, but I imagine you could do

from numpy.linalg import rank as matrix_rank

if you weren't using the numpy.linalg namespace already...

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Alan G Isaac
On 12/15/2009 1:39 PM, Bruce Southey wrote:
 +1 for the function but we can not shorten the name because of existing
 numpy.rank() function.

1. Is it a rule that there cannot be a name duplication
in this different namespace?
2. Is there a commitment to keeping both np.rank and np.ndim?
(I.e., can np.rank never be deprecated?)

If the answers are both 'yes',
then perhaps linalg.rank2d is a possible shorter name.

Alan Isaac
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Bruce Southey
On 12/15/2009 12:47 PM, Alan G Isaac wrote:
 On 12/15/2009 1:39 PM, Bruce Southey wrote:

 +1 for the function but we can not shorten the name because of existing
 numpy.rank() function.
  
 1. Is it a rule that there cannot be a name duplication
 in this different namespace?

In my view this is still the same numpy namespace. An example of the 
potential problems is just using an incorrect import statement somewhere:
from numpy import rank
instead of
from numpy.linalg import rank

For a package you control, you should really prevent this type of user 
mistake.


 2. Is there a commitment to keeping both np.rank and np.ndim?
 (I.e., can np.rank never be deprecated?)


I do not see that is practical because of the number of releases to 
actually remove a function.
Also the current rank function has existed for a very long time in 
Numerical Python (as it is present in Numeric). So it could be confusing 
for a user to think that the function just has been moved rather than 
being a different function.

 If the answers are both 'yes',
 then perhaps linalg.rank2d is a possible shorter name.

 Alan Isaac
 ___

0

Actually I do interpret rank in terms of linear algebra definition, but 
obviously other people have other meanings. I


Bruce
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Robert Kern
On Tue, Dec 15, 2009 at 11:01, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Following on from the occasional discussion on the list, can I propose
 a small matrix_rank function for inclusion in numpy/linalg?

 I suggest it because it seems rather a basic need for linear algebra,
 and it's very small and simple...

 I've appended an implementation with some doctests in the hope that it
 will be acceptable,

I think you need a real example of a nontrivial numerically
rank-deficient matrix. Note that c_[eye(4), eye(4)] is actually a
full-rank matrix. A matrix is full rank if its numerical rank is equal
to min(rows, cols) not max(rows, cols). Taking I=eye(4); I[-1,-1] =
0.0 should be a sufficient example.

 Robert - I hope you don't mind me quoting you in the notes.

I certainly. However, you do not need to cite me; I'm in the authors
list already. On the other hand, you probably shouldn't copy-and-paste
anything I write on the mailing list to use in a docstring. On the
mailing list, I am answering a particular question and use a different
voice than is appropriate for a docstring.

Also, a full citation of Golub and Van Loan would be appropriate:

.. [1] G. H. Golub and C. F. Van Loan, _Matrix Computations_.
Baltimore: Johns Hopkins University Press, 1996.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for matrix_rank function in numpy

2009-12-15 Thread Matthew Brett
Hi,

Is it reasonable to summarize that, to avoid confusion, we keep
'matrix_rank' as the name?

I've edited as Robert suggested, attempting to adopt a more suitable
tone in the docstring...

Thanks a lot,

Matthew


def matrix_rank(M, tol=None):
''' Return rank of matrix using SVD method

Rank of the array is the number of SVD singular values of the
array that are greater than `tol`.

Parameters
--
M : array-like
array of =2 dimensions
tol : {None, float}
 threshold below which SVD values are considered zero. If `tol`
 is None, and `S` is an array with singular values for `M`, and
 `eps` is the epsilon value for datatype of `S`, then `tol` set
 to ``S.max() * eps``.

Examples

 matrix_rank(np.eye(4)) # Full rank matrix
4
 I=np.eye(4); I[-1,-1] = 0. # rank deficient matrix
 matrix_rank(I)
3
 matrix_rank(np.zeros((4,4))) # All zeros - zero rank
0
 matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
1
 matrix_rank(np.zeros((4,)))
0
 matrix_rank([1]) # accepts array-like
1

Notes
-
Golub and van Loan [1]_ define numerical rank deficiency as using
tol=eps*S[0] (where S[0] is the maximum singular value and thus the
2-norm of the matrix). This is one definition of rank deficiency,
and the one we use here.  When floating point roundoff is the main
concern, then numerical rank deficiency is a reasonable choice. In
some cases you may prefer other definitions. The most useful measure
of the tolerance depends on the operations you intend to use on your
matrix. For example, if your data come from uncertain measurements
with uncertainties greater than floating point epsilon, choosing a
tolerance near that uncertainty may be preferable.  The tolerance
may be absolute if the uncertainties are absolute rather than
relative.

References
--
.. [1] G. H. Golub and C. F. Van Loan, _Matrix Computations_.
Baltimore: Johns Hopkins University Press, 1996.
'''
M = np.asarray(M)
if M.ndim  2:
raise TypeError('array should have 2 or fewer dimensions')
if M.ndim  2:
return int(not np.all(M==0))
S = npl.svd(M, compute_uv=False)
if tol is None:
tol = S.max() * np.finfo(S.dtype).eps
return np.sum(S  tol)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion