Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread Anne Archibald
2009/12/23 David Goldsmith d.l.goldsm...@gmail.com:
 Starting a new thread for this.

 On Tue, Dec 22, 2009 at 7:13 PM, Anne Archibald
 peridot.face...@gmail.com wrote:

 I think we have one major lacuna: vectorized linear algebra. If I have
 to solve a whole whack of four-dimensional linear systems, right now I
 need to either write a python loop and use linear algebra on them one
 by one, or implement my own linear algebra. It's a frustrating lacuna,
 because all the machinery is there: generalized ufuncs and LAPACK
 wrappers. Somebody just needs to glue them together. I've even tried
 making a start on it, but numpy's ufunc machinery and generic type
 system is just too much of a pain for me to make any progress as is.

 Please be more specific: what (which aspects) have been too much of a
 pain?  (I ask out of ignorance, not out of challenging your
 opinion/experience.)

It's been a little while since I took a really close look at it, but
I'll try to describe the problems I had. Chiefly I had problems with
documentation - the only way I could figure out how to build
additional gufuncs was monkey-see-monkey-do, just copying an existing
one in an existing file and hoping the build system figured it out. It
was also not at all clear how to, say, link to LAPACK, let alone
decide based on input types which arguments to promote and how to call
out to LAPACK.

I'm not saying this is impossible, just that it was enough frustrating
no-progress to defeat my initial hey, I could do that impulse.

 I think if someone wanted to start building a low-level

 Again, please be more specific: what do you mean by this?  (I know
 generally what is meant by low level, but I'd like you to spell out
 a little more fully what you mean by this in this context.)

Sure. Let me first say that all this is kind of beside the point - the
hard part is not designing an API, so it's a bit silly to dream up an
API without implementing anything.

I had pictured two interfaces to the vectorized linear algebra code.
The first would simply provide more-or-less direct access to
vectorized versions of the linear algebra functions we have now, with
no dimension inference. Thus inv, pinv, svd, lu factor, lu solve, et
cetera - but not dot. Dot would have to be split up into
vector-vector, vector-matrix, matrix-vector, and matrix-matrix
products, since one can no longer use the dimensionality of the inputs
to figure out what is wanted. The key idea would be that the linear
algebra dimensions would always be the last one(s); this is fairly
easy to arrange with rollaxis when it isn't already true, would tend
to reduce copying on input to LAPACK, and is what the gufunc API
wants.

This is mostly what I meant by low-level. (A second generation would
do things like combine many vector-vector products into a single
LAPACK matrix-vector product.)

The higher-level API I was imagining - remember, vaporware here - had
a Matrix and a Vector class, each holding an arbitrarily-dimensioned
array of the relevant object. The point of this is to avoid having to
constantly specify whether you want a matrix-vector or matrix-matrix
product; it also tidily avoids the always-two-dimensional nuisance of
the current matrix API.


Anne
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread David Warde-Farley
On 23-Dec-09, at 10:34 AM, Anne Archibald wrote:

 It's been a little while since I took a really close look at it, but
 I'll try to describe the problems I had. Chiefly I had problems with
 documentation - the only way I could figure out how to build
 additional gufuncs was monkey-see-monkey-do, just copying an existing
 one in an existing file and hoping the build system figured it out. It
 was also not at all clear how to, say, link to LAPACK, let alone
 decide based on input types which arguments to promote and how to call
 out to LAPACK.

I tried to create a new generalized ufunc (a logsumexp to go with  
logaddexp, so as to avoid all the needless exp's and log's that would  
be incurred by logaddexp.reduce) and had exactly the same problem. I  
did get it to build but it was misbehaving (returning an array of the  
same size as the input) and I couldn't figure out quite why. I agree  
that the documentation is lacking, but I think it's (rightly) a low  
priority in the midst of the release candidate.

 The key idea would be that the linear
 algebra dimensions would always be the last one(s); this is fairly
 easy to arrange with rollaxis when it isn't already true, would tend
 to reduce copying on input to LAPACK, and is what the gufunc API
 wants.

Would it actually reduce copying if you were using default C-ordered  
arrays? Maybe I'm mistaken but I thought one almost always had to copy  
in order to translate C to Fortran order except for a few functions  
that can take row-ordered stuff.

Otherwise, +1 all the way.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread David Goldsmith
On Wed, Dec 23, 2009 at 10:30 AM, David Warde-Farley d...@cs.toronto.edu 
wrote:
 On 23-Dec-09, at 10:34 AM, Anne Archibald wrote:

 It's been a little while since I took a really close look at it, but
 I'll try to describe the problems I had. Chiefly I had problems with
 documentation - the only way I could figure out how to build
 additional gufuncs was monkey-see-monkey-do, just copying an existing
 one in an existing file and hoping the build system figured it out. It
 was also not at all clear how to, say, link to LAPACK, let alone
 decide based on input types which arguments to promote and how to call
 out to LAPACK.

 I tried to create a new generalized ufunc (a logsumexp to go with
 logaddexp, so as to avoid all the needless exp's and log's that would
 be incurred by logaddexp.reduce) and had exactly the same problem. I
 did get it to build but it was misbehaving (returning an array of the
 same size as the input) and I couldn't figure out quite why. I agree
 that the documentation is lacking, but I think it's (rightly) a low
 priority in the midst of the release candidate.

Thanks Anne (and Dave): it may seem to you to be a bit silly to dream
up an API without implementing anything, but I think it's useful to
get these things on the record so to speak, and as a person charged
with being especially concerned w/ the doc, it's particularly
important for me to hear when its specific deficiencies are
productivity blockers...

 The key idea would be that the linear
 algebra dimensions would always be the last one(s); this is fairly
 easy to arrange with rollaxis when it isn't already true, would tend
 to reduce copying on input to LAPACK, and is what the gufunc API
 wants.

 Would it actually reduce copying if you were using default C-ordered
 arrays? Maybe I'm mistaken but I thought one almost always had to copy
 in order to translate C to Fortran order except for a few functions
 that can take row-ordered stuff.

 Otherwise, +1 all the way.

...and of course, discussing these things here begins a dialog that
can be the beginning of getting these improvements made - not
necessarily by you... :-)

Thanks again, for humoring me

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread Anne Archibald
2009/12/23 David Warde-Farley d...@cs.toronto.edu:
 On 23-Dec-09, at 10:34 AM, Anne Archibald wrote:

 The key idea would be that the linear
 algebra dimensions would always be the last one(s); this is fairly
 easy to arrange with rollaxis when it isn't already true, would tend
 to reduce copying on input to LAPACK, and is what the gufunc API
 wants.

 Would it actually reduce copying if you were using default C-ordered
 arrays? Maybe I'm mistaken but I thought one almost always had to copy
 in order to translate C to Fortran order except for a few functions
 that can take row-ordered stuff.

That's a good point. But even if you need to do a transpose, it'll be
faster to transpose data in a contiguous block than data scattered all
over memory. Maybe more to the point, broadcasting adds axes to the
beginning, so that (say) two-dimensional arrays can act as matrix
scalars.

Anne
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread David Warde-Farley

On 23-Dec-09, at 2:19 PM, David Goldsmith wrote:

 Thanks Anne (and Dave): it may seem to you to be a bit silly to dream
 up an API without implementing anything, but I think it's useful to
 get these things on the record so to speak, and as a person charged
 with being especially concerned w/ the doc, it's particularly
 important for me to hear when its specific deficiencies are
 productivity blockers...

In fact, there are gufuncs in the tests that are quite instructive and  
would form the basis of good documentation, though not enough of them  
to give a complete picture of what the generalized ufunc architecture  
can do (I remember looking for an example of a particular supported  
pattern and coming up short, though I can't for the life of me  
remember which).

The existing documentation, plus source code from the umath_tests  
module marked up descriptively (what all the parameters do, especially  
the ones which currently receive magic numbers) would probably be the  
way to go down the road.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread David Goldsmith
On Wed, Dec 23, 2009 at 2:26 PM, David Warde-Farley d...@cs.toronto.edu wrote:

 On 23-Dec-09, at 2:19 PM, David Goldsmith wrote:

 Thanks Anne (and Dave): it may seem to you to be a bit silly to dream
 up an API without implementing anything, but I think it's useful to
 get these things on the record so to speak, and as a person charged
 with being especially concerned w/ the doc, it's particularly
 important for me to hear when its specific deficiencies are
 productivity blockers...

 In fact, there are gufuncs in the tests that are quite instructive and
 would form the basis of good documentation, though not enough of them
 to give a complete picture of what the generalized ufunc architecture
 can do (I remember looking for an example of a particular supported
 pattern and coming up short,

If you came up short, how/why are you certain that the existing arch
would support it?

 though I can't for the life of me
 remember which).

 The existing documentation, plus source code from the umath_tests
 module marked up descriptively (what all the parameters do, especially
 the ones which currently receive magic numbers) would probably be the
 way to go down the road.

 David

Perfect, David!  Thanks...

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)

2009-12-23 Thread David Warde-Farley
On Wed, Dec 23, 2009 at 05:30:16PM -0800, David Goldsmith wrote:
 On Wed, Dec 23, 2009 at 2:26 PM, David Warde-Farley d...@cs.toronto.edu 
 wrote:
 
  On 23-Dec-09, at 2:19 PM, David Goldsmith wrote:
 
  Thanks Anne (and Dave): it may seem to you to be a bit silly to dream
  up an API without implementing anything, but I think it's useful to
  get these things on the record so to speak, and as a person charged
  with being especially concerned w/ the doc, it's particularly
  important for me to hear when its specific deficiencies are
  productivity blockers...
 
  In fact, there are gufuncs in the tests that are quite instructive and
  would form the basis of good documentation, though not enough of them
  to give a complete picture of what the generalized ufunc architecture
  can do (I remember looking for an example of a particular supported
  pattern and coming up short,
 
 If you came up short, how/why are you certain that the existing arch
 would support it?

The existing documentation made the capabilities of generalized ufuncs
pretty clear, however not much is demonstrated in terms of the appropriate
C API (or code generator) constructs.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion