Tim Wrote:
That second argument is particularly uncompelling, but I think I agree
that in a vacuum swapaxes(-2,-1) would be a better choice for .T than
reversing the axes. However, we're not in a vacuum and there are several
reasons not to do this.
   1. A.T and A.transpose() should really have the same behavior.

There may be a certain economy to that, but I don't see why it should necessarily be so.  Especially if it's agreed that the behavior of .transpose() is not very useful.  The use case for .T is primarily to make linear algebra stuff easier.  If you're doing n-dim stuff and need something specific, you'll use the more general .transpose().

   2. Changing A.transpose would be one more backwards compatibility issue.

Maybe it's a change worth making though, if we are right in saying that the current .transpose() for ndim>2 is hardly ever what you want.

   3. Since, as far as I can tell, there's no concise way of spelling
A.swapaxes(-2,-1) in terms of A.transpose, it would make documenting and
explaining the default case harder.

Huh?  A.swapaxes(-2,-1) is pretty concise.  Why should it have to have an explanation in terms of A.transpose?  Here's the explanation for the documentation: "A.T returns A with the last two axes transposed.  It is equivalent to A.swapaxes(-2,-1).  For a 2-d array, this is the usual matrix transpose."  This is just a non-issue.
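To put that documentation blurb next to running code, here's a sketch of the *proposed* semantics (today's .T reverses all axes, so swapaxes is spelled out explicitly):

```python
import numpy as np

# For a 2-d array, swapping the last two axes is the usual matrix transpose.
A = np.arange(6).reshape(2, 3)
assert np.array_equal(A.swapaxes(-2, -1), A.transpose())

# For higher rank, only the trailing two axes are exchanged,
# unlike A.transpose(), which reverses *all* axes.
B = np.arange(24).reshape(2, 3, 4)
assert B.swapaxes(-2, -1).shape == (2, 4, 3)
assert B.transpose().shape == (4, 3, 2)
```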


Sasha wrote:
> more common to want to swap just two axes, and the last two seem a logical
> choice since a) in the default C-ordering they're the closest together in
> memory and b) they're the axes that are printed contiguously when you say
> "print A".

It all depends on how you want to interpret a rank-K tensor.  You seem
to advocate a view that it is a (K-2)-rank array of matrices and .T is
an element-wise transpose operation. Alternatively I can expect that
it is a matrix of (K-2)-rank arrays and then .T should be
swapaxes(0,1).  Do you have real-life applications of swapaxes(-2,-1)
for rank > 2?

Yep, like Tim said.  The usage is, say, N sets of basis vectors.  Each set of basis vectors is a matrix.  Say I have a different basis associated with each of N points in space.  Usually I'll want to print it out organized by basis-vector set, i.e. look at the matrix associated with each of the points.  So it makes sense to organize it as shape=(N,a,b), so that if I print it I get something that's easy to interpret.  If I set it up as shape=(a,b,N), then what's easiest to see in the print output is all N first basis vectors, all N second basis vectors, etc.  Also, again in a C memory layout, the last two axes are closest in memory, so it's more cache-friendly to have the bits that will usually be used together in computations on the trailing end.  In Matlab (which is Fortran order), I do things the other way, with the N at the end of the shape.  (And note that Matlab prints out the first two axes contiguously.)
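The basis-vector example, sketched in code (the names and shapes here are purely illustrative):

```python
import numpy as np

# Hypothetical setup: N points, each with its own (a, b) basis matrix.
N, a, b = 5, 3, 3
bases = np.random.rand(N, a, b)   # shape (N, a, b): one matrix per point

# swapaxes(-2, -1) transposes each per-point matrix in one shot ...
flipped = bases.swapaxes(-2, -1)
assert flipped.shape == (N, b, a)
assert np.array_equal(flipped[0], bases[0].T)

# ... whereas .transpose() reverses *all* axes, scrambling the grouping.
assert bases.transpose().shape == (b, a, N)
```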

Either way, swapaxes(-2,-1) is more likely to be what you want than .transpose().


> > and swapaxes(-2,-1) is
> > invalid for rank < 2.
> >
>  At least in numpy 0.9.8, it's not invalid, it just doesn't do anything.
>

That's bad.  What sense does it make to swap non-existing axes? Many
people would expect transpose of a vector to be a matrix.  This is the
case in S+ and R.

Well, I would be really happy for .T to return an (N,1) column vector if handed an (N,) 1-d array.  But I'm pretty sure that would raise more furor among the readers of the list than leaving it 1-d.
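For reference, here is how the 1-d case behaves with plain .transpose(), and the explicit reshape that a column-vector-returning .T would save (a hypothetical illustration, not current behavior):

```python
import numpy as np

v = np.arange(4)                 # 1-d array, shape (4,)

# .transpose() on a 1-d array is a no-op: there is only one axis.
assert np.array_equal(v.transpose(), v)

# Getting an (N, 1) column vector takes an explicit reshape (or np.newaxis):
col = v[:, np.newaxis]
assert col.shape == (4, 1)
```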

> > My main objection is that a.T is fairly cryptic
> > - is there any other language that uses attribute for transpose?
>
>
> Does it matter what other languages do?  It's not _that_ cryptic.

If something is clear and natural, chances are it was done before.

The thing is, most other numerical computing languages were designed for doing numerical computing.  They weren't originally designed for writing general-purpose software, like Python was.  So in Matlab, for instance, transpose is a simple single quote.  But that doesn't help us decide what it should be in numpy.

For me prior art is always a useful guide when making a design choice.
 For example, in R, the transpose operation is t(a) and works on rank
<= 2 only, always returning rank-2.

I have serious reservations about a function called t().  x,y,z, and t are probably all in the top 10 variable names in scientific computing.

K (an APL-like language) overloads
unary '+' to do swapaxes(0,1) for rank>=2 and nothing for lower rank.

Hmm.  That's kind of interesting, but it seems like an abuse of notation to me.  And precedence might be an issue too: the precedence of unary + isn't as high as attribute access.  Anyway, as far as the meaning of + in K goes, I'm guessing K's arrays are in Fortran order, so the (0,1) axes vary the fastest.  I couldn't find any documentation for the K language in a quick search, though.

Both the R and K solutions are implementable in Python, with R using 3
characters and K using 1(!), compared to your two-character ".T"
notation.  I would suggest that when inventing something new, you
should consider prior art and explain how your invention is better.
That's why what other languages do matters.  (After all, isn't 'T'
chosen because "transpose" starts with "t" in the English language?)

Yes, you're right.  My main thought was just what I said above: there probably aren't too many other examples that can really apply in this case, both because most numerical computing languages are custom-designed for numerical computing, and because Python's attributes are kind of uncommon among programming languages.  So it's worth looking at other examples, but in the end it has to be something that makes sense for a numerical computing package written in Python, and there aren't too many examples of that around.

>  You could write a * b.transpose(1,0)
> right now and still not know whether it was matrix or element-wise
> multiplication.

Why would anyone do that if b was a matrix?

Maybe because, like you, they think that "a.T is fairly cryptic".

> But probably a better solution
> would be to have matrix versions of these in the library as an optional
> module to import so people could, say, import them as M and use M.ones(2,2).
>

This is the solution used by ma, which is another argument for it.

Yeh, I'm starting to think that's better than slapping an M attribute on arrays, too.  Is it hard to write a module like that?
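It shouldn't be hard at all.  A minimal sketch of what such a helper module could look like (the module name "matrixutils" and its contents are hypothetical, not an existing numpy module):

```python
# matrixutils.py -- matrix-returning versions of common array constructors.
import numpy as np

def ones(*shape):
    """Like numpy.ones, but returns a matrix."""
    return np.matrix(np.ones(shape))

def zeros(*shape):
    """Like numpy.zeros, but returns a matrix."""
    return np.matrix(np.zeros(shape))

def eye(n):
    """n-by-n identity matrix."""
    return np.matrix(np.eye(n))
```

With this saved as matrixutils.py, one could write `import matrixutils as M` and then `M.ones(2, 2)`, as suggested above.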

I only raised a mild objection against .T, but the slippery slope
argument makes me dislike it much more.  At the very least I would
like to see a discussion of why a.T is better than t(a) or +a.

* A.T puts the T on the proper side of A, so in that sense it looks more like the standard math notation.
* A.T has precedence that roughly matches the standard math notation.
* t(A) uses an impossibly short function name that's likely to conflict with local variable names.  To avoid the conflict, people will just end up using it as numpy.t(A), at which point its value as a shortcut for transpose is nullified.  Or they'll have to do a mini-import within specific functions ("from numpy import t") to localize the namespace pollution.  But at that point they might as well just say "t = numpy.transpose".
* t(A) puts the transpose operator on the wrong side of A.
* +A puts the transpose operator on the wrong side of A also.
* +A implies addition.  The general rule with operator overloading is that the overload should have the same general meaning as the original operator.  So overloading * for matrix multiplication makes sense; overloading & for matrix multiplication would be a bad idea.  New users looking at something like A + +B are pretty certain to be confused, because they think they know what + means, but they're wrong.  If you see A + B.T, you either know what it means or you know immediately that you don't know what it means, and you go look it up.
* +A has different precedence than the usual transpose operator.  (But I can't think of a case where that would make a difference now.)
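For what it's worth, the K-style unary '+' is easy to mock up in Python.  This is purely a sketch of that proposal, not anything numpy does (numpy's __pos__ is a no-op):

```python
import numpy as np

class KArray(np.ndarray):
    """Hypothetical array where unary '+' means swapaxes(0, 1), K-style."""
    def __pos__(self):
        # K reportedly swaps the first two axes for rank >= 2,
        # and does nothing for lower rank.
        return self.swapaxes(0, 1) if self.ndim >= 2 else self

A = np.arange(6).reshape(2, 3).view(KArray)
assert (+A).shape == (3, 2)          # '+' now transposes

v = np.arange(4).view(KArray)
assert (+v).shape == (4,)            # rank < 2: no-op
```

Note that unary + binds looser than attribute access and **, which is the precedence concern mentioned above.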


Tim Hochberg wrote:
> Well, you could overload __rpow__ for a singleton T and spell it A**T
> ... (I hope no one will take that proposal seriously).   Visually, A.T
> looks more like a subscript rather than superscript.
>
No, no no. Overload __rxor__, then you can spell it A^t, A^h, etc. Much
better ;-).  [Sadly, I almost like that....]

Ouch!  No way!  It's got even worse precedence problems than the +A proposal.  How about A+B^t?  And you still have to introduce 'h' and 't' into the global namespace for it to work.
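The precedence problem is easy to demonstrate with plain integers, since '^' is Python's XOR and binds looser than '+':

```python
# A + B^t would parse as (A + B)^t, not A + (B^t):
assert 1 + 2 ^ 3 == 0        # parsed as (1 + 2) ^ 3 == 3 ^ 3 == 0
assert 1 + (2 ^ 3) == 2      # what the notation would *look* like it means
```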

Here's a half-baked thought: if the objection to t(A) is that it doesn't
mirror the formulae where t appears as a subscript after A, conceivably
__call__ could be defined so that A(x) returns x(A).  That's kind of
perverse, but it means that A(t), A(h), etc. could all work
appropriately for suitably defined singletons.  These singletons could
either be assembled in some abbreviations namespace or brought in by
the programmer using "import transpose as t", etc.  The latter works for
doing t(a) as well, of course.

Same problem with the need for global t.  And it is kind of perverse, besides.


Robert Kern wrote:
Like Sasha, I'm mildly opposed to .T (as a synonym for .transpose()) and much
more opposed to the rest (including .T being a synonym for .swapaxes(-2, -1)).
It's not often that a proposal carries with it its own slippery-slope argument
against itself.

The slippery slope argument only applies to .M, not to .T or .H.  And I think if there's a matrixutils module with redefinitions of ones and zeros etc., and if other functions are all truly fixed to preserve matrix when matrix is passed in, then I agree, there's not so much need for .M.

I don't think that just because arrays are often used for linear algebra that
linear algebra assumptions should be built in to the core array type.

It's not just that "arrays can be used for linear algebra".  It's that linear algebra is the single most popular kind of numerical computing in the world!  It's the foundation for countless fields.  What you're saying is like "grocery stores shouldn't devote so much shelf space to food, because food is just one of the products people buy", or "this mailing list shouldn't be conducted in English, because English is just one of the languages people can speak here", or "I don't think my keyboard should devote so much space to the A-Z keys, because there are so many characters in the Unicode character set that could be there instead", or, to quote from a particular comedy troupe:

"Ah, how about Cheddar?"
"Well, we don't get much call for it around here, sir."
"Not much ca- It's the single most popular cheese in the world!"
"Not round here, sir."
 
Linear algebra is pretty much the 'cheddar' of the numerical computing world.  But it's more than that.  It's like the yeast of the beer world.  Pretty much everything starts with it as a base.  It makes sense to make it as convenient as possible to do with numpy, even if it is a "special case".  I wish I could think of some sort of statistics or google search I could cite to back this claim up, but as far as my academic background from high school through Ph.D. goes, linear algebra is a mighty big deal, not merely an "also ran" in the world of math or numerical computing.

Sasha Wrote:
In addition, transpose is a (rank-2) array or matrix operation and not
a linear algebra operation.  Transpose corresponds to the "adjoint"
linear algebra operation if you represent vectors as single column
matrices and co-vectors as single-row matrices.  This is a convenient
representation followed by much of the relevant literature, but it
does not allow generalization beyond rank-2.

I would be willing to accept a .T that just threw an exception if ndim were > 2.  That's what Matlab does with its transpose operator.  I don't like that behavior myself -- it seems wasteful when it could just have some well defined behavior that would let it be useful at least some of the time on N-d arrays.

I don't like it either, but I don't like .T even more.  These days I
hate functionality I cannot google for.  Call me selfish, but I
already know what unary '+' can do to a higher rank array, but with .T
I will always have to look up which axes it swaps ...

I think '.T' is more likely to be searchable than '+'.  And when you say you already know what unary + can do, you mean because you've used K?  That's not much use to the typical user, who also thinks they know what a unary + does, but would be wrong in this case.


So, in summary, I vote for:
- Keep the .T and the .H on array
- Get rid of .M
- Instead implement a matrix helper module that could be imported as M, allowing M.ones(...) etc.

And also:
- Be diligent about fixing any errors from matrix users along the lines of "numpy.foo returns an array when given a matrix" (Travis has been good about this -- but we need to keep it up.)  Part of the motivation for the .M attribute was just as a band-aid on the problem of matrices getting turned into arrays.  Having .M means you can just slap a .M on the end of any result you aren't sure about.  It's better (but harder) to fix the upstream problem of functions not preserving subtypes.
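The upstream fix usually comes down to using asanyarray (which preserves subclasses like matrix) instead of asarray (which drops them) inside library functions.  `foo` below is a hypothetical stand-in, not a real numpy function:

```python
import numpy as np

def foo(x):
    # asarray(x) here would silently return a plain ndarray;
    # asanyarray keeps the matrix subclass intact.
    x = np.asanyarray(x)
    return x * 2

m = np.matrix([[1, 2], [3, 4]])
assert type(foo(m)) is np.matrix                 # subtype preserved
assert type(np.asarray(m)) is np.ndarray         # the bug pattern being fixed
```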


_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
