To elaborate a little on a more general and explicit method of specifying linear operations (perhaps 'expressions with named axes' is a good name for this topic):
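As a rough sketch of what such an interface might look like as a thin layer over np.einsum: the class below is purely illustrative (the name NamedAxisArray, the restriction to single-character labels, and the all-string keys are simplifications, not a proposed API). Indexing with a string labels the axes, and multiplication dispatches to einsum in implicit mode, where repeated labels are summed over.

```python
import numpy as np

class NamedAxisArray:
    """Illustrative sketch only: a wrapper where indexing with a string
    labels the axes, and multiplying labeled arrays dispatches to
    np.einsum. Restricted to single-character labels for simplicity."""

    def __init__(self, data, labels=None):
        self.data = np.asarray(data)
        self.labels = labels  # e.g. 'ij': one character per axis

    def __getitem__(self, key):
        if isinstance(key, str):
            # A string key labels the axes instead of indexing: A['ij']
            if len(key) != self.data.ndim:
                raise ValueError("need one label per axis")
            return NamedAxisArray(self.data, key)
        # Non-string keys index as usual (labels dropped for simplicity)
        return NamedAxisArray(self.data[key])

    def __mul__(self, other):
        if self.labels is None or other.labels is None:
            raise ValueError("both operands must be labeled, e.g. A['ij']")
        # Implicit-mode einsum: repeated labels are contracted (summed),
        # labels appearing once survive into the output.
        result = np.einsum(f"{self.labels},{other.labels}",
                           self.data, other.data)
        return NamedAxisArray(result)
```

With these definitions, `NamedAxisArray(A)['ij'] * NamedAxisArray(x)['j']` computes the matrix-vector product via `np.einsum('ij,j', A, x)`, and `x['i'] * y['j']` yields an outer product. The mixed form A[0:3, 'i'] and the comma-split expansion discussed below would need a more elaborate `__getitem__` than this sketch provides.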
I think indexing rather than calling is preferable. I worried at first about the performance overhead of checking for strings on every indexing operation, but ndarray.__getitem__ is already quite a complex beast, and adding (yet another) type test on its arguments makes no significant difference. For those who disagree, we could also approach strings with a 'forgiveness is better than permission' attitude.

The general rules could be: if there are no string arguments, everything works as normal. If there are string arguments, we may think of the effect of __getitem__ as first indexing with the strings replaced by colons, and then creating a NamedAxisIndexExpression (NAIE) that associates the given string label with each corresponding axis. Thus we can write things like A[0:3, 'i'].

As some additional rules: string arguments can be 'expanded'; a string is split on commas if present, and otherwise split into characters, which then become the axis labels. In expressions, all non-labeled axes are treated in sequential order, similar to the ... construct, and follow standard numpy broadcasting semantics.

The only problem with [] notation is field-name lookup, though I have always felt that tables with named columns should be an ndarray subtype, given their fundamentally different indexing semantics.

Realizing the full potential of such an approach would be a complex undertaking, but to start with, a more elegant interface to np.einsum would be rather easy to implement.

On Tue, Mar 18, 2014 at 9:46 AM, Sebastian Haase <seb.ha...@gmail.com> wrote:

> Just to add one vote: I am for *right association*, because 1) I think of
> matrix multiplication more like operators, which I also learned to apply
> from right to left, and because 2) I would put a vector to the right,
> which would result in better performance.
>
> I don't have an opinion on tight/same/weak... (maybe that means 'same',
> because it's easier to remember!?)
> My two cents,
> Sebastian Haase
>
> On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn
> <hoogendoorn.ee...@gmail.com> wrote:
> >
> > Perhaps this is a bit of a thread hijack, but this discussion got me
> > thinking about how to arrive at a more vectorized/tensorified way of
> > specifying linear algebra operations in an elegant manner.
> >
> > I probably got a little carried away, but what about this syntax?
> >
> > - indexing/calling an ndarray with a string returns a TensorExpression
> >   object
> > - these TensorExpression objects can be combined into a graph using
> >   operator overloads
> > - and these graphs are translated to calls to BLAS or einsum, as is
> >   appropriate
> >
> > # declare some symbols
> > i, j, ij, k = 'i', 'j', 'ij', 'k'
> > # we may force evaluation of a (sub-)TensorExpression by calling it;
> > # this is trivial to translate to a call to einsum,
> > # but such special cases could be dispatched to BLAS as well
> > b = (A(ij) * x(j))(i)
> > # alternatively, we can predeclare a LHS which is automatically sized
> > # later; note that this translates into the same call as the above,
> > # just some syntactic sugar
> > b = np.empty(())
> > b[i] = A(ij) * x(j)
> > # more complex TensorExpression graphs of this form are trivial to
> > # translate to a call to einsum as well
> > a(i)*b(j)*c(k)
> > # conceptually, there is no need to limit this scheme to
> > # multiplications only!
> > # although such generalizations would require a more complex execution
> > # engine; however, the revamped nditer should make this quite
> > # manageable to implement
> > a(i)*b(j) + c(k)
> > # if axis strings are omitted, standard numpy broadcasting rules are
> > # applied to the expression graph created; this is identical to a*b+c,
> > # except that we have the opportunity to eliminate temporaries
> > a()*b() + c()
> >
> > Note that such an approach kills quite some birds with one stone:
> > it allows for the elimination of temporaries along the lines of numexpr.
> >
> > But if I could write:
> >
> > b[i] = A[ij] * x[j]
> >
> > I would much prefer that over
> >
> > b = A @ x
> >
> > even though the latter is shorter.
> >
> > Now if I had n input and output vectors, it would be clear what to do
> > with them:
> >
> > b[ni] = A[ij] * x[nj]
> >
> > As I argued earlier, I much prefer this form of explicitness over
> > conventions about what constitutes a row or column vector. And
> > vectorization of linear algebra is a trivial extension in this manner,
> > which in itself is just a subset of even more general multilinear
> > products, which themselves are a subset of more general expressions
> > involving things other than products.
> >
> > It's a somewhat ambitious idea, and there are probably reasons why it
> > isn't a good idea as well, but it does not require Python language
> > modifications, and it does not clash with any other functionality or
> > syntax of numpy, as far as I can tell. Calling of arrays is not yet
> > defined, and alternatively array indexing could be overloaded on the
> > string type.
> >
> > Either way, something to chew on when deciding on the best way to go
> > forward.
> >
> > On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust <daoust...@gmail.com>
> > wrote:
> >>
> >> On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith <n...@pobox.com>
> >> wrote:
> >>>
> >>> But, this is actually a feature!
> >>> Because obviously what *should* be returned in this case is *not*
> >>> (Mat @ vec) @ Mat, *or* Mat @ (vec @ Mat). Both of those answers are
> >>> terrible; it's just that, if you have an ordinary
> >>> left-/right-associative operator, those are your only options. What
> >>> *should* be returned is an error. And in this scheme we get to see
> >>> the whole @ expression at once, so we actually can raise an error for
> >>> such things.
> >>
> >> Sorry if this is a little off topic.
> >>
> >> But there's still something about the "vector" examples that bugs me:
> >> "matrix @ vector" and "vector @@ 2" keep popping up (this also applies
> >> to the matrix @ matrix examples, to a lesser extent).
> >>
> >> I'm a little uncomfortable looking at the shape to decide what's a
> >> matrix and what's a vector. (Matlab has some problems like this.)
> >>
> >> If it only has one or two dimensions it's easy, but I always find that
> >> if I've written code that works for one matrix or vector, five minutes
> >> later I want it to work for fields of matrices or vectors. If we're
> >> just going by shape, there's no way to distinguish between a 2d field
> >> of matrices and a 3d field of vectors.
> >>
> >> I guess this is a repeat of part of what Eelco Hoogendoorn was saying
> >> a few posts back.
> >>
> >> I was just wondering if anyone sees a place, to get @ a little closer
> >> to einsum, for some sort of array class that understands the
> >> difference between a 4D array of scalars, a 3D array of vectors, and a
> >> 2D array of matrices... the difference between the axes that broadcast
> >> and the axes that can sum when you hit them with an @ ... or something
> >> like that.
> >>
> >> Just a thought.
> >>
> >> Einsum is fantastic by the way, totally worth learning and using.
> >>
> >> Mark Daoust
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion