To elaborate a little on such a more general and explicit method of
specifying linear operations (perhaps 'expressions with named axes' is a
good nomer to cover this topic).

I think indexing rather than calling is preferable. I worried at first
about the performance overhead of checking for strings at every indexing
op, but get ndarray__getitem__ is already quite a complex beast anyway, and
adding (yet another) type test on its args isn't a significant difference.
For those who disagree; we could also approach strings with a 'forgiveness
is better then permission' attitude.

The general rules could be: if no string args, everything works as normal.
In case of string args, we may think of the effect of __getitem__ as
indexing with strings replaced by colons first, and then creating a
NamedAxisIndexExpression (NAIE), associating the given string label with
each corresponding axis. Thus, we can write things like A[0:3,'i']

As some additional rules; string arguments can be 'expanded', the string is
split on commas if present, and otherwise split into characters, which are
then the axis labels.

In expressions, all non-labeled axes are treated in sequential order,
similar to the ... construct, and have standard numpy broadcasting
semantics.

The only problem with [] notation is field name lookup; though I have
always felt that tables with named columns should be an ndarray subtype,
given their fundamentally different indexing semantics.

Realizing the full potential of such an approach would be a complex
undertaking, but to start with, a more elegant interface to np.einsum would
be rather easy to implement.


On Tue, Mar 18, 2014 at 9:46 AM, Sebastian Haase <seb.ha...@gmail.com>wrote:

> Just add one vote:  I am for
> * right association *
> because  1) I'm thinking of matrix multiplication more like operators,
> which I also learned to work from right to left and because 2) I would
> put a vector to the right, which would result in better performance.
>
> I don't have an opinion on tight/same/ or weak.... (maybe that means
> then 'same' because it's easier to remember !?)
>
> My two cents,
> Sebastian Haase
>
>
> On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn
> <hoogendoorn.ee...@gmail.com> wrote:
> >
> >
> > Perhaps this a bit of a thread hyjack; but this discussion got me
> thinking
> > about how to arrive
> >
> > at a more vectorized/tensorified way of specifying linear algebra
> > operations, in an elegant manner.
> >
> > I probably got a little carried away, but what about this syntax?
> >
> > indexing/calling an ndarray with a string returns a TensorExpression
> object
> > these TensorExpression objects can be combined into a graph using
> operator
> > overloads
> > and these graphs are translated to calls to BLAS or einsum, as is
> > appropriate
> >
> >
> > #declare some symbols
> > i,j,ij,k = 'i','j','ij','k'
> > #we may force evaluation of a (sub) TensorExpression by calling it
> > #this is trivial to translate to call to einsum
> > #but such special cases could be dispatched to BLAS as well
> > b = (A(ij) * x(j)) (i)
> > #alternatively, we can predeclare a LHS which is automatically sized
> later
> > #note that this translates into the same call as the above; just some
> > syntactic sugar
> > b = np.empty(())
> > b[i] = A(ij) * x(j)
> > #more complex TensorExpression graphs of this form are trivial to
> translate
> > to a call to einsum as well
> > a(i)*b(j)*c(k)
> > #conceptually, there is no need to limit this scheme to multiplications
> > only!
> > #although such generalizations would require a more complex execution
> engine
> > #however, the revamped nditer should make this quite managable to
> implement
> > a(i)*b(j) + c(k)
> > #if axes strings are omitted, standard numpy broadcasting rules are
> applied
> > to the expressiongraph created
> > #this is identical to a*b+c; except that we have the opportunity to
> > eliminate temporaries
> > a()*b()+c()
> >
> >
> > Note that such an approach kills quite some birds with one stone
> > it allows for the elimination of temporaries along the lines of numexpr
> >
> > But if i could write:
> >
> > b[i] = A[ij] * x[j]
> > I would much prefer that over
> > b = A @ x
> > even though the latter is shorter
> >
> > Now if i had n input and output vectors, it would be easy what to do with
> > them:
> >
> > b[ni] = A[ij] * x[nj]
> >
> > As i argued earlier, I much prefer this form of explicitness over
> > conventions about what constitutes a row or column vector. And
> vectorization
> > of linear algebra is a trivial extension in this manner, which in itself
> is
> > just a subset of even more general multilinear products, which themselves
> > are a subset of more general expression involving things other than
> products
> >
> > Its a somewhat ambitious idea, and there are probably reasons why it
> isnt a
> > good idea as well, but it does not require python language modifications,
> > and it does not clash with any other functionality or syntax of numpy, as
> > far as i can tell. Calling of arrays is not yet defined, and
> alternatively
> > array indexing could be overloaded on string type.
> >
> > Either way, something to chew on when deciding on the best way to go
> > forward.
> >
> >
> >
> >
> >
> > On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust <daoust...@gmail.com>
> wrote:
> >>
> >> On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith <n...@pobox.com> wrote:
> >>>
> >>>
> >>> But, this is actually a feature! Because obviously what *should* be
> >>> returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @
> >>> Mat). Both of those answers are terrible; it's just, if you have an
> >>> ordinary left-/right-associative operator, those are your only
> >>> options. What *should* be returned is an error. And in this scheme we
> >>> get to see the whole @ expression at once, so we actually can raise an
> >>> error for such things.
> >>
> >>
> >>
> >> Sorry if this is a little off topic.
> >>
> >> But there's still something about the "vector" examples that bugs me,
> >> "matrix@vector" and "vector@@2", keep popping up (this also applies to
> the
> >> matrix@matrix examples to a lesser extent).
> >>
> >> I'm a little unconformable looking at the shape to to decide what's a
> >> matrix and what's a vector. (Matlab has some problems like this)
> >>
> >> If it only has one or two dimensions it's easy, but I always find that
> if
> >> I've written code that works for 1 matrix or vector, 5 minutes later I
> want
> >> it to work for fields of matrices or vectors. If we're just going by
> shape
> >> there's no way to distinguish between a 2d field of matrices and a 3d
> field
> >> of vectors.
> >>
> >> I guess this is a repeat of part of what Eelco Hoogendoorn saying a few
> >> posts back
> >>
> >> I was just wondering if anyone sees a place, to get @ a little closer to
> >> Einsum, for some sort of array class that understands the difference
> between
> >> a 4D array of scalars, a 3D array of vectors, and a 2D array of
> matrices...
> >> The difference between the axes that broad-cast and the axes that can
> sum
> >> when you hit them with an @ ... or something like that.
> >>
> >> Just a thought.
> >>
> >> Einsum is fantastic by the way, totally worth learning and using.
> >>
> >>
> >>
> >>
> >> Mark Daoust
> >>
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to