Re: [Numpy-discussion] @ operator

2014-09-12 Thread Charles R Harris
On Thu, Sep 11, 2014 at 11:09 PM, Nathaniel Smith  wrote:

> On Thu, Sep 11, 2014 at 11:18 PM, Charles R Harris
>  wrote:
> >
> > On Thu, Sep 11, 2014 at 8:49 PM, Nathaniel Smith  wrote:
> >>
> >> On Thu, Sep 11, 2014 at 10:12 PM, Charles R Harris
> >>  wrote:
> >> >
> >> > On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith 
> wrote:
> >> >>
> >> >> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
> >> >>  wrote:
> >> >> >
> >> >> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith 
> >> >> > wrote:
> >> >> >>
> >> >> >> My vote is:
> >> >> >>
> >> >> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all
> >> >> >> __op__
> >> >> >> methods do (so I guess check __array_priority__ or whatever it is
> we
> >> >> >> always do). I'd also be okay with ignoring __array_priority__ on
> the
> >> >> >> grounds that __numpy_ufunc__ is better, and there's no existing
> code
> >> >> >> relying on __array_priority__ support in __matmul__.
> >> >> >>
> >> >> >> Having decided that we are actually going to run, they dispatch
> >> >> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
> >> >> >> in-place version), similarly to how e.g. __add__ dispatches to
> >> >> >> np.add.
> >> >> >>
> >> >> >> newdot acts like a standard gufunc with all the standard niceties,
> >> >> >> including __numpy_ufunc__ dispatch.
> >> >> >>
> >> >> >> ("newdot" here is intended as a placeholder name, maybe it should
> be
> >> >> >> np.linalg.matmul or something else to be bikeshed later. I also
> vote
> >> >> >> that eventually 'dot' become an alias for this function, but
> whether
> >> >> >> to do that is an orthogonal discussion for later.)
> >> >> >>
> >> >> > If we went the ufunc route, I think we would want three of them,
> >> >> > matxvec,
> >> >> > vecxmat, and matxmat, because the best inner loops would be
> different
> >> >> > in
> >> >> > the
> >> >> > three cases,
> >> >>
> >> >> Couldn't we write a single inner loop like:
> >> >>
> >> >> void ufunc_loop(blah blah) {
> >> >> if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
> >> >> call DOT
> >> >> } else if (arg2_shape[0] == 1) {
> >> >> call GEMV
> >> >> } else if (...) {
> >> >> ...
> >> >> } else {
> >> >> call GEMM
> >> >> }
> >> >> }
> >> >> ?
> >> >
> >> > Not for generalized ufuncs, different signatures, or if linearized,
> more
> >> > info on dimensions.
> >>
> >> This sentence no verb, but I think the point you might be raising is:
> >> we don't actually have the technical capability to define a single
> >> gufunc for @, because the matmat, matvec, vecmat, and vecvec forms
> >> have different gufunc signatures ("mn,nk->mk", "mn,n->m", "n,nk->k",
> >> and "n,n->" respectively, I think)?
> >>
> >> This is true, but Jaime has said he's willing to look at fixing this
> :-):
> >>
> >>
> http://thread.gmane.org/gmane.comp.python.numeric.general/58669/focus=58670
> >>
> >
> > Don't see the need here, the loops are not complicated.
> >
> >>
> >> ...and fundamentally, it's very difficult to solve this anywhere else
> >> except the ufunc internals. When we first enter a function like
> >> newdot, we need to check for special overloads like __numpy_ufunc__
> >> *before* we do any array_like->ndarray coercion. But we have to do
> >> array_like->ndarray coercion before we know what the shape/ndim of
> >> our inputs is.
> >
> >
> > True.
> >
> >>
> >> And in the wrapper-around-multiple-ufuncs approach, we
> >> have to check the shape/ndim before we choose which ufunc to dispatch
> >> to.
> >
> >
> > True. I don't see a problem there and the four ufuncs would be useful in
> > themselves. I think they should be part of the multiarray module methods
> if
> > we go that way.
>
> Not sure what you mean about multiarray module methods -- it's kinda
> tricky to expose ufuncs from multiarray.so isn't it, b/c of the split
> between multiarray.so and umath.so?
>

Using the umath C-API is no worse than using the other C-APIs. Using
ufuncs from umath is trickier; IIRC multiarray initializes the ops that use
them on load, but I'd define the new functions in multiarray, not umath.
Generating the generalized functions can take place either during module
load, or by using the static variable pattern on first function call, i.e.,

static ufunc myfunc = NULL;
if (myfunc == NULL) {
    myfunc = make_ufunc(...);
}

I tend towards the latter as I want to use the functions in the multiarray
c-api. The umath module uses the former.
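In Python terms, the static-variable pattern is just a build-once cache checked on first call. A minimal sketch (`make_ufunc` is a placeholder name taken from the snippet above, not a real API):

```python
# Build the (expensive) ufunc object once, on first use -- the Python
# analogue of the C static-variable pattern sketched above.
_myfunc = None

def make_ufunc():
    return object()              # placeholder for the real construction work

def get_myfunc():
    global _myfunc
    if _myfunc is None:
        _myfunc = make_ufunc()
    return _myfunc

assert get_myfunc() is get_myfunc()   # constructed once, then reused
```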

The vector dot functions are already available in the descr and can be
passed to a generic loop (at some overhead), but what I would like to do at
some point is bring all the loops together in their own file in multiarray.

>
> >> So __numpy_ufunc__ won't get checked until after we've already
> >> coerced to ndarray, oops... OTOH the ufunc internals already know how
> >> to do this dance, so it's at least straightforward (if not necessarily
> >> trivial) to insert the correct logic once in the correct place.

Re: [Numpy-discussion] @ operator

2014-09-11 Thread Nathaniel Smith
On Thu, Sep 11, 2014 at 11:18 PM, Charles R Harris
 wrote:
>
> On Thu, Sep 11, 2014 at 8:49 PM, Nathaniel Smith  wrote:
>>
>> On Thu, Sep 11, 2014 at 10:12 PM, Charles R Harris
>>  wrote:
>> >
>> > On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith  wrote:
>> >>
>> >> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
>> >>  wrote:
>> >> >
>> >> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith 
>> >> > wrote:
>> >> >>
>> >> >> My vote is:
>> >> >>
>> >> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all
>> >> >> __op__
>> >> >> methods do (so I guess check __array_priority__ or whatever it is we
>> >> >> always do). I'd also be okay with ignoring __array_priority__ on the
>> >> >> grounds that __numpy_ufunc__ is better, and there's no existing code
>> >> >> relying on __array_priority__ support in __matmul__.
>> >> >>
>> >> >> Having decided that we are actually going to run, they dispatch
>> >> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>> >> >> in-place version), similarly to how e.g. __add__ dispatches to
>> >> >> np.add.
>> >> >>
>> >> >> newdot acts like a standard gufunc with all the standard niceties,
>> >> >> including __numpy_ufunc__ dispatch.
>> >> >>
>> >> >> ("newdot" here is intended as a placeholder name, maybe it should be
>> >> >> np.linalg.matmul or something else to be bikeshed later. I also vote
>> >> >> that eventually 'dot' become an alias for this function, but whether
>> >> >> to do that is an orthogonal discussion for later.)
>> >> >>
>> >> > If we went the ufunc route, I think we would want three of them,
>> >> > matxvec,
>> >> > vecxmat, and matxmat, because the best inner loops would be different
>> >> > in
>> >> > the
>> >> > three cases,
>> >>
>> >> Couldn't we write a single inner loop like:
>> >>
>> >> void ufunc_loop(blah blah) {
>> >> if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
>> >> call DOT
>> >> } else if (arg2_shape[0] == 1) {
>> >> call GEMV
>> >> } else if (...) {
>> >> ...
>> >> } else {
>> >> call GEMM
>> >> }
>> >> }
>> >> ?
>> >
>> > Not for generalized ufuncs, different signatures, or if linearized, more
>> > info on dimensions.
>>
>> This sentence no verb, but I think the point you might be raising is:
>> we don't actually have the technical capability to define a single
>> gufunc for @, because the matmat, matvec, vecmat, and vecvec forms
>> have different gufunc signatures ("mn,nk->mk", "mn,n->m", "n,nk->k",
>> and "n,n->" respectively, I think)?
>>
>> This is true, but Jaime has said he's willing to look at fixing this :-):
>>
>> http://thread.gmane.org/gmane.comp.python.numeric.general/58669/focus=58670
>>
>
> Don't see the need here, the loops are not complicated.
>
>>
>> ...and fundamentally, it's very difficult to solve this anywhere else
>> except the ufunc internals. When we first enter a function like
>> newdot, we need to check for special overloads like __numpy_ufunc__
>> *before* we do any array_like->ndarray coercion. But we have to do
>> array_like->ndarray coercion before we know what the shape/ndim of
>> our inputs is.
>
>
> True.
>
>>
>> And in the wrapper-around-multiple-ufuncs approach, we
>> have to check the shape/ndim before we choose which ufunc to dispatch
>> to.
>
>
> True. I don't see a problem there and the four ufuncs would be useful in
> themselves. I think they should be part of the multiarray module methods if
> we go that way.

Not sure what you mean about multiarray module methods -- it's kinda
tricky to expose ufuncs from multiarray.so isn't it, b/c of the split
between multiarray.so and umath.so?

>> So __numpy_ufunc__ won't get checked until after we've already
>> coerced to ndarray, oops... OTOH the ufunc internals already know how
>> to do this dance, so it's at least straightforward (if not necessarily
>> trivial) to insert the correct logic once in the correct place.
>
>
I'm thinking that the four ufuncs being overridden by user subtypes should
> be sufficient.

Maybe I don't understand what you're proposing. Suppose we get handed
a random 3rd party type that has a __numpy_ufunc__ attribute, but we
know nothing else about it. What do we do? Pick one of the 4 ufuncs at
random and call it?

>>
>>
>> > What you show is essentially what dot does now for cblas
>> > enabled functions. But note, we need more than the simple '@', we also
>> > need
>> > stacks of vectors, and turning vectors into matrices, and then back into
>> > vectors seems unnecessarily complicated.
>>
>> By the time we get into the data/shape/strides world that ufuncs work
>> with, converting vectors->matrices is literally just adding a single
>> entry to the shape+strides arrays. This feels trivial and cheap to me?
>
>
> And moving things around, then removing. It's an option if we just want to
> use matrix multiplication for everything. I don't think there is any speed
> advantage one way or the other, although there are currently size
> limitations that are easier to block in the 1D case than the matrix case.

Re: [Numpy-discussion] @ operator

2014-09-11 Thread Charles R Harris
On Thu, Sep 11, 2014 at 8:49 PM, Nathaniel Smith  wrote:

> On Thu, Sep 11, 2014 at 10:12 PM, Charles R Harris
>  wrote:
> >
> > On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith  wrote:
> >>
> >> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
> >>  wrote:
> >> >
> >> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith 
> wrote:
> >> >>
> >> >> My vote is:
> >> >>
> >> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
> >> >> methods do (so I guess check __array_priority__ or whatever it is we
> >> >> always do). I'd also be okay with ignoring __array_priority__ on the
> >> >> grounds that __numpy_ufunc__ is better, and there's no existing code
> >> >> relying on __array_priority__ support in __matmul__.
> >> >>
> >> >> Having decided that we are actually going to run, they dispatch
> >> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
> >> >> in-place version), similarly to how e.g. __add__ dispatches to
> np.add.
> >> >>
> >> >> newdot acts like a standard gufunc with all the standard niceties,
> >> >> including __numpy_ufunc__ dispatch.
> >> >>
> >> >> ("newdot" here is intended as a placeholder name, maybe it should be
> >> >> np.linalg.matmul or something else to be bikeshed later. I also vote
> >> >> that eventually 'dot' become an alias for this function, but whether
> >> >> to do that is an orthogonal discussion for later.)
> >> >>
> >> > If we went the ufunc route, I think we would want three of them,
> >> > matxvec,
> >> > vecxmat, and matxmat, because the best inner loops would be different
> in
> >> > the
> >> > three cases,
> >>
> >> Couldn't we write a single inner loop like:
> >>
> >> void ufunc_loop(blah blah) {
> >> if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
> >> call DOT
> >> } else if (arg2_shape[0] == 1) {
> >> call GEMV
> >> } else if (...) {
> >> ...
> >> } else {
> >> call GEMM
> >> }
> >> }
> >> ?
> >
> > Not for generalized ufuncs, different signatures, or if linearized, more
> > info on dimensions.
>
> This sentence no verb, but I think the point you might be raising is:
> we don't actually have the technical capability to define a single
> gufunc for @, because the matmat, matvec, vecmat, and vecvec forms
> have different gufunc signatures ("mn,nk->mk", "mn,n->m", "n,nk->k",
> and "n,n->" respectively, I think)?
>
> This is true, but Jaime has said he's willing to look at fixing this :-):
>
> http://thread.gmane.org/gmane.comp.python.numeric.general/58669/focus=58670
>
>
Don't see the need here, the loops are not complicated.


> ...and fundamentally, it's very difficult to solve this anywhere else
> except the ufunc internals. When we first enter a function like
> newdot, we need to check for special overloads like __numpy_ufunc__
> *before* we do any array_like->ndarray coercion. But we have to do
> array_like->ndarray coercion before we know what the shape/ndim of
> our inputs is.


True.


> And in the wrapper-around-multiple-ufuncs approach, we
> have to check the shape/ndim before we choose which ufunc to dispatch
> to.


True. I don't see a problem there and the four ufuncs would be useful in
themselves. I think they should be part of the multiarray module methods if
we go that way.

So __numpy_ufunc__ won't get checked until after we've already
> coerced to ndarray, oops... OTOH the ufunc internals already know how
> to do this dance, so it's at least straightforward (if not necessarily
> trivial) to insert the correct logic once in the correct place.
>

I'm thinking that the four ufuncs being overridden by user subtypes should
be sufficient.


>
> > What you show is essentially what dot does now for cblas
> > enabled functions. But note, we need more than the simple '@', we also
> need
> > stacks of vectors, and turning vectors into matrices, and then back into
> > vectors seems unnecessarily complicated.
>
> By the time we get into the data/shape/strides world that ufuncs work
> with, converting vectors->matrices is literally just adding a single
> entry to the shape+strides arrays. This feels trivial and cheap to me?
>

And moving things around, then removing. It's an option if we just want to
use matrix multiplication for everything. I don't think there is any speed
advantage one way or the other, although there are currently size
limitations that are easier to block in the 1D case than the matrix case.
In practice 2**31 per dimension for floating point is probably plenty.


>
> Or do you just mean that we do actually want broadcasting matvec,
> vecmat, vecvec gufuncs? I agree with this too but it seems orthogonal
> to me -- we can have those in any case, even if newdot doesn't
> literally call them.
>

Yes. But if we have them, we might as well use them.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] @ operator

2014-09-11 Thread Nathaniel Smith
On Thu, Sep 11, 2014 at 10:12 PM, Charles R Harris
 wrote:
>
> On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith  wrote:
>>
>> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
>>  wrote:
>> >
>> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:
>> >>
>> >> My vote is:
>> >>
>> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
>> >> methods do (so I guess check __array_priority__ or whatever it is we
>> >> always do). I'd also be okay with ignoring __array_priority__ on the
>> >> grounds that __numpy_ufunc__ is better, and there's no existing code
>> >> relying on __array_priority__ support in __matmul__.
>> >>
>> >> Having decided that we are actually going to run, they dispatch
>> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>> >> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>> >>
>> >> newdot acts like a standard gufunc with all the standard niceties,
>> >> including __numpy_ufunc__ dispatch.
>> >>
>> >> ("newdot" here is intended as a placeholder name, maybe it should be
>> >> np.linalg.matmul or something else to be bikeshed later. I also vote
>> >> that eventually 'dot' become an alias for this function, but whether
>> >> to do that is an orthogonal discussion for later.)
>> >>
>> > If we went the ufunc route, I think we would want three of them,
>> > matxvec,
>> > vecxmat, and matxmat, because the best inner loops would be different in
>> > the
>> > three cases,
>>
>> Couldn't we write a single inner loop like:
>>
>> void ufunc_loop(blah blah) {
>> if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
>> call DOT
>> } else if (arg2_shape[0] == 1) {
>> call GEMV
>> } else if (...) {
>> ...
>> } else {
>> call GEMM
>> }
>> }
>> ?
>
> Not for generalized ufuncs, different signatures, or if linearized, more
> info on dimensions.

This sentence no verb, but I think the point you might be raising is:
we don't actually have the technical capability to define a single
gufunc for @, because the matmat, matvec, vecmat, and vecvec forms
have different gufunc signatures ("mn,nk->mk", "mn,n->m", "n,nk->k",
and "n,n->" respectively, I think)?

This is true, but Jaime has said he's willing to look at fixing this :-):
http://thread.gmane.org/gmane.comp.python.numeric.general/58669/focus=58670
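For concreteness, the four signatures match the shape behavior that NumPy's matmul eventually shipped with; this sketch just demonstrates the four cases using `np.matmul` from a modern NumPy, which did not yet exist at the time of this discussion:

```python
import numpy as np

A = np.ones((3, 4))   # matrix
B = np.ones((4, 5))   # matrix
v = np.ones(4)        # vector

# matmat: "mn,nk->mk"
assert np.matmul(A, B).shape == (3, 5)
# matvec: "mn,n->m" -- trailing 1-d operand treated as a column vector
assert np.matmul(A, v).shape == (3,)
# vecmat: "n,nk->k" -- leading 1-d operand treated as a row vector
assert np.matmul(v, B).shape == (5,)
# vecvec: "n,n->" -- inner product, scalar result
assert np.matmul(v, v).shape == ()
```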

...and fundamentally, it's very difficult to solve this anywhere else
except the ufunc internals. When we first enter a function like
newdot, we need to check for special overloads like __numpy_ufunc__
*before* we do any array_like->ndarray coercion. But we have to do
array_like->ndarray coercion before we know what the shape/ndim of
our inputs is. And in the wrapper-around-multiple-ufuncs approach, we
have to check the shape/ndim before we choose which ufunc to dispatch
to. So __numpy_ufunc__ won't get checked until after we've already
coerced to ndarray, oops... OTOH the ufunc internals already know how
to do this dance, so it's at least straightforward (if not necessarily
trivial) to insert the correct logic once in the correct place.
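A minimal Python sketch of this dispatch-order problem; the names are illustrative, with `naive_newdot` standing in for the wrapper-around-multiple-ufuncs approach:

```python
import numpy as np

class Masked:
    """Toy third-party array type that would like to intercept matmul
    (via __numpy_ufunc__, which later shipped as __array_ufunc__)."""
    def __init__(self, data):
        self.data = np.asarray(data)
    def __array__(self, dtype=None, copy=None):
        # Makes np.asarray() succeed -- and silently drop the wrapper.
        return self.data

def naive_newdot(a, b):
    # Wrapper approach: coerce first to learn shape/ndim, then pick a
    # kernel.  By the time we could check for an override, it's gone.
    a, b = np.asarray(a), np.asarray(b)
    return np.dot(a, b)

out = naive_newdot(Masked([[1.0, 0.0], [0.0, 1.0]]), np.eye(2))
# The third-party type never got a chance to intercept:
assert type(out) is np.ndarray
```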

> What you show is essentially what dot does now for cblas
> enabled functions. But note, we need more than the simple '@', we also need
> stacks of vectors, and turning vectors into matrices, and then back into
> vectors seems unnecessarily complicated.

By the time we get into the data/shape/strides world that ufuncs work
with, converting vectors->matrices is literally just adding a single
entry to the shape+strides arrays. This feels trivial and cheap to me?
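That is, a matvec (or vecmat) call can be funneled through a matmat kernel by inserting a length-1 axis, which costs only a view, and dropping it from the result. A sketch:

```python
import numpy as np

def matvec_via_matmat(A, v):
    """Serve the matvec case with only a matmat kernel by adding one
    entry to shape/strides (a view, no copy) and removing it after."""
    col = v[:, np.newaxis]          # (n,) -> (n, 1), no data copied
    out = np.dot(A, col)            # matmat kernel: (m, n) x (n, 1) -> (m, 1)
    return out[:, 0]                # drop the inserted axis

A = np.arange(6.0).reshape(2, 3)
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(matvec_via_matmat(A, v), np.dot(A, v))
```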

Or do you just mean that we do actually want broadcasting matvec,
vecmat, vecvec gufuncs? I agree with this too but it seems orthogonal
to me -- we can have those in any case, even if newdot doesn't
literally call them.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] @ operator

2014-09-11 Thread Charles R Harris
On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith  wrote:

> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
>  wrote:
> >
> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:
> >>
> >> My vote is:
> >>
> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
> >> methods do (so I guess check __array_priority__ or whatever it is we
> >> always do). I'd also be okay with ignoring __array_priority__ on the
> >> grounds that __numpy_ufunc__ is better, and there's no existing code
> >> relying on __array_priority__ support in __matmul__.
> >>
> >> Having decided that we are actually going to run, they dispatch
> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
> >> in-place version), similarly to how e.g. __add__ dispatches to np.add.
> >>
> >> newdot acts like a standard gufunc with all the standard niceties,
> >> including __numpy_ufunc__ dispatch.
> >>
> >> ("newdot" here is intended as a placeholder name, maybe it should be
> >> np.linalg.matmul or something else to be bikeshed later. I also vote
> >> that eventually 'dot' become an alias for this function, but whether
> >> to do that is an orthogonal discussion for later.)
> >>
> > If we went the ufunc route, I think we would want three of them, matxvec,
> > vecxmat, and matxmat, because the best inner loops would be different in
> the
> > three cases,
>
> Couldn't we write a single inner loop like:
>
> void ufunc_loop(blah blah) {
> if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
> call DOT
> } else if (arg2_shape[0] == 1) {
> call GEMV
> } else if (...) {
> ...
> } else {
> call GEMM
> }
> }
> ?
>
>
Not for generalized ufuncs, different signatures, or if linearized, more
info on dimensions. What you show is essentially what dot does now for
cblas enabled functions. But note, we need more than the simple '@', we
also need stacks of vectors, and turning vectors into matrices, and then
back into vectors seems unnecessarily complicated.


> I can't see any reason that this would be measurably slower than
> having multiple ufunc loops: the checks are extremely cheap, will
> usually only be done once (because most @ calls will only enter the
> inner loop once), and if they are done repeatedly in the same call
> then they'll come out the same way every time and thus be predictable
> branches.
>

It's not the loops that are expensive.


>
> > but they couldn't be straight ufuncs themselves, as we don't
> > need the other options, `reduce`, etc.,
>
> Not sure what you mean -- ATM gufuncs don't support reduce anyway, but
> if someone felt like implementing it then it would be cool to get
> dot.reduce for free -- it is a useful/meaningful operation. Is there
> some reason that supporting more ufunc features is bad?
>
> > but they can't be exactly like the
> > linalg machinery, because we do want subclasses to be able to override.
>
> Subclasses can override ufuncs using __numpy_ufunc__.
>

Yep.


>
> > Hmm...
> >
> > The ufunc machinery has some funky aspects. For instance, there are
> > hardwired checks for `__radd__` and other such operators in
> > PyUFunc_GenericFunction that allows subclasses to override the ufunc.
> Those
> > options should really be part of the PyUFuncObject.
>
> Are you mentioning this b/c it's an annoying thing that talking about
> ufuncs reminded you of, or is there a specific impact on __matmul__
> that you see?
>

Just because it seems out of place. __numpy_ufunc__ is definitely a better
way to do this.

Chuck


Re: [Numpy-discussion] @ operator

2014-09-11 Thread Charles R Harris
On Thu, Sep 11, 2014 at 7:47 PM, Charles R Harris  wrote:

>
>
> On Thu, Sep 11, 2014 at 10:10 AM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:
>>
>>> On Wed, Sep 10, 2014 at 4:53 PM, Charles R Harris
>>>  wrote:
>>> >
>>> > On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:
>>> >>
>>> >> 09.09.2014, 22:52, Charles R Harris kirjoitti:
>>> >> >   1. Should the operator accept array_like for one of the arguments?
>>> >> >   2. Does it need to handle __numpy_ufunc__, or will
>>> >> >   __array_priority__ serve?
>>> >>
>>> >> I think the __matmul__ operator implementation should follow that of
>>> >> __mul__.
>>> >>
>>> >> [clip]
>>> >> >3. Do we want PyArray_Matmul in the numpy API?
>>> >> >4. Should a matmul function be supplied by the multiarray module?
>>> >> >
>>> >> > If 3 and 4 are wanted, should they use the __numpy_ufunc__
>>> machinery, or
>>> >> > will __array_priority__ serve?
>>> >>
>>> >> dot() function deals with __numpy_ufunc__, and the matmul() function
>>> >> should behave similarly.
>>> >>
>>> >> It seems dot() uses __array_priority__ for selection of output return
>>> >> subclass, so matmul() probably needs to do the same thing.
>>> >>
>>> >> > Note that the type number operators, __add__ and such, currently use
>>> >> > __numpy_ufunc__ in combination with __array_priority__, this in
>>> addition
>>> >> > to
>>> >> > the fact that they are by default using ufuncs that do the same. I'd
>>> >> > rather
>>> >> > that the __*__ operators simply rely on __array_priority__.
>>> >>
>>> >> The whole business of __array_priority__ and __numpy_ufunc__ in the
>>> >> binary ops is solely about when __op__ should yield the execution to
>>> >> __rop__ of the other object.
>>> >>
>>> >> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
>>> >>
>>> >> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__
>>> before
>>> >> __numpy_ufunc__, except if array_priority happens to be smaller than
>>> >> that of the other class and your class is not an ndarray subclass".
>>> >>
>>> >> The following binops also do not IIRC respect __array_priority__ in
>>> >> preferring right-hand operand:
>>> >>
>>> >> - in-place operations
>>> >> - comparisons
>>> >>
>>> >> One question here is whether it's possible to change the behavior of
>>> >> __array_priority__ here at all, or whether changes are possible only
>>> in
>>> >> the context of adding new attributes telling Numpy what to do.
>>> >
>>> > I was tempted to make it a generalized ufunc, which would take care of
>>> a lot
>>> > of things, but there is a lot of overhead in those functions. Sounds
>>> like
>>> > the easiest thing is to make it similar to dot, although having an
>>> inplace
>>> > version complicates the type selection a bit.
>>>
>>> Can we please fix the overhead instead of adding more half-complete
>>> implementations of the same concepts? I feel like this usually ends up
>>> slowing things down in the end, as optimization efforts get divided...
>>>
>>> My vote is:
>>>
>>> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
>>> methods do (so I guess check __array_priority__ or whatever it is we
>>> always do). I'd also be okay with ignoring __array_priority__ on the
>>> grounds that __numpy_ufunc__ is better, and there's no existing code
>>> relying on __array_priority__ support in __matmul__.
>>>
>>> Having decided that we are actually going to run, they dispatch
>>> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>>> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>>>
>>> newdot acts like a standard gufunc with all the standard niceties,
>>> including __numpy_ufunc__ dispatch.
>>>
>>> ("newdot" here is intended as a placeholder name, maybe it should be
>>> np.linalg.matmul or something else to be bikeshed later. I also vote
>>> that eventually 'dot' become an alias for this function, but whether
>>> to do that is an orthogonal discussion for later.)
>>>
>>> If we went the ufunc route, I think we would want three of them,
>> matxvec, vecxmat, and matxmat, because the best inner loops would be
>> different in the three cases, but they couldn't be straight ufuncs
>> themselves, as we don't need the other options, `reduce`, etc., but they
>> can't be exactly like the linalg machinery, because we do want subclasses
>> to be able to override. Hmm...
>>
>> The ufunc machinery has some funky aspects. For instance, there are
>> hardwired checks for `__radd__` and other such operators in
>> PyUFunc_GenericFunction that allows subclasses to override the ufunc. Those
>> options should really be part of the PyUFuncObject.
>>
>
> Here are some timing results for the inner product of an integer array
> comparing inner, ufunc (inner1d), and einsum implementations. The np.long
> type was chosen because it is implemented for inner1d and to avoid the
> effect of using cblas.
>
> In [43]: a = one

Re: [Numpy-discussion] @ operator

2014-09-11 Thread Nathaniel Smith
On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
 wrote:
>
> On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:
>>
>> My vote is:
>>
>> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
>> methods do (so I guess check __array_priority__ or whatever it is we
>> always do). I'd also be okay with ignoring __array_priority__ on the
>> grounds that __numpy_ufunc__ is better, and there's no existing code
>> relying on __array_priority__ support in __matmul__.
>>
>> Having decided that we are actually going to run, they dispatch
>> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>>
>> newdot acts like a standard gufunc with all the standard niceties,
>> including __numpy_ufunc__ dispatch.
>>
>> ("newdot" here is intended as a placeholder name, maybe it should be
>> np.linalg.matmul or something else to be bikeshed later. I also vote
>> that eventually 'dot' become an alias for this function, but whether
>> to do that is an orthogonal discussion for later.)
>>
> If we went the ufunc route, I think we would want three of them, matxvec,
> vecxmat, and matxmat, because the best inner loops would be different in the
> three cases,

Couldn't we write a single inner loop like:

void ufunc_loop(blah blah) {
    if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
        call DOT
    } else if (arg2_shape[0] == 1) {
        call GEMV
    } else if (...) {
        ...
    } else {
        call GEMM
    }
}
?

I can't see any reason that this would be measurably slower than
having multiple ufunc loops: the checks are extremely cheap, will
usually only be done once (because most @ calls will only enter the
inner loop once), and if they are done repeatedly in the same call
then they'll come out the same way every time and thus be predictable
branches.

> but they couldn't be straight ufuncs themselves, as we don't
> need the other options, `reduce`, etc.,

Not sure what you mean -- ATM gufuncs don't support reduce anyway, but
if someone felt like implementing it then it would be cool to get
dot.reduce for free -- it is a useful/meaningful operation. Is there
some reason that supporting more ufunc features is bad?
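For what a hypothetical `dot.reduce` would mean: the chained product over a stack of matrices, which can be emulated today with `functools.reduce` (or, in later NumPy, `np.linalg.multi_dot`):

```python
from functools import reduce
import numpy as np

# Three copies of the shear matrix [[1, 1], [0, 1]]; its n-th power is
# [[1, n], [0, 1]], so reducing with dot chains the matrix products.
stack = np.array([[[1.0, 1.0], [0.0, 1.0]]] * 3)
chained = reduce(np.dot, stack)
assert np.allclose(chained, [[1.0, 3.0], [0.0, 1.0]])
```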

> but they can't be exactly like the
> linalg machinery, because we do want subclasses to be able to override.

Subclasses can override ufuncs using __numpy_ufunc__.
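As a sketch of that override mechanism, written against `__array_ufunc__`, the name under which the `__numpy_ufunc__` protocol discussed here eventually shipped (NumPy 1.13):

```python
import numpy as np

class Tagged(np.ndarray):
    """Subclass that intercepts every ufunc call and re-tags the result."""
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        plain = [np.asarray(x) for x in inputs]       # strip the subclass
        result = getattr(ufunc, method)(*plain, **kwargs)
        if isinstance(result, np.ndarray):
            return result.view(Tagged)                # re-apply the subclass
        return result

a = np.arange(4.0).view(Tagged)
out = np.add(a, 1.0)
assert isinstance(out, Tagged)                        # the override ran
assert np.allclose(np.asarray(out), [1.0, 2.0, 3.0, 4.0])
```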

> Hmm...
>
> The ufunc machinery has some funky aspects. For instance, there are
> hardwired checks for `__radd__` and other such operators in
> PyUFunc_GenericFunction that allows subclasses to override the ufunc. Those
> options should really be part of the PyUFuncObject.

Are you mentioning this b/c it's an annoying thing that talking about
ufuncs reminded you of, or is there a specific impact on __matmul__
that you see?

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] @ operator

2014-09-11 Thread Charles R Harris
On Thu, Sep 11, 2014 at 10:10 AM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:
>
>> On Wed, Sep 10, 2014 at 4:53 PM, Charles R Harris
>>  wrote:
>> >
>> > On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:
>> >>
>> >> 09.09.2014, 22:52, Charles R Harris kirjoitti:
>> >> >   1. Should the operator accept array_like for one of the arguments?
>> >> >   2. Does it need to handle __numpy_ufunc__, or will
>> >> >   __array_priority__ serve?
>> >>
>> >> I think the __matmul__ operator implementation should follow that of
>> >> __mul__.
>> >>
>> >> [clip]
>> >> >3. Do we want PyArray_Matmul in the numpy API?
>> >> >4. Should a matmul function be supplied by the multiarray module?
>> >> >
>> >> > If 3 and 4 are wanted, should they use the __numpy_ufunc__
>> machinery, or
>> >> > will __array_priority__ serve?
>> >>
>> >> dot() function deals with __numpy_ufunc__, and the matmul() function
>> >> should behave similarly.
>> >>
>> >> It seems dot() uses __array_priority__ for selection of output return
>> >> subclass, so matmul() probably needs to do the same thing.
>> >>
>> >> > Note that the type number operators, __add__ and such, currently use
>> >> > __numpy_ufunc__ in combination with __array_priority__, this in addition
>> >> > to the fact that they are by default using ufuncs that do the same. I'd
>> >> > rather that the __*__ operators simply rely on __array_priority__.
>> >>
>> >> The whole business of __array_priority__ and __numpy_ufunc__ in the
>> >> binary ops is solely about when __op__ should yield the execution to
>> >> __rop__ of the other object.
>> >>
>> >> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
>> >>
>> >> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
>> >> __numpy_ufunc__, except if array_priority happens to be smaller than
>> >> that of the other class and your class is not an ndarray subclass".
>> >>
>> >> The following binops also do not IIRC respect __array_priority__ in
>> >> preferring right-hand operand:
>> >>
>> >> - in-place operations
>> >> - comparisons
>> >>
>> >> One question here is whether it's possible to change the behavior of
>> >> __array_priority__ here at all, or whether changes are possible only in
>> >> the context of adding new attributes telling Numpy what to do.
>> >
>> > I was tempted to make it a generalized ufunc, which would take care of a
>> > lot of things, but there is a lot of overhead in those functions. Sounds
>> > like the easiest thing is to make it similar to dot, although having an
>> > in-place version complicates the type selection a bit.
>>
>> Can we please fix the overhead instead of adding more half-complete
>> implementations of the same concepts? I feel like this usually ends up
>> slowing things down in the end, as optimization efforts get divided...
>>
>> My vote is:
>>
>> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
>> methods do (so I guess check __array_priority__ or whatever it is we
>> always do). I'd also be okay with ignoring __array_priority__ on the
>> grounds that __numpy_ufunc__ is better, and there's no existing code
>> relying on __array_priority__ support in __matmul__.
>>
>> Having decided that we are actually going to run, they dispatch
>> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>>
>> newdot acts like a standard gufunc with all the standard niceties,
>> including __numpy_ufunc__ dispatch.
>>
>> ("newdot" here is intended as a placeholder name, maybe it should be
>> np.linalg.matmul or something else to be bikeshed later. I also vote
>> that eventually 'dot' become an alias for this function, but whether
>> to do that is an orthogonal discussion for later.)
>>
>> If we went the ufunc route, I think we would want three of them, matxvec,
> vecxmat, and matxmat, because the best inner loops would be different in
> the three cases, but they couldn't be straight ufuncs themselves, as we
> don't need the other options, `reduce`, etc., but they can't be exactly
> like the linalg machinery, because we do want subclasses to be able to
> override. Hmm...
>
> The ufunc machinery has some funky aspects. For instance, there are
> hardwired checks for `__radd__` and other such operators in
> PyUFunc_GenericFunction that allow subclasses to override the ufunc. Those
> options should really be part of the PyUFuncObject.
>

Here are some timing results for the inner product of an integer array,
comparing the inner, ufunc (inner1d), and einsum implementations. The np.long
type was chosen because inner1d has a loop for it and because it avoids
dispatching to cblas.
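For reference, a small sketch of the contraction being timed, using only public APIs (`inner1d` itself lives in the private `numpy.core.umath_tests` module used for exercising gufuncs):

```python
import numpy as np

# all three timed implementations compute this same contraction;
# on an array this small the differences are essentially call overhead
a = np.ones(10, dtype=np.int64)
r_inner = np.inner(a, a)            # sum of elementwise products
r_einsum = np.einsum('i,i->', a, a) # identical contraction, spelled as einsum
print(r_inner, r_einsum)
```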

In [43]: a = ones(10, dtype=np.long)

In [44]: timeit inner(a,a)
1 loops, best of 3: 55.5 µs per loop

In [45]: timeit inner1d(a,a)
1 loops, best of 3: 56.2 µs per loop

In [46]: timeit einsum

Re: [Numpy-discussion] @ operator

2014-09-11 Thread Charles R Harris
On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith  wrote:

> On Wed, Sep 10, 2014 at 4:53 PM, Charles R Harris
>  wrote:
> >
> > On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:
> >>
> >> 09.09.2014, 22:52, Charles R Harris kirjoitti:
> >> >   1. Should the operator accept array_like for one of the arguments?
> >> >   2. Does it need to handle __numpy_ufunc__, or will
> >> >   __array_priority__ serve?
> >>
> >> I think the __matmul__ operator implementation should follow that of
> >> __mul__.
> >>
> >> [clip]
> >> >3. Do we want PyArray_Matmul in the numpy API?
> >> >4. Should a matmul function be supplied by the multiarray module?
> >> >
> >> > If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
> >> > will __array_priority__ serve?
> >>
> >> dot() function deals with __numpy_ufunc__, and the matmul() function
> >> should behave similarly.
> >>
> >> It seems dot() uses __array_priority__ for selection of output return
> >> subclass, so matmul() probably needs to do the same thing.
> >>
> >> > Note that the type number operators, __add__ and such, currently use
> >> > __numpy_ufunc__ in combination with __array_priority__, this in addition
> >> > to the fact that they are by default using ufuncs that do the same. I'd
> >> > rather that the __*__ operators simply rely on __array_priority__.
> >>
> >> The whole business of __array_priority__ and __numpy_ufunc__ in the
> >> binary ops is solely about when __op__ should yield the execution to
> >> __rop__ of the other object.
> >>
> >> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
> >>
> >> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
> >> __numpy_ufunc__, except if array_priority happens to be smaller than
> >> that of the other class and your class is not an ndarray subclass".
> >>
> >> The following binops also do not IIRC respect __array_priority__ in
> >> preferring right-hand operand:
> >>
> >> - in-place operations
> >> - comparisons
> >>
> >> One question here is whether it's possible to change the behavior of
> >> __array_priority__ here at all, or whether changes are possible only in
> >> the context of adding new attributes telling Numpy what to do.
> >
> > I was tempted to make it a generalized ufunc, which would take care of a
> > lot of things, but there is a lot of overhead in those functions. Sounds like
> > the easiest thing is to make it similar to dot, although having an in-place
> > version complicates the type selection a bit.
>
> Can we please fix the overhead instead of adding more half-complete
> implementations of the same concepts? I feel like this usually ends up
> slowing things down in the end, as optimization efforts get divided...
>
> My vote is:
>
> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
> methods do (so I guess check __array_priority__ or whatever it is we
> always do). I'd also be okay with ignoring __array_priority__ on the
> grounds that __numpy_ufunc__ is better, and there's no existing code
> relying on __array_priority__ support in __matmul__.
>
> Having decided that we are actually going to run, they dispatch
> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>
> newdot acts like a standard gufunc with all the standard niceties,
> including __numpy_ufunc__ dispatch.
>
> ("newdot" here is intended as a placeholder name, maybe it should be
> np.linalg.matmul or something else to be bikeshed later. I also vote
> that eventually 'dot' become an alias for this function, but whether
> to do that is an orthogonal discussion for later.)
>
> If we went the ufunc route, I think we would want three of them, matxvec,
vecxmat, and matxmat, because the best inner loops would be different in
the three cases, but they couldn't be straight ufuncs themselves, as we
don't need the other options, `reduce`, etc., but they can't be exactly
like the linalg machinery, because we do want subclasses to be able to
override. Hmm...

The ufunc machinery has some funky aspects. For instance, there are
hardwired checks for `__radd__` and other such operators in
PyUFunc_GenericFunction that allow subclasses to override the ufunc. Those
options should really be part of the PyUFuncObject.
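A rough sketch of the three proposed loops written as gufunc signatures. `np.vectorize` with `signature=` (added later, NumPy >= 1.12) gives a slow pure-Python stand-in; the names `matxmat`, `matxvec`, and `vecxmat` are just the placeholders from this thread:

```python
import numpy as np

# three distinct core signatures, one per proposed inner loop
matxmat = np.vectorize(np.dot, signature='(m,n),(n,p)->(m,p)')
matxvec = np.vectorize(np.dot, signature='(m,n),(n)->(m)')
vecxmat = np.vectorize(np.dot, signature='(n),(n,p)->(p)')

A = np.ones((4, 2, 3))        # a stack of four 2x3 matrices
v = np.ones(3)
print(matxvec(A, v).shape)    # broadcasts over the leading "stack" axis
```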

Chuck


Re: [Numpy-discussion] @ operator

2014-09-10 Thread Nathaniel Smith
On Wed, Sep 10, 2014 at 4:53 PM, Charles R Harris
 wrote:
>
> On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:
>>
>> 09.09.2014, 22:52, Charles R Harris kirjoitti:
>> >   1. Should the operator accept array_like for one of the arguments?
>> >   2. Does it need to handle __numpy_ufunc__, or will
>> >   __array_priority__ serve?
>>
>> I think the __matmul__ operator implementation should follow that of
>> __mul__.
>>
>> [clip]
>> >3. Do we want PyArray_Matmul in the numpy API?
>> >4. Should a matmul function be supplied by the multiarray module?
>> >
>> > If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
>> > will __array_priority__ serve?
>>
>> dot() function deals with __numpy_ufunc__, and the matmul() function
>> should behave similarly.
>>
>> It seems dot() uses __array_priority__ for selection of output return
>> subclass, so matmul() probably needs to do the same thing.
>>
>> > Note that the type number operators, __add__ and such, currently use
>> > __numpy_ufunc__ in combination with __array_priority__, this in addition
>> > to the fact that they are by default using ufuncs that do the same. I'd
>> > rather that the __*__ operators simply rely on __array_priority__.
>>
>> The whole business of __array_priority__ and __numpy_ufunc__ in the
>> binary ops is solely about when __op__ should yield the execution to
>> __rop__ of the other object.
>>
>> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
>>
>> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
>> __numpy_ufunc__, except if array_priority happens to be smaller than
>> that of the other class and your class is not an ndarray subclass".
>>
>> The following binops also do not IIRC respect __array_priority__ in
>> preferring right-hand operand:
>>
>> - in-place operations
>> - comparisons
>>
>> One question here is whether it's possible to change the behavior of
>> __array_priority__ here at all, or whether changes are possible only in
>> the context of adding new attributes telling Numpy what to do.
>
> I was tempted to make it a generalized ufunc, which would take care of a lot
> of things, but there is a lot of overhead in those functions. Sounds like
> the easiest thing is to make it similar to dot, although having an in-place
> version complicates the type selection a bit.

Can we please fix the overhead instead of adding more half-complete
implementations of the same concepts? I feel like this usually ends up
slowing things down in the end, as optimization efforts get divided...

My vote is:

__matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
methods do (so I guess check __array_priority__ or whatever it is we
always do). I'd also be okay with ignoring __array_priority__ on the
grounds that __numpy_ufunc__ is better, and there's no existing code
relying on __array_priority__ support in __matmul__.

Having decided that we are actually going to run, they dispatch
unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
in-place version), similarly to how e.g. __add__ dispatches to np.add.

newdot acts like a standard gufunc with all the standard niceties,
including __numpy_ufunc__ dispatch.

("newdot" here is intended as a placeholder name, maybe it should be
np.linalg.matmul or something else to be bikeshed later. I also vote
that eventually 'dot' become an alias for this function, but whether
to do that is an orthogonal discussion for later.)
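A minimal sketch of the proposed dispatch, with `newdot` kept as the placeholder name from the proposal (here backed by `np.matmul`, which is what eventually shipped):

```python
import numpy as np

def newdot(a, b, out=None):
    # stand-in for the proposed gufunc; np.matmul is the function that shipped
    return np.matmul(a, b, out=out)

class MyArray(np.ndarray):
    # __matmul__/__imatmul__ forward unconditionally, as __add__ does to np.add
    def __matmul__(self, other):
        return newdot(self, other)
    def __imatmul__(self, other):
        return newdot(self, other, out=self)

a = np.eye(2).view(MyArray)
b = a @ a          # routes through newdot; further override hooks live there
```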

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] @ operator

2014-09-10 Thread Sturla Molden
Charles R Harris  wrote:

> Note also that the dot cblas versions are not generally blocked, so the
> size of the arrays is limited (and not checked).

But it is possible to create a blocked dot function with the current cblas,
even though they use C int for array dimensions. It would just further
increase the complexity of dot (as if it's not bad enough already...) 

Sturla



Re: [Numpy-discussion] @ operator

2014-09-10 Thread Charles R Harris
On Wed, Sep 10, 2014 at 2:53 PM, Charles R Harris  wrote:

>
>
> On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:
>
>> 09.09.2014, 22:52, Charles R Harris kirjoitti:
>> >   1. Should the operator accept array_like for one of the arguments?
>> >   2. Does it need to handle __numpy_ufunc__, or will
>> >   __array_priority__ serve?
>>
>> I think the __matmul__ operator implementation should follow that of
>> __mul__.
>>
>> [clip]
>> >3. Do we want PyArray_Matmul in the numpy API?
>> >4. Should a matmul function be supplied by the multiarray module?
>> >
>> > If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
>> > will __array_priority__ serve?
>>
>> dot() function deals with __numpy_ufunc__, and the matmul() function
>> should behave similarly.
>>
>> It seems dot() uses __array_priority__ for selection of output return
>> subclass, so matmul() probably needs to do the same thing.
>>
>> > Note that the type number operators, __add__ and such, currently use
>> > __numpy_ufunc__ in combination with __array_priority__, this in addition to
>> > the fact that they are by default using ufuncs that do the same. I'd rather
>> > that the __*__ operators simply rely on __array_priority__.
>>
>> The whole business of __array_priority__ and __numpy_ufunc__ in the
>> binary ops is solely about when __op__ should yield the execution to
>> __rop__ of the other object.
>>
>> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
>>
>> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
>> __numpy_ufunc__, except if array_priority happens to be smaller than
>> that of the other class and your class is not an ndarray subclass".
>>
>> The following binops also do not IIRC respect __array_priority__ in
>> preferring right-hand operand:
>>
>> - in-place operations
>> - comparisons
>>
>> One question here is whether it's possible to change the behavior of
>> __array_priority__ here at all, or whether changes are possible only in
>> the context of adding new attributes telling Numpy what to do.
>>
>>
> I was tempted to make it a generalized ufunc, which would take care of a
> lot of things, but there is a lot of overhead in those functions. Sounds
> like the easiest thing is to make it similar to dot, although having an
> in-place version complicates the type selection a bit.
>

Note also that the dot cblas versions are not generally blocked, so the
size of the arrays is limited (and not checked).

Chuck


Re: [Numpy-discussion] @ operator

2014-09-10 Thread Charles R Harris
On Wed, Sep 10, 2014 at 10:52 AM, Pauli Virtanen  wrote:

> 09.09.2014, 22:52, Charles R Harris kirjoitti:
> >   1. Should the operator accept array_like for one of the arguments?
> >   2. Does it need to handle __numpy_ufunc__, or will
> >   __array_priority__ serve?
>
> I think the __matmul__ operator implementation should follow that of
> __mul__.
>
> [clip]
> >3. Do we want PyArray_Matmul in the numpy API?
> >4. Should a matmul function be supplied by the multiarray module?
> >
> > If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
> > will __array_priority__ serve?
>
> dot() function deals with __numpy_ufunc__, and the matmul() function
> should behave similarly.
>
> It seems dot() uses __array_priority__ for selection of output return
> subclass, so matmul() probably needs to do the same thing.
>
> > Note that the type number operators, __add__ and such, currently use
> > __numpy_ufunc__ in combination with __array_priority__, this in addition to
> > the fact that they are by default using ufuncs that do the same. I'd rather
> > that the __*__ operators simply rely on __array_priority__.
>
> The whole business of __array_priority__ and __numpy_ufunc__ in the
> binary ops is solely about when __op__ should yield the execution to
> __rop__ of the other object.
>
> The rule of operation currently is: "__rmul__ before __numpy_ufunc__"
>
> If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
> __numpy_ufunc__, except if array_priority happens to be smaller than
> that of the other class and your class is not an ndarray subclass".
>
> The following binops also do not IIRC respect __array_priority__ in
> preferring right-hand operand:
>
> - in-place operations
> - comparisons
>
> One question here is whether it's possible to change the behavior of
> __array_priority__ here at all, or whether changes are possible only in
> the context of adding new attributes telling Numpy what to do.
>
>
I was tempted to make it a generalized ufunc, which would take care of a
lot of things, but there is a lot of overhead in those functions. Sounds
like the easiest thing is to make it similar to dot, although having an
in-place version complicates the type selection a bit.

Chuck


Re: [Numpy-discussion] @ operator

2014-09-10 Thread Pauli Virtanen
09.09.2014, 22:52, Charles R Harris kirjoitti:
>   1. Should the operator accept array_like for one of the arguments?
>   2. Does it need to handle __numpy_ufunc__, or will
>   __array_priority__ serve?

I think the __matmul__ operator implementation should follow that of
__mul__.

[clip]
>3. Do we want PyArray_Matmul in the numpy API?
>4. Should a matmul function be supplied by the multiarray module?
> 
> If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
> will __array_priority__ serve?

dot() function deals with __numpy_ufunc__, and the matmul() function
should behave similarly.

It seems dot() uses __array_priority__ for selection of output return
subclass, so matmul() probably needs to do the same thing.

> Note that the type number operators, __add__ and such, currently use
> __numpy_ufunc__ in combination with __array_priority__, this in addition to
> the fact that they are by default using ufuncs that do the same. I'd rather
> that the __*__ operators simply rely on __array_priority__.

The whole business of __array_priority__ and __numpy_ufunc__ in the
binary ops is solely about when __op__ should yield the execution to
__rop__ of the other object.

The rule of operation currently is: "__rmul__ before __numpy_ufunc__"

If you remove the __numpy_ufunc__ handling, it becomes: "__rmul__ before
__numpy_ufunc__, except if array_priority happens to be smaller than
that of the other class and your class is not an ndarray subclass".

The following binops also do not IIRC respect __array_priority__ in
preferring right-hand operand:

- in-place operations
- comparisons

One question here is whether it's possible to change the behavior of
__array_priority__ here at all, or whether changes are possible only in
the context of adding new attributes telling Numpy what to do.
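The deferral rule described above can be seen with a toy class: a non-subclass operand with higher `__array_priority__` and a reflected method wins the operation (the class and priority value here are illustrative):

```python
import numpy as np

class Deferred:
    # ndarray's own priority is 0.0; anything higher plus a reflected
    # method makes ndarray.__mul__ return NotImplemented
    __array_priority__ = 1000.0

    def __rmul__(self, other):
        return "Deferred.__rmul__"

# Python then falls back to the right operand's reflected method
print(np.ones(3) * Deferred())
```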

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] @ operator

2014-09-10 Thread Sebastian Berg
On Di, 2014-09-09 at 13:52 -0600, Charles R Harris wrote:
> Hi All,
> 
> 
> I'm in the midst of implementing the '@' operator (PEP 465), and there
> are some behaviors that are unspecified by the PEP.
> 
>  1. Should the operator accept array_like for one of the
> arguments?

To be in line with all the other operators, I would say yes

>  2. Does it need to handle __numpy_ufunc__, or will
> __array_priority__ serve?
>  3. Do we want PyArray_Matmul in the numpy API?

Don't care much either way, but I would say yes (why not).

>  4. Should a matmul function be supplied by the multiarray module?
> 

We could possibly put it in the linalg module. But I think we should
provide such a function.

> If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery,
> or will __array_priority__ serve?
> 
> Note that the type number operators, __add__ and such, currently use
> __numpy_ufunc__ in combination with __array_priority__, this in
> addition to the fact that they are by default using ufuncs that do the
> same. I'd rather that the __*__ operators simply rely on
> __array_priority__.
> 

Hmmm, that is a difficult one. For the operators I agree, numpy_ufunc
should not be necessary, since we have NotImplemented there, and the
array priority does nothing except tell numpy to stop handling all array
likes (i.e. other array-likes could actually say: you have an
array-priority > 0 -> return NotImplemented). So yeah, why do the
operators use numpy_ufunc at all? (unless due to implementation)

If we have a function numpy_ufunc would probably make sense, since that
circumvents the python dispatch mechanism.

- Sebastian

> 
> Thoughts?
> 
> Chuck
> 
> 
> 





Re: [Numpy-discussion] @ operator

2014-09-09 Thread Robert Kern
On Tue, Sep 9, 2014 at 8:52 PM, Charles R Harris
 wrote:
> Hi All,
>
> I'm in the midst of implementing the '@' operator (PEP 465), and there are
> some behaviors that are unspecified by the PEP.
>
> Should the operator accept array_like for one of the arguments?

I would be mildly disappointed if it didn't.

> Does it need to handle __numpy_ufunc__, or will __array_priority__ serve?

Not sure (TBH, I don't remember what __numpy_ufunc__ does off-hand and
don't feel bothered enough to look it up).

> Do we want PyArray_Matmul in the numpy API?

Probably.

> Should a matmul function be supplied by the multiarray module?

Yes, please. Its rules are a little different from dot()'s, so we
should have a function that does it.
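A sketch of how the rules differ for stacked operands, using `np.matmul` (the function that was eventually added alongside the `@` operator):

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats >2-d inputs as stacks of matrices and broadcasts the stack axes
print(np.matmul(a, b).shape)   # (2, 3, 5)

# dot instead pairs a's last axis with the second-to-last axis of *every*
# stack in b, producing an outer product over the stack dimensions
print(np.dot(a, b).shape)      # (2, 3, 2, 5)
```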

-- 
Robert Kern


[Numpy-discussion] @ operator

2014-09-09 Thread Charles R Harris
Hi All,

I'm in the midst of implementing the '@' operator (PEP 465), and there are
some behaviors that are unspecified by the PEP.


   1. Should the operator accept array_like for one of the arguments?
   2. Does it need to handle __numpy_ufunc__, or will __array_priority__
   serve?
   3. Do we want PyArray_Matmul in the numpy API?
   4. Should a matmul function be supplied by the multiarray module?

If 3 and 4 are wanted, should they use the __numpy_ufunc__ machinery, or
will __array_priority__ serve?

Note that the type number operators, __add__ and such, currently use
__numpy_ufunc__ in combination with __array_priority__, this in addition to
the fact that they are by default using ufuncs that do the same. I'd rather
that the __*__ operators simply rely on __array_priority__.


Thoughts?

Chuck


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Christopher Barker
On 3/16/11 9:22 AM, Paul Anton Letnes wrote:

>> This comes up for discussion on a fairly regular basis. I tend towards the 
>> more warnings side myself, but you aren't going to get the current behavior 
>> changed unless you can convince a large bunch of people that it is the right 
>> thing to do, which won't be easy.

Indeed, I don't think I'd want a warning for this -- though honestly, 
I'm not sure what you think should invoke a warning:


a = np.ones((3,), dtype=np.float32)

a += 2.0

Should that be a warning? You are adding a 64-bit float to a 32-bit float 
array.

a = np.ones((3,), dtype=np.float32)

a += 2

now you are adding an integer -- should that be a warning?

a = np.ones((3,), dtype=np.int32)

a += 2.1

now a float to an integer array -- should that be a warning?

As you can see -- there are a lot of options. As these are defined as 
in-place operators, I don't expect any casting to occur.

I think this is the kind of thing that would be great to have a warning 
the first time you do it, but once you understand, the warnings would be 
really, really annoying!
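A sketch of the behavior in question: in-place operators keep the left-hand dtype, while the equivalent plain expression upcasts. (NumPy of this era silently truncated `a *= 2.5` on an integer array; later releases refuse that unsafe in-place cast with a TypeError.)

```python
import numpy as np

a = np.ones(3, dtype=np.int32)
a += 2                        # int += int: stays int32, values become [3, 3, 3]
expr_dtype = (a * 2.5).dtype  # the plain expression upcasts to float64
print(a.dtype, expr_dtype)
```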

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Paul Anton Letnes
> 
> This comes up for discussion on a fairly regular basis. I tend towards the 
> more warnings side myself, but you aren't going to get the current behavior 
> changed unless you can convince a large bunch of people that it is the right 
> thing to do, which won't be easy. For one thing, a lot of current code in the 
> wild would begin to raise warnings that weren't there before.
That could also be a good thing for locating bugs, right?

> Or: give me a hint how and where to change the numpy code, and I could try to 
> write a patch.
> 
> 
> You have to go down to the C level to deal with this.

I guess code containing such warnings must exist in other parts of the numpy 
library?

Paul.


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Charles R Harris
On Wed, Mar 16, 2011 at 8:58 AM, Paul Anton Letnes <
paul.anton.let...@gmail.com> wrote:

>
> On 16. mars 2011, at 15.49, Chris Barker wrote:
>
> > On 3/16/11 6:34 AM, Charles R Harris wrote:
> >> On Wed, Mar 16, 2011 at 7:24 AM, Paul Anton Letnes
> >
> >> Yes, it is intentional. Numpy is more C than Python in this case,
> >
> > I don't know that C has anything to do with it -- the *= operators were
> > added specifically to be "in-place" operators -- otherwise they would be
> > nothing but syntactic sugar. And IIRC, numpy was one of the motivators.
> >
> > IMHO, the mistake was even allowing += and friends for immutables, as
> > that inherently means something different.
> >
> > Of course, using += with integers is probably the most common case.
> >
> > -Chris
>
> I see. In that case, I have the following either/or christmas wish:
>
> Either: implement a warning along the following lines:
> >>> from numpy import *
> >>> a = zeros(10, dtype=complex)
> >>> a.astype(float)
> /Users/paulanto/Library/Python/2.7/bin/bpython:2: ComplexWarning: Casting
> complex values to real discards the imaginary part
>  # EASY-INSTALL-ENTRY-SCRIPT:
> 'bpython==0.9.7.1','console_scripts','bpython'
> array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
>
>
This comes up for discussion on a fairly regular basis. I tend towards the
more warnings side myself, but you aren't going to get the current behavior
changed unless you can convince a large bunch of people that it is the right
thing to do, which won't be easy. For one thing, a lot of current code in
the wild would begin to raise warnings that weren't there before.

Or: give me a hint how and where to change the numpy code, and I could try
> to write a patch.
>
>
You have to go down to the C level to deal with this.

Chuck


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Paul Anton Letnes

On 16. mars 2011, at 15.57, Dag Sverre Seljebotn wrote:

> On 03/16/2011 02:35 PM, Paul Anton Letnes wrote:
>> Heisann!
> 
> Hei der,
> 
>> On 16. mars 2011, at 14.30, Dag Sverre Seljebotn wrote:
>> 
>>> On 03/16/2011 02:24 PM, Paul Anton Letnes wrote:
>>>> Hi!
>>>>
>>>> This little snippet of code tricked me (in a more convoluted form). The *= 
>>>> operator does not change the datatype of the left hand side array. Is this 
>>>> intentional? It did fool me and throw my results quite a bit off. I always 
>>>> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is 
>>>> clearly not the case!
>>> 
>>> In [1]: a = np.ones(5)
>> Here, a is numpy.float64:
> numpy.ones(5).dtype
>> dtype('float64')
>> 
>>> In [2]: b = a
>>> 
>>> In [3]: c = a * 2
>>> 
>>> In [4]: b
>>> Out[4]: array([ 1.,  1.,  1.,  1.,  1.])
>>> 
>>> In [5]: a *= 2
>> So since a is already float, and b is the same object as a, the resulting a 
>> and b are of course floats.
>>> In [6]: b
>>> Out[6]: array([ 2.,  2.,  2.,  2.,  2.])
>>> 
>> This is not the case I am describing, as in my case, a was of dtype integer.
>> Or did I miss something?
> 
> I was just trying to demonstrate that it is NOT the case that "a = a * 2 
> is exactly the same as a *= 2". If you assume that the two statements 
> are the same, then it does not make sense that b = [1, 1, ...] the first 
> time around, but b = [2, 2, 2...] the second time around. And in trying 
> to figure out why that happened, perhaps you'd see how it all fits 
> together...
> 
> OK, it perhaps wasn't a very good explanation...
> 
> Dag Sverre

I see, I misunderstood your point. That's another interesting aspect of this, 
though.

Paul.



Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Paul Anton Letnes

On 16. mars 2011, at 15.49, Chris Barker wrote:

> On 3/16/11 6:34 AM, Charles R Harris wrote:
>> On Wed, Mar 16, 2011 at 7:24 AM, Paul Anton Letnes
> 
>> Yes, it is intentional. Numpy is more C than Python in this case,
> 
> I don't know that C has anything to do with it -- the *= operators were 
> added specifically to be "in-place" operators -- otherwise they would be 
> nothing but syntactic sugar. And IIRC, numpy was one of the motivators.
> 
> IMHO, the mistake was even allowing += and friends for immutables, as 
> that inherently means something different.
> 
> Of course, using += with integers is probably the most common case.
> 
> -Chris

I see. In that case, I have the following either/or christmas wish:

Either: implement a warning along the following lines:
>>> from numpy import *
>>> a = zeros(10, dtype=complex)
>>> a.astype(float)
/Users/paulanto/Library/Python/2.7/bin/bpython:2: ComplexWarning: Casting 
complex values to real discards the imaginary part
  # EASY-INSTALL-ENTRY-SCRIPT: 'bpython==0.9.7.1','console_scripts','bpython'
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

Or: give me a hint how and where to change the numpy code, and I could try to 
write a patch.

Paul.



Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Dag Sverre Seljebotn
On 03/16/2011 02:35 PM, Paul Anton Letnes wrote:
> Heisann!

Hei der,

> On 16. mars 2011, at 14.30, Dag Sverre Seljebotn wrote:
>
>> On 03/16/2011 02:24 PM, Paul Anton Letnes wrote:
>>> Hi!
>>>
>>> This little snippet of code tricked me (in a more convoluted form). The *= 
>>> operator does not change the datatype of the left hand side array. Is this 
>>> intentional? It did fool me and throw my results quite a bit off. I always 
>>> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is 
>>> clearly not the case!
>>
>> In [1]: a = np.ones(5)
> Here, a is numpy.float64:
> >>> numpy.ones(5).dtype
> dtype('float64')
>
>> In [2]: b = a
>>
>> In [3]: c = a * 2
>>
>> In [4]: b
>> Out[4]: array([ 1.,  1.,  1.,  1.,  1.])
>>
>> In [5]: a *= 2
> So since a is already float, and b is the same object as a, the resulting a 
> and b are of course floats.
>> In [6]: b
>> Out[6]: array([ 2.,  2.,  2.,  2.,  2.])
>>
> This is not the case I am describing, as in my case, a was of dtype integer.
> Or did I miss something?

I was just trying to demonstrate that it is NOT the case that "a = a * 2 
is exactly the same as a *= 2". If you assume that the two statements 
are the same, then it does not make sense that b = [1, 1, ...] the first 
time around, but b = [2, 2, 2...] the second time around. And in trying 
to figure out why that happened, perhaps you'd see how it all fits 
together...

OK, it perhaps wasn't a very good explanation...
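In code, the same point, tracking which object each name refers to:

```python
import numpy as np

a = np.ones(5)
b = a               # b and a are the same object

a *= 2              # in place: the shared buffer is modified
assert a is b
assert b[0] == 2.0  # b sees the change

a = a * 2           # a NEW array is created and the name a rebound
assert a is not b
assert b[0] == 2.0  # b still holds the old buffer
```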

Dag Sverre


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Chris Barker
On 3/16/11 6:34 AM, Charles R Harris wrote:
> On Wed, Mar 16, 2011 at 7:24 AM, Paul Anton Letnes wrote:

> Yes, it is intentional. Numpy is more C than Python in this case,

I don't know that C has anything to do with it -- the *= operators were 
added specifically to be "in-place" operators -- otherwise they would be 
nothing but syntactic sugar. And IIRC, numpy was one of the motivators.

IMHO, the mistake was even allowing += and friends for immutables, as 
that inherently means something different.

Of course, using += with integers is probably the most common case.

-Chris





Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Charles R Harris
On Wed, Mar 16, 2011 at 7:24 AM, Paul Anton Letnes <
paul.anton.let...@gmail.com> wrote:

> Hi!
>
> This little snippet of code tricked me (in a more convoluted form). The *=
> operator does not change the datatype of the left hand side array. Is this
> intentional? It did fool me and throw my results quite a bit off. I always
> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is
> clearly not the case!
>
>
Yes, it is intentional. Numpy is more C than Python in this case: it
actually does the multiplication in place, so the result must have the
same type as the left-hand side. Plain Python, by contrast, just creates
a new object and rebinds the name.
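A minimal sketch of the difference. Note that recent NumPy versions raise a TypeError for `a *= b` with an integer `a` and float `b`, so the explicit ufunc call with `casting='unsafe'` is used here to reproduce the silent truncation being discussed:

```python
import numpy as np

a = np.arange(10)            # integer array
b = np.linspace(0, 1, 10)    # float array

c = a * b                    # new object: dtypes are promoted
assert c.dtype == np.float64

# In place: the float result is cast back into a's integer buffer.
np.multiply(a, b, out=a, casting='unsafe')
assert a.dtype.kind == 'i'   # a keeps its integer dtype
```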



Chuck


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Paul Anton Letnes
Heisann!

On 16. mars 2011, at 14.30, Dag Sverre Seljebotn wrote:

> On 03/16/2011 02:24 PM, Paul Anton Letnes wrote:
>> Hi!
>> 
>> This little snippet of code tricked me (in a more convoluted form). The *= 
>> operator does not change the datatype of the left hand side array. Is this 
>> intentional? It did fool me and throw my results quite a bit off. I always 
>> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is 
>> clearly not the case!
> 
> 
> In [1]: a = np.ones(5)
Here, a is numpy.float64:
>>> numpy.ones(5).dtype
dtype('float64')

> In [2]: b = a
> 
> In [3]: c = a * 2
> 
> In [4]: b
> Out[4]: array([ 1.,  1.,  1.,  1.,  1.])
> 
> In [5]: a *= 2
So since a is already float, and b is the same object as a, the resulting a and 
b are of course floats.
> 
> In [6]: b
> Out[6]: array([ 2.,  2.,  2.,  2.,  2.])
> 
This is not the case I am describing, as in my case, a was of dtype integer.
Or did I miss something?

Paul.


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Dag Sverre Seljebotn
On 03/16/2011 02:24 PM, Paul Anton Letnes wrote:
> Hi!
>
> This little snippet of code tricked me (in a more convoluted form). The *= 
> operator does not change the datatype of the left hand side array. Is this 
> intentional? It did fool me and throw my results quite a bit off. I always 
> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is 
> clearly not the case!


In [1]: a = np.ones(5)

In [2]: b = a

In [3]: c = a * 2

In [4]: b
Out[4]: array([ 1.,  1.,  1.,  1.,  1.])

In [5]: a *= 2

In [6]: b
Out[6]: array([ 2.,  2.,  2.,  2.,  2.])


-- Dag


Re: [Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Angus McMorland
On 16 March 2011 09:24, Paul Anton Letnes  wrote:
> Hi!
>
> This little snippet of code tricked me (in a more convoluted form). The *= 
> operator does not change the datatype of the left hand side array. Is this 
> intentional? It did fool me and throw my results quite a bit off. I always 
> assumed that 'a *= b' means exactly the same as 'a = a * b' but this is 
> clearly not the case!

This is intentional: a *= b works in place, i.e. it's the equivalent
not of a = a * b, but of a[:] = a * b.

Angus.
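A quick sketch of that equivalence:

```python
import numpy as np

a = np.arange(10)
b = np.linspace(0, 1, 10)
orig = a          # keep a second reference to the same buffer

a[:] = a * b      # compute a * b, then cast the float result back
                  # into the existing integer buffer
assert a is orig  # same object, modified in place
```

(On current NumPy, `a *= b` with these dtypes raises a TypeError, while the slice assignment still casts silently, so the two forms are no longer exactly equivalent.)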

> Paul.
>
> ++
> >>> from numpy import *
> >>> a = arange(10)
> >>> b = linspace(0,1,10)
> >>> a
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
> >>> b
> array([ 0.    ,  0.1111,  0.2222,  0.3333,  0.4444,
>         0.5556,  0.6667,  0.7778,  0.8889,  1.    ])
> >>> a * b
> array([ 0.    ,  0.1111,  0.4444,  1.    ,  1.7778,
>         2.7778,  4.    ,  5.4444,  7.1111,  9.    ])
> >>> a *= b
> >>> a
> array([0, 0, 0, 1, 1, 2, 4, 5, 7, 9])
>
>



-- 
AJC McMorland
Post-doctoral research fellow
Neurobiology, University of Pittsburgh


[Numpy-discussion] *= operator not intuitive

2011-03-16 Thread Paul Anton Letnes
Hi!

This little snippet of code tricked me (in a more convoluted form). The *= 
operator does not change the datatype of the left hand side array. Is this 
intentional? It did fool me and throw my results quite a bit off. I always 
assumed that 'a *= b' means exactly the same as 'a = a * b' but this is clearly 
not the case!

Paul.

++
>>> from numpy import *
>>> a = arange(10)
>>> b = linspace(0,1,10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b
array([ 0.    ,  0.1111,  0.2222,  0.3333,  0.4444,
        0.5556,  0.6667,  0.7778,  0.8889,  1.    ])
>>> a * b
array([ 0.    ,  0.1111,  0.4444,  1.    ,  1.7778,
        2.7778,  4.    ,  5.4444,  7.1111,  9.    ])
>>> a *= b
>>> a
array([0, 0, 0, 1, 1, 2, 4, 5, 7, 9])

