On 05/31/2018 09:53 AM, Sebastian Berg wrote:
Also, I do not imagine these as free-floating ufuncs, I think we can
arrange them in a logical way in a gufunc ecosystem. There would be
"core ufuncs", with "associated gufuncs" accessible as attributes.
For instance, any_less_than will be accessible as less.any

So then, why is it a gufunc and not an attribute using a ufunc with
binary output? I have asked this before, and even got arguments as to
why it fits gufuncs better, but frankly I still do not really

If it is an associated gufunc, why gufunc at all? We need any() and
all() here, so that is not that many methods, right? And when it comes
to buffering you have much more flexibility.

Say I have the operation:

(float_arr > int_arr).all(axis=(1, 2))

With int_arr being shaped (2, 1000, 1000) (i.e. large along the
interesting axes). A normal gufunc IIRC will get the whole inner
dimension as a float buffer. In other words, you gain practically
nothing, because the whole int_arr will be cast to float anyway.
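
For reference, a minimal sketch of what the plain expression above does today, using only existing NumPy calls. The array contents and the smaller shape (2, 100, 100) are assumptions chosen to keep the example quick; the point is only that the comparison allocates a full boolean temporary, and compares in a common float type, before any reduction starts.

```python
import numpy as np

rng = np.random.default_rng(0)
# Smaller than the (2, 1000, 1000) case discussed above, same layout.
float_arr = rng.random((2, 100, 100))
int_arr = rng.integers(0, 2, size=(2, 100, 100))

# The comparison is done in a common type; for float64 vs. a default
# integer array that type is float64.
common = np.result_type(float_arr, int_arr)

# Step 1: the comparison ufunc visits every element and allocates a
# boolean temporary with the full broadcast shape.
mask = float_arr > int_arr

# Step 2: only now does the reduction run, over the finished temporary,
# so a False found early in step 1 cannot stop the comparison.
result = mask.all(axis=(1, 2))
```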

If, however, you actually implement np.greater_than.all(float_arr,
int_arr, axis=(1, 2)) as a separate ufunc method, you would have the
freedom to work in the typical cache-friendly, buffer-sized chunks for
each of the outer dimensions one at a time. A gufunc would have to
say "please do not buffer for me", or implement all possible type
combinations to do this.
(Of course there are memory layout subtleties, since you would always
have to optimize for the "fast exit" case, potentially making the worst
case scenario much worse -- unless you do seriously fancy stuff.)
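
The chunked strategy described above can be sketched in pure Python. This is only an illustration of the idea, not how NumPy implements anything: `greater_all_chunked` and its `chunk` parameter are hypothetical names, the default of 8192 elements is an assumed stand-in for a buffer size, and the helper assumes both inputs have the same shape and reduces over every axis except the first.

```python
import numpy as np


def greater_all_chunked(a, b, chunk=8192):
    """Hypothetical helper: like (a > b).all() over all axes but the first.

    Works through each outer index in fixed-size chunks and stops at the
    first chunk that contains a False, so only one chunk-sized temporary
    exists at a time.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal shapes")
    n_outer = a.shape[0]
    # reshape returns a view where possible and copies otherwise.
    a2 = a.reshape(n_outer, -1)
    b2 = b.reshape(n_outer, -1)
    out = np.ones(n_outer, dtype=bool)
    for i in range(n_outer):
        for start in range(0, a2.shape[1], chunk):
            stop = start + chunk
            # Any casting happens per chunk, not for the whole array.
            if not (a2[i, start:stop] > b2[i, start:stop]).all():
                out[i] = False
                break  # the "fast exit" case
    return out
```

In the worst case (every comparison is True) this visits all elements and pays the Python-level loop overhead on top, which is the trade-off mentioned above.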

A more general question is actually whether we should rather focus on
solving the same problem more generally.
For example, if `numexpr` implemented all/any reductions, it might be
able to get the identical tradeoffs quite simply, with even more
flexibility! (I have to admit, it may get tricky with multiple
reduction dimensions, etc.)

- Sebastian

Hmm, I hadn't known/considered the limitations of gufunc buffer sizes. I was just thinking of them as a standardized interface which handles the where/out/broadcasting for you.

I'll have to read about it.

One thing I don't like about the ufunc-method strategy is that it easily pollutes the namespaces and implementations of all ufuncs: many ufuncs would have to account for a new "all" method even where it is inappropriate, for example.

NumPy-Discussion mailing list
