Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-10 Thread Chris Barker
On Sat, Mar 10, 2018 at 1:27 PM, Matthew Rocklin wrote:

> I'm very glad to see this discussion.
>

me too, but


> I think that coming up with a single definition of array-like may be
> difficult, and that we might end up wanting to embrace duck typing instead.
>

exactly -- I think there is a clear line between "uses the numpy memory
layout" and the Python API. But the Python API is pretty darn big, and many
"array_ish" classes implement only part of it, and may even implement some
parts a bit differently. So it's really hard to have "one" definition, except
"Python API exactly like an ndarray" -- and I'm wondering how useful that is.

> It seems to me that different array-like classes will implement different
> mixtures of features.  It may be difficult to pin down a single definition
> that includes anything except for the most basic attributes (shape and
> dtype?).
>

or a minimum set -- but again, how useful??


> Storage objects like h5py (support getitem in a numpy-like way)
>

Exactly -- I don't know about h5py, but netCDF4 variables support a
useful subset of ndarray, but do "fancy indexing" differently -- so are they
ndarray_ish? -- sorry to coin yet another term :-)
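
To make the "fancy indexing" difference concrete, here is a small sketch using plain numpy only. It assumes the difference meant here is the usual one: netCDF4 variables apply several index sequences orthogonally (each along its own axis), while ndarray pairs them up element by element. `np.ix_` stands in for the orthogonal result; no netCDF4 variable is involved.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)
rows = [0, 2]
cols = [1, 3]

# ndarray fancy indexing pairs the indices: elements (0, 1) and (2, 3).
pointwise = a[rows, cols]

# Orthogonal (outer) indexing picks the block of rows 0, 2 by columns 1, 3.
# np.ix_ is used here to imitate that behaviour with an ndarray.
orthogonal = a[np.ix_(rows, cols)]

print(pointwise.shape)   # (2,)
print(orthogonal.shape)  # (2, 2)
```

Code that only checks for `__getitem__` cannot tell these two behaviours apart, which is why an attribute check alone does not settle whether an object is "ndarray_ish".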


> I can imagine authors of both groups saying that they should qualify as
> array-like because downstream projects that consume them should not convert
> them to numpy arrays in important contexts.
>

indeed. My solution so far is to define my own duck-type helper, "asarraylike",
which checks for the actual attributes I need:

https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/utilities.py

which has:

import numpy as np

must_have = ['dtype', 'shape', 'ndim', '__len__', '__getitem__',
             '__getattribute__']


def isarraylike(obj):
    """
    tests if obj acts enough like an array to be used in gridded.
    This should catch netCDF4 variables and numpy arrays, at least, etc.
    Note: these won't check if the attributes required actually work right.
    """
    for attr in must_have:
        if not hasattr(obj, attr):
            return False
    return True


def asarraylike(obj):
    """
    If it satisfies the requirements of pyugrid the object is returned as is.
    If not, then numpy's array() will be called on it.
    :param obj: The object to check if it's like an array
    """
    return obj if isarraylike(obj) else np.array(obj)

It's possible that we could come up with semi-standard "groupings" of
attributes to produce "levels" of compatibility, or maybe not levels, but
independent groupings, so you could specify which groupings you need in a
given instance.
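
A rough sketch of what such independent groupings might look like. Everything here is hypothetical: the grouping names, the `supports` helper and the `StorageVariable` stand-in are invented for illustration and are not part of gridded or numpy.

```python
# Hypothetical attribute groupings; the names are illustrative only.
GROUPINGS = {
    "metadata": ("dtype", "shape", "ndim"),
    "indexing": ("__len__", "__getitem__"),
    "arithmetic": ("__add__", "__mul__", "__matmul__"),
}


def supports(obj, *groups):
    """Return True if obj has every attribute in each requested grouping.

    Like isarraylike, this only checks that the attributes exist,
    not that they behave the way ndarray's do.
    """
    return all(
        hasattr(obj, attr)
        for group in groups
        for attr in GROUPINGS[group]
    )


class StorageVariable:
    """Stand-in for an on-disk variable: metadata and indexing, no math."""

    dtype = "float64"
    shape = (3,)
    ndim = 1

    def __init__(self):
        self._data = [1.0, 2.0, 3.0]

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]


var = StorageVariable()
print(supports(var, "metadata", "indexing"))  # True
print(supports(var, "arithmetic"))            # False
print(supports([1, 2, 3], "indexing"))        # True
print(supports([1, 2, 3], "metadata"))        # False
```

A caller then asks only for the groupings it needs, e.g. `supports(obj, "metadata", "indexing")`, instead of one all-or-nothing test.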


> The name "duck arrays" that we sometimes use doesn't necessarily mean
> "quack like an ndarray" but might actually mean a number of different
> things in different contexts.  Making a single class or predicate for duck
> arrays may not be as effective as we want.  Instead, it might be that we
> need a number of different protocols like `__array_mat_vec__` or
> `__array_slice__` that downstream projects can check instead.  I can
> imagine cases where I want to check only "can I use this thing to
> multiply against arrays" or "can I get numpy arrays out of this thing
> with numpy slicing" rather than "is this thing array-like" because I may
> genuinely not care about most of the functionality in a blessed
> definition of "array-like".
>

exactly.

but maybe we won't know until we try.

-CHB



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-10 Thread Matthew Rocklin
I'm very glad to see this discussion.

I think that coming up with a single definition of array-like may be
difficult, and that we might end up wanting to embrace duck typing instead.

It seems to me that different array-like classes will implement different
mixtures of features.  It may be difficult to pin down a single definition
that includes anything except for the most basic attributes (shape and
dtype?).  Consider two extreme cases of restrictive functionality:

   1. LinearOperators (support dot in a numpy-like way)
   2. Storage objects like h5py (support getitem in a numpy-like way)

I can imagine authors of both groups saying that they should qualify as
array-like because downstream projects that consume them should not convert
them to numpy arrays in important contexts.

The name "duck arrays" that we sometimes use doesn't necessarily mean
"quack like an ndarray" but might actually mean a number of different
things in different contexts.  Making a single class or predicate for duck
arrays may not be as effective as we want.  Instead, it might be that we
need a number of different protocols like `__array_mat_vec__` or
`__array_slice__` that downstream projects can check instead.  I can
imagine cases where I want to check only "can I use this thing to
multiply against arrays" or "can I get numpy arrays out of this thing
with numpy slicing" rather than "is this thing array-like" because I may
genuinely not care about most of the functionality in a blessed
definition of "array-like".
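
As a minimal sketch of per-capability checks, assuming the two protocol names above were real: neither `__array_mat_vec__` nor `__array_slice__` exists in numpy, and the classes and signatures below are invented for illustration. The checks use the standard library's `typing.Protocol`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsMatVec(Protocol):
    """Hypothetical: 'can I use this thing to multiply against arrays'."""

    def __array_mat_vec__(self, vector): ...


@runtime_checkable
class SupportsArraySlice(Protocol):
    """Hypothetical: 'can I get array data out of this thing by slicing'."""

    def __array_slice__(self, index): ...


class ScaleOperator:
    """Toy linear operator that multiplies a vector by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def __array_mat_vec__(self, vector):
        return [self.factor * x for x in vector]


class StoredData:
    """Toy storage object that only supports slicing."""

    def __init__(self, values):
        self._values = list(values)

    def __array_slice__(self, index):
        return self._values[index]


def apply_operator(op, vector):
    # Check only the capability this function needs, not "array-like".
    if not isinstance(op, SupportsMatVec):
        raise TypeError("object does not implement __array_mat_vec__")
    return op.__array_mat_vec__(vector)


print(apply_operator(ScaleOperator(2), [1, 2, 3]))       # [2, 4, 6]
print(isinstance(StoredData([1, 2]), SupportsMatVec))    # False
print(isinstance(StoredData([1, 2]), SupportsArraySlice))  # True
```

Note that `isinstance` against a runtime-checkable protocol only tests that the method is present, so it has the same limitation as any attribute-based check: it says nothing about whether the method behaves as expected.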

On Fri, Mar 9, 2018 at 8:45 PM, Nathaniel Smith wrote:

> On Thu, Mar 8, 2018 at 5:51 PM, Juan Nunez-Iglesias wrote:
> >> Finally for the name, what about `asduckarray`? Though perhaps that could
> >> be a source of confusion, and given the gradation of duck array like types.
> >
> > I suggest that the name should *not* use programmer lingo, so neither
> > "abstract" nor "duck" should be in there. My humble proposal is "arraylike".
> > (I know that this term has included things like "list-of-list" before but
> > only in text, not code, as far as I know.)
>
> I agree with your point about avoiding programmer lingo. My first
> draft actually used 'asduckarray', but that's like an in-joke; it
> works fine for us, but it's not really something I want teachers to
> have to explain on day 1...
>
> Array-like is problematic too though, because we still need a way to
> say "thing that can be coerced to an array", which is what array-like
> has been used to mean historically. And with the new type hints stuff,
> it is actually becoming code. E.g. what should the type hints here be:
>
> asabstractarray(a: X) -> Y
>
> Right now "X" is "ArrayLike", but if we make "Y" be "ArrayLike" then
> we'll need to come up with some other name for "X" :-).
>
> Maybe we can call duck arrays "py arrays", since the idea is that they
> implement the standard Python array API (but not necessarily the
> C-level array API)? np.PyArray, np.aspyarray()?
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org