On Sat, Mar 10, 2018 at 1:27 PM, Matthew Rocklin <mrock...@gmail.com> wrote:
> I'm very glad to see this discussion.

me too, but....

> I think that coming up with a single definition of array-like may be
> difficult, and that we might end up wanting to embrace duck typing instead.

exactly -- I think there is a clear line between "uses the numpy memory
layout" and the Python API. But the Python API is pretty darn big, and many
"array_ish" classes implement only part of it, and may even implement some
parts a bit differently. So it's really hard to have "one" definition, except
"Python API exactly like an ndarray" -- and I'm wondering how useful that is.

> It seems to me that different array-like classes will implement different
> mixtures of features. It may be difficult to pin down a single definition
> that includes anything except for the most basic attributes (shape and
> dtype?).

or a minimum set -- but again, how useful??

> Storage objects like h5py (support getitem in a numpy-like way)

Exactly -- I don't know about h5py, but netCDF4 variables support a useful
subset of ndarray while doing "fancy indexing" differently -- so are they
ndarray_ish? -- sorry to coin yet another term :-)

> I can imagine authors of both groups saying that they should qualify as
> array-like because downstream projects that consume them should not convert
> them to numpy arrays in important contexts.

indeed. My solution so far is to define my own duck-typing helper,
"asarraylike", that checks for the actual methods I need:

https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/utilities.py

which has:

    import numpy as np

    # Attributes an object must have to be treated as array-like.
    must_have = ['dtype', 'shape', 'ndim', '__len__', '__getitem__',
                 '__getattribute__']


    def isarraylike(obj):
        """
        tests if obj acts enough like an array to be used in gridded.

        This should catch netCDF4 variables and numpy arrays, at least, etc.

        Note: these won't check if the attributes required actually work right.
        """
        for attr in must_have:
            if not hasattr(obj, attr):
                return False
        return True


    def asarraylike(obj):
        """
        If it satisfies the requirements of pyugrid the object is returned as
        is. If not, then numpy's array() will be called on it.

        :param obj: The object to check if it's like an array
        """
        return obj if isarraylike(obj) else np.array(obj)

It's possible that we could come up with semi-standard "groupings" of
attributes to produce "levels" of compatibility, or maybe not levels, but
independent groupings, so you could specify which groupings you need in a
given instance.

> The name "duck arrays" that we sometimes use doesn't necessarily mean
> "quack like an ndarray" but might actually mean a number of different
> things in different contexts. Making a single class or predicate for duck
> arrays may not be as effective as we want. Instead, it might be that we
> need a number of different protocols like `__array_mat_vec__` or
> `__array_slice__`
> that downstream projects can check instead. I can imagine cases where I
> want to check only "can I use this thing to multiply against arrays" or
> "can I get numpy arrays out of this thing with numpy slicing" rather than
> "is this thing array-like" because I may genuinely not care about most of
> the functionality in a blessed definition of "array-like".

exactly. but maybe we won't know until we try.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion
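
For illustration only, here is a rough sketch of how the "independent
groupings" idea from the message above might look. The grouping names, the
ARRAY_GROUPS table, the supports() helper and the FakeVariable class are all
hypothetical; none of them come from NumPy or from gridded. As with
isarraylike, this only checks that attributes exist, not that they behave
the way ndarray's do.

```python
# Hypothetical groupings of attributes; the names and contents are
# assumptions made for this sketch, not an agreed-upon standard.
ARRAY_GROUPS = {
    "metadata": ("shape", "dtype", "ndim"),
    "indexing": ("__getitem__", "__len__"),
    "arithmetic": ("__add__", "__mul__", "__matmul__"),
}


def supports(obj, *groups):
    """Return True if obj has every attribute in each named grouping.

    Only the presence of the attributes is checked, not their behavior.
    An unknown grouping name raises KeyError.
    """
    return all(
        hasattr(obj, attr)
        for group in groups
        for attr in ARRAY_GROUPS[group]
    )


class FakeVariable:
    """Stand-in for a storage object (e.g. something like a netCDF4 variable)."""

    shape = (3,)
    dtype = "float64"
    ndim = 1

    def __init__(self):
        self._data = [0.0, 1.0, 2.0]

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]


var = FakeVariable()
can_slice = supports(var, "metadata", "indexing")   # has both groupings
can_do_math = supports(var, "arithmetic")           # lacks __add__ etc.
list_has_metadata = supports([1, 2, 3], "metadata") # lists have no shape
```

A caller would then ask only for the groupings it needs in a given place,
e.g. supports(obj, "indexing") before slicing, rather than relying on one
blessed definition of "array-like".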