On Thu, Feb 6, 2020 at 12:20 PM Sebastian Berg <sebast...@sipsolutions.net> wrote:
> > It is less clear how this could work for __array_module__, because
> > __array_module__ and get_array_module() are not generic -- they
> > refer explicitly to a NumPy like module. If we want to extend it to
> > SciPy (for which I agree there are good use-cases), what should that
> > look like?
>
> I suppose the question is here, where should the code reside? For
> SciPy, I agree there is a good reason why you may want to "reverse" the
> implementation. The code to support JAX arrays, should live inside JAX.
>
> One, probably silly, option is to return a "global" namespace, so that:
>
>     np = get_array_module(*arrays).numpy
>

My main concern with a "global namespace" is that it adds boilerplate to
the typical usage of fetching a duck-array version of NumPy.

I think the simplest proposal is to add a "module" argument to both
get_array_module and __array_module__, with a default value of "numpy".
This adds flexibility with minimal additional complexity.

The main question is what the type of arguments for "module" should be:

1. Modules could be specified as strings, e.g., "numpy".
2. Modules could be specified as actual namespaces, e.g., the numpy
   object obtained from "import numpy".

The advantage of (1) is that in theory you could write
np.get_array_module(*arrays, module='scipy.linalg') without the overhead
of actually importing scipy.linalg or without even needing scipy to be
installed, if all the arrays use a different scipy.linalg implementation.
But in practice, this seems a little far-fetched. All alternative
implementations of scipy that I know of (e.g., in JAX or conceivably in
Dask) import the original library.

The main downside of (1) is that it would mean that NumPy's
ndarray.__array_module__ would need to use importlib.import_module() to
dynamically import modules. It also adds a potentially awkward asymmetry
between the "module" and "default" arguments, unless we also switched
"default" to specify modules with strings.
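
To make the two options concrete, here is a minimal sketch that accepts
both forms. Everything in it is hypothetical: DuckArray, module_name and
this simplified get_array_module are illustrations, not the NEP's
implementation, and the stdlib modules math and cmath stand in for
"numpy" and a duck-array library's version of it so that the sketch runs
without third-party packages. Falling back to the requested module when
no array overrides it is likewise an assumption of the sketch.

```python
import importlib
from types import ModuleType


def module_name(module):
    """Accept option (1), a dotted string, or option (2), a module object."""
    if isinstance(module, str):
        return module
    if isinstance(module, ModuleType):
        return module.__name__
    raise TypeError(f"expected str or module, got {type(module).__name__}")


class DuckArray:
    """Hypothetical duck array; not a real library class."""

    # Requested module name -> module holding this library's implementation.
    # Stdlib stand-ins: think "numpy" -> "jax.numpy".
    _overrides = {"math": "cmath"}

    def __array_module__(self, types, module="math"):
        target = self._overrides.get(module_name(module))
        if target is None:
            return NotImplemented
        # The override is imported dynamically, only once it is requested.
        return importlib.import_module(target)


def get_array_module(*arrays, module="math"):
    """Simplified stand-in for the proposed function with a `module` argument."""
    name = module_name(module)
    implementing = [a for a in arrays if hasattr(type(a), "__array_module__")]
    if not implementing:
        # No duck arrays involved: hand back the requested module itself.
        return importlib.import_module(name)
    types = frozenset(type(a) for a in implementing)
    for array in implementing:
        result = array.__array_module__(types, module=module)
        if result is not NotImplemented:
            return result
    raise TypeError(f"no common array module found for {name!r}")
```

With strings, the caller never has to import the requested module; with
namespaces, the caller imports it first and the protocol only reads its
name. The sketch normalizes both to a name, which is one way to avoid
the asymmetry, not the only one.
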
Either way, the "default" argument will probably need to be adjusted so
that by default it matches whatever value is passed into "module",
instead of always defaulting to "numpy".

Any thoughts on which of these options makes most sense? We could also
put off making any changes to the protocol now, but this change seems
pretty safe and appears to have real use-cases (e.g., for sklearn), so I
am inclined to go ahead with it now before finalizing the NEP.

> We have two distinct issues: Where should e.g. SciPy put a generic
> implementation (assuming they to provide implementations that only
> require NumPy-API support to not require overriding)?
> And, also if a library provides generic support, should we define a
> standard of how the context/namespace may be passed in/provided?
>
> sklearn's main namespace is expected to support many array
> objects/types, but it could be nice to pass in an already known
> context/namespace (say scikit-image already found it, and then calls
> scikit-learn internally). A "generic" namespace may even require this
> to infer the correct output array object.
>
>
> Another thing about backward compatibility: What is our vision there
> actually?
> This NEP will *not* give the *end user* the option to opt-in! Here,
> opt-in is really reserved to the *library user* (e.g. sklearn). (I did
> not realize this clearly before)
>
> Thinking about that for a bit now, that seems like the right choice.
> But it also means that the library requires an easy way of giving a
> FutureWarning, to notify the end-user of the upcoming change. The
> end-user will easily be able to convert to a NumPy array to keep the
> old behaviour.
> Once this warning is given (maybe during `get_array_module()`), the
> array module object/context would preferably be passed around,
> hopefully even between libraries. That provides a reasonable way to
> opt-in to the new behaviour without a warning (mainly for library
> users, end-users can silence the warning if they wish so).
>

I don't think NumPy needs to do anything about warnings. It is
straightforward for libraries that want to use get_array_module() to
issue their own warnings before calling get_array_module(), if desired.
Or alternatively, if a library is about to add a new __array_module__
method, it is straightforward to issue a warning inside the new
__array_module__ method before returning the NumPy functions.
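
As a rough sketch of the first approach (the library warns on its own
before dispatching), assuming a hypothetical library_function and a
simplified stand-in for get_array_module. Here math plays the role of
NumPy and cmath the role of the duck array's module, purely so the
example runs without third-party packages; none of these names come
from the NEP.

```python
import cmath
import math
import warnings


class DuckArray:
    """Hypothetical duck array; cmath stands in for its array module."""

    def __array_module__(self, types):
        return cmath


def get_array_module(*arrays, default=math):
    """Simplified stand-in for the proposed function (math plays NumPy)."""
    implementing = [a for a in arrays if hasattr(type(a), "__array_module__")]
    if not implementing:
        return default
    types = frozenset(type(a) for a in implementing)
    for array in implementing:
        module = array.__array_module__(types)
        if module is not NotImplemented:
            return module
    raise TypeError("no common array module found")


def library_function(x):
    """Hypothetical library entry point that is adopting get_array_module()."""
    if hasattr(type(x), "__array_module__"):
        # The library, not NumPy, decides whether and how to warn.
        warnings.warn(
            "library_function() dispatches on __array_module__ for duck "
            "arrays; convert the input to a NumPy array to keep the old "
            "behaviour.",
            FutureWarning,
            stacklevel=2,
        )
    return get_array_module(x)
```

Plain inputs take the default path silently; only duck arrays trigger
the FutureWarning, which end users can silence with the usual warnings
filters.
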
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion