On Wed, Feb 5, 2020 at 12:14 PM Eric Wieser <wieser.eric+nu...@gmail.com> wrote:
> > scipy.linalg is a superset of numpy.linalg > > This isn't completely accurate - numpy.linalg supports almost all > operations* over stacks of matrices via gufuncs, but scipy.linalg does not > appear to. > > Eric > > *: not lstsq due to an ungeneralizable public API > That's true for `qr` as well, I believe. Indeed some functions have diverged slightly, but that's not on purpose, more a lack of time to coordinate. We would like to fix that so everything is in sync and fully API-compatible again. Ralf > On Wed, 5 Feb 2020 at 17:38, Ralf Gommers <ralf.gomm...@gmail.com> wrote: > >> >> >> On Wed, Feb 5, 2020 at 10:01 AM Andreas Mueller <t3k...@gmail.com> wrote: >> >>> A bit late to the NEP 37 party. >>> I just wanted to say that at least from my perspective it seems a great >>> solution that will help sklearn move towards more flexible compute engines. >>> I think one of the biggest issues is array creation (including random >>> arrays), and that's handled quite nicely with NEP 37. >>> >>> There's some discussion on the scikit-learn side here: >>> https://github.com/scikit-learn/scikit-learn/pull/14963 >>> https://github.com/scikit-learn/scikit-learn/issues/11447 >>> >>> Two different groups of people tried to use __array_function__ to >>> delegate to MxNet and CuPy respectively in scikit-learn, and ran into the >>> same issues. >>> >>> There are some remaining issues in sklearn that will not be handled by NEP >>> 37, but they go beyond NumPy in some sense. >>> Just to briefly bring them up: >>> >>> - We use scipy.linalg in many places, and we would need to do separate >>> dispatching to check whether we can use module.linalg instead >>> (that might be an issue for many libraries, but I'm not sure). >>> >> >> That is an issue, and it goes in the opposite direction from what we need - >> scipy.linalg is a superset of numpy.linalg, so we'd like to encourage using >> scipy.
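(Illustrating Eric's gufunc point with a short sketch — this uses only public ``numpy.linalg`` APIs, and the shapes here are just an example:)

```python
import numpy as np

# numpy.linalg routines are gufuncs: the last two axes are treated as
# matrix dimensions, and any leading axes broadcast. Here a "stack" of
# two 2x2 matrices is inverted in a single call.
stack = np.stack([2.0 * np.eye(2), 4.0 * np.eye(2)])  # shape (2, 2, 2)
inverses = np.linalg.inv(stack)                       # also shape (2, 2, 2)
```

Most ``scipy.linalg`` routines, by contrast, expect 2-D inputs, so each matrix in the stack would need a separate call.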
This is something we may want to consider fixing by making the >> dispatch decorator public in numpy and adopting it in scipy. >> >> Cheers, >> Ralf >> >> >> >>> >>> - Some models have several possible optimization algorithms, some of >>> which are pure numpy and some of which are Cython. If someone provides a >>> different array module, >>> we might want to choose an algorithm that is actually supported by that >>> module. While this exact issue is maybe sklearn-specific, a similar issue >>> could appear for most downstream libs that use Cython in some places. >>> Many Cython algorithms could be implemented in pure numpy with a >>> potential slowdown, but once we have NEP 37 there might be a benefit to >>> having a pure NumPy implementation as an alternative code path. >>> >>> >>> Anyway, NEP 37 seems a great step in the right direction and would >>> enable sklearn to actually dispatch in some places. Dispatching just based >>> on __array_function__ has not really seemed feasible so far. >>> >>> Best, >>> Andreas Mueller >>> >>> >>> On 1/6/20 11:29 PM, Stephan Hoyer wrote: >>> >>> I am pleased to present a new NumPy Enhancement Proposal for discussion: >>> "NEP-37: A dispatch protocol for NumPy-like modules." Feedback would be >>> very welcome! >>> >>> The full text follows. The rendered proposal can also be found online at >>> https://numpy.org/neps/nep-0037-array-module.html >>> >>> Best, >>> Stephan Hoyer >>> >>> =================================================== >>> NEP 37 — A dispatch protocol for NumPy-like modules >>> =================================================== >>> >>> :Author: Stephan Hoyer <sho...@google.com> >>> :Author: Hameer Abbasi >>> :Author: Sebastian Berg >>> :Status: Draft >>> :Type: Standards Track >>> :Created: 2019-12-29 >>> >>> Abstract >>> -------- >>> >>> NEP-18's ``__array_function__`` has been a mixed success. Some projects >>> (e.g., >>> dask, CuPy, xarray, sparse, Pint) have enthusiastically adopted it.
>>> Others >>> (e.g., PyTorch, JAX, SciPy) have been more reluctant. Here we propose a >>> new >>> protocol, ``__array_module__``, that we expect could eventually subsume >>> most >>> use-cases for ``__array_function__``. The protocol requires explicit >>> adoption >>> by both users and library authors, which ensures backwards >>> compatibility, and >>> is also significantly simpler than ``__array_function__``, both of which >>> we >>> expect will make it easier to adopt. >>> >>> Why ``__array_function__`` hasn't been enough >>> --------------------------------------------- >>> >>> There are two broad ways in which NEP-18 has fallen short of its goals: >>> >>> 1. **Maintainability concerns**. `__array_function__` has significant >>> implications for libraries that use it: >>> >>> - Projects like `PyTorch >>> <https://github.com/pytorch/pytorch/issues/22402>`_, `JAX >>> <https://github.com/google/jax/issues/1565>`_ and even >>> `scipy.sparse >>> <https://github.com/scipy/scipy/issues/10362>`_ have been >>> reluctant to >>> implement `__array_function__` in part because they are concerned >>> about >>> **breaking existing code**: users expect NumPy functions like >>> ``np.concatenate`` to return NumPy arrays. This is a fundamental >>> limitation of the ``__array_function__`` design, which we chose in order to >>> allow >>> overriding the existing ``numpy`` namespace. >>> - ``__array_function__`` currently requires an "all or nothing" >>> approach to >>> implementing NumPy's API. There is no good pathway for **incremental >>> adoption**, which is particularly problematic for established >>> projects >>> for which adopting ``__array_function__`` would result in breaking >>> changes. >>> - It is no longer possible to use **aliases to NumPy functions** >>> within >>> modules that support overrides. For example, both CuPy and JAX set >>> ``result_type = np.result_type``.
>>> - Implementing **fall-back mechanisms** for unimplemented NumPy >>> functions >>> by using NumPy's implementation is hard to get right (but see the >>> `version from dask <https://github.com/dask/dask/pull/5043>`_), >>> because >>> ``__array_function__`` does not present a consistent interface. >>> Converting all arguments of array type requires recursing into >>> generic >>> arguments of the form ``*args, **kwargs``. >>> >>> 2. **Limitations on what can be overridden.** ``__array_function__`` has >>> some >>> important gaps, most notably array creation and coercion functions: >>> >>> - **Array creation** routines (e.g., ``np.arange`` and those in >>> ``np.random``) need some other mechanism for indicating what type of >>> arrays to create. `NEP 36 <https://github.com/numpy/numpy/pull/14715>`_ >>> proposed adding optional ``like=`` arguments to functions without >>> existing array arguments. However, we still lack any mechanism to >>> override methods on objects, such as those needed by >>> ``np.random.RandomState``. >>> - **Array conversion** can't reuse the existing coercion functions >>> like >>> ``np.asarray``, because ``np.asarray`` sometimes means "convert to >>> an >>> exact ``np.ndarray``" and other times means "convert to something >>> *like* >>> a NumPy array." This led to the `NEP 30 >>> <https://numpy.org/neps/nep-0030-duck-array-protocol.html>`_ >>> proposal for >>> a separate ``np.duckarray`` function, but this still does not >>> resolve how >>> to cast one duck array into a type matching another duck array. >>> >>> ``get_array_module`` and the ``__array_module__`` protocol >>> ---------------------------------------------------------- >>> >>> We propose a new user-facing mechanism for dispatching to a duck-array >>> implementation, ``numpy.get_array_module``.
``get_array_module`` >>> performs the >>> same type resolution as ``__array_function__`` and returns a module with >>> an API >>> promised to match the standard interface of ``numpy`` that can implement >>> operations on all provided array types. >>> >>> The protocol itself is both simpler and more powerful than >>> ``__array_function__``, because it doesn't need to worry about actually >>> implementing functions. We believe it resolves most of the >>> maintainability and >>> functionality limitations of ``__array_function__``. >>> >>> The new protocol is opt-in and explicit, with local control; see >>> :ref:`appendix-design-choices` for discussion of the importance of these >>> design >>> features. >>> >>> The array module contract >>> ========================= >>> >>> Modules returned by ``get_array_module``/``__array_module__`` should >>> make a >>> best effort to implement NumPy's core functionality on new array >>> type(s). >>> Unimplemented functionality should simply be omitted (e.g., accessing an >>> unimplemented function should raise ``AttributeError``). In the future, >>> we >>> anticipate codifying a protocol for requesting restricted subsets of ``numpy``; >>> see :ref:`requesting-restricted-subsets` for more details. >>> >>> How to use ``get_array_module`` >>> =============================== >>> >>> Code that wants to support generic duck arrays should explicitly call >>> ``get_array_module`` to determine an appropriate array module from which >>> to >>> call functions, rather than using the ``numpy`` namespace directly. For >>> example: >>> >>> .. code:: python
>>>
>>>     # calls the appropriate version of np.something for x and y
>>>     module = np.get_array_module(x, y)
>>>     module.something(x, y)
>>>
>>> Both array creation and array conversion are supported, because >>> dispatching is >>> handled by ``get_array_module`` rather than via the types of function >>> arguments.
For example, to use random number generation functions or >>> methods, >>> we can simply pull out the appropriate submodule: >>> >>> .. code:: python
>>>
>>>     def duckarray_add_random(array):
>>>         module = np.get_array_module(array)
>>>         noise = module.random.randn(*array.shape)
>>>         return array + noise
>>>
>>> We can also write the duck-array ``stack`` function from `NEP 30 >>> <https://numpy.org/neps/nep-0030-duck-array-protocol.html>`_, without >>> the need >>> for a new ``np.duckarray`` function: >>> >>> .. code:: python
>>>
>>>     def duckarray_stack(arrays):
>>>         module = np.get_array_module(*arrays)
>>>         arrays = [module.asarray(arr) for arr in arrays]
>>>         shapes = {arr.shape for arr in arrays}
>>>         if len(shapes) != 1:
>>>             raise ValueError('all input arrays must have the same shape')
>>>         expanded_arrays = [arr[module.newaxis, ...] for arr in arrays]
>>>         return module.concatenate(expanded_arrays, axis=0)
>>>
>>> By default, ``get_array_module`` will return the ``numpy`` module if no >>> arguments are arrays. This fall-back can be explicitly controlled by >>> providing >>> the ``default`` keyword-only argument (matching the reference implementation below). It is also possible to indicate >>> that an >>> exception should be raised instead of returning a default array module by >>> setting ``default=None``. >>> >>> How to implement ``__array_module__`` >>> ===================================== >>> >>> Libraries implementing a duck array type that want to support >>> ``get_array_module`` need to implement the corresponding protocol, >>> ``__array_module__``. This new protocol is based on Python's dispatch >>> protocol >>> for arithmetic, and is essentially a simpler version of >>> ``__array_function__``. >>> >>> Only one argument is passed into ``__array_module__``: a Python >>> collection of >>> unique array types passed into ``get_array_module``, i.e., all arguments >>> with
>>> >>> The special method should either return an namespace with an API matching >>> ``numpy``, or ``NotImplemented``, indicating that it does not know how to >>> handle the operation: >>> >>> .. code:: python >>> >>> class MyArray: >>> def __array_module__(self, types): >>> if not all(issubclass(t, MyArray) for t in types): >>> return NotImplemented >>> return my_array_module >>> >>> Returning custom objects from ``__array_module__`` >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> ``my_array_module`` will typically, but need not always, be a Python >>> module. >>> Returning a custom objects (e.g., with functions implemented via >>> ``__getattr__``) may be useful for some advanced use cases. >>> >>> For example, custom objects could allow for partial implementations of >>> duck >>> array modules that fall-back to NumPy (although this is not recommended >>> in >>> general because such fall-back behavior can be error prone): >>> >>> .. code:: python >>> >>> class MyArray: >>> def __array_module__(self, types): >>> if all(issubclass(t, MyArray) for t in types): >>> return ArrayModule() >>> else: >>> return NotImplemented >>> >>> class ArrayModule: >>> def __getattr__(self, name): >>> import base_module >>> return getattr(base_module, name, getattr(numpy, name)) >>> >>> Subclassing from ``numpy.ndarray`` >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> All of the same guidance about well-defined type casting hierarchies from >>> NEP-18 still applies. ``numpy.ndarray`` itself contains a matching >>> implementation of ``__array_module__``, which is convenient for >>> subclasses: >>> >>> .. 
code:: python
>>>
>>>     class ndarray:
>>>         def __array_module__(self, types):
>>>             if all(issubclass(t, ndarray) for t in types):
>>>                 return numpy
>>>             else:
>>>                 return NotImplemented
>>>
>>> NumPy's internal machinery >>> ========================== >>> >>> The type resolution rules of ``get_array_module`` follow the same model >>> as >>> Python and NumPy's existing dispatch protocols: subclasses are called >>> before >>> super-classes, and otherwise left to right. ``__array_module__`` is >>> guaranteed >>> to be called only a single time on each unique type. >>> >>> The actual implementation of ``get_array_module`` will be in C, but should >>> be >>> equivalent to this Python code: >>> >>> .. code:: python
>>>
>>>     def get_array_module(*arrays, default=numpy):
>>>         implementing_arrays, types = _implementing_arrays_and_types(arrays)
>>>         if not implementing_arrays and default is not None:
>>>             return default
>>>         for array in implementing_arrays:
>>>             module = array.__array_module__(types)
>>>             if module is not NotImplemented:
>>>                 return module
>>>         raise TypeError("no common array module found")
>>>
>>>     def _implementing_arrays_and_types(relevant_arrays):
>>>         types = []
>>>         implementing_arrays = []
>>>         for array in relevant_arrays:
>>>             t = type(array)
>>>             if t not in types and hasattr(t, '__array_module__'):
>>>                 types.append(t)
>>>                 # Subclasses before superclasses, otherwise left to right
>>>                 index = len(implementing_arrays)
>>>                 for i, old_array in enumerate(implementing_arrays):
>>>                     if issubclass(t, type(old_array)):
>>>                         index = i
>>>                         break
>>>                 implementing_arrays.insert(index, array)
>>>         return implementing_arrays, types
>>>
>>> Relationship with ``__array_ufunc__`` and ``__array_function__`` >>> ---------------------------------------------------------------- >>> >>> These older protocols have distinct use-cases and should remain >>> =============================================================== >>> >>> ``__array_module__`` is intended to resolve limitations of
``__array_function__``, so it is natural to consider whether it could >>> entirely >>> replace ``__array_function__``. This would offer dual benefits: (1) >>> simplifying >>> the user-story about how to override NumPy and (2) removing the slowdown >>> associated with checking for dispatch when calling every NumPy function. >>> >>> However, ``__array_module__`` and ``__array_function__`` are pretty >>> different >>> from a user perspective: the former requires explicit calls to >>> ``get_array_module``, >>> rather than simply reusing original ``numpy`` functions. This is >>> probably fine >>> for *libraries* that rely on duck-arrays, but may be frustratingly >>> verbose for >>> interactive use. >>> >>> Some of the dispatching use-cases for ``__array_ufunc__`` are also >>> solved by >>> ``__array_module__``, but not all of them. For example, it is still >>> useful to >>> be able to define non-NumPy ufuncs (e.g., from Numba or SciPy) in a >>> generic way >>> on non-NumPy arrays (e.g., with dask.array). >>> >>> Given their existing adoption and distinct use cases, we don't think it >>> makes >>> sense to remove or deprecate ``__array_function__`` and >>> ``__array_ufunc__`` at >>> this time. >>> >>> Mixin classes to implement ``__array_function__`` and ``__array_ufunc__`` >>> ========================================================================= >>> >>> Despite the user-facing differences, ``__array_module__`` and a module >>> implementing NumPy's API still provide sufficient functionality to >>> implement dispatching with the existing duck array protocols. >>> >>> For example, the following mixin classes would provide sensible defaults >>> for >>> these special methods in terms of ``get_array_module`` and >>> ``__array_module__``: >>> >>> ..
code:: python
>>>
>>>     class ArrayUfuncFromModuleMixin:
>>>
>>>         def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
>>>             arrays = inputs + kwargs.get('out', ())
>>>             try:
>>>                 array_module = np.get_array_module(*arrays)
>>>             except TypeError:
>>>                 return NotImplemented
>>>
>>>             try:
>>>                 # Note this may have false positive matches, if ufunc.__name__
>>>                 # matches the name of a ufunc defined by NumPy. Unfortunately
>>>                 # there is no way to determine in which module a ufunc was
>>>                 # defined.
>>>                 new_ufunc = getattr(array_module, ufunc.__name__)
>>>             except AttributeError:
>>>                 return NotImplemented
>>>
>>>             try:
>>>                 callable = getattr(new_ufunc, method)
>>>             except AttributeError:
>>>                 return NotImplemented
>>>
>>>             return callable(*inputs, **kwargs)
>>>
>>>     class ArrayFunctionFromModuleMixin:
>>>
>>>         def __array_function__(self, func, types, args, kwargs):
>>>             array_module = self.__array_module__(types)
>>>             if array_module is NotImplemented:
>>>                 return NotImplemented
>>>
>>>             # Traverse submodules to find the appropriate function
>>>             modules = func.__module__.split('.')
>>>             assert modules[0] == 'numpy'
>>>             module = array_module
>>>             for submodule in modules[1:]:
>>>                 module = getattr(module, submodule, None)
>>>             new_func = getattr(module, func.__name__, None)
>>>             if new_func is None:
>>>                 return NotImplemented
>>>
>>>             return new_func(*args, **kwargs)
>>>
>>> To make it easier to write duck arrays, we could also add these mixin >>> classes >>> into ``numpy.lib.mixins`` (but the examples above may suffice). >>> >>> Alternatives considered >>> ----------------------- >>> >>> Naming >>> ====== >>> >>> We like the name ``__array_module__`` because it mirrors the existing >>> ``__array_function__`` and ``__array_ufunc__`` protocols. Another >>> reasonable >>> choice could be ``__array_namespace__``. >>> >>> It is less clear what the NumPy function that calls this protocol should >>> be >>> called (``get_array_module`` in this proposal). Some possible
Some possible >>> alternatives: >>> ``array_module``, ``common_array_module``, ``resolve_array_module``, >>> ``get_namespace``, ``get_numpy``, ``get_numpylike_module``, >>> ``get_duck_array_module``. >>> >>> .. _requesting-restricted-subsets: >>> >>> Requesting restricted subsets of NumPy's API >>> ============================================ >>> >>> Over time, NumPy has accumulated a very large API surface, with over 600 >>> attributes in the top level ``numpy`` module alone. It is unlikely that >>> any >>> duck array library could or would want to implement all of these >>> functions and >>> classes, because the frequently used subset of NumPy is much smaller. >>> >>> We think it would be useful exercise to define "minimal" subset(s) of >>> NumPy's >>> API, omitting rarely used or non-recommended functionality. For example, >>> minimal NumPy might include ``stack``, but not the other stacking >>> functions >>> ``column_stack``, ``dstack``, ``hstack`` and ``vstack``. This could >>> clearly >>> indicate to duck array authors and users want functionality is core and >>> what >>> functionality they can skip. >>> >>> Support for requesting a restricted subset of NumPy's API would be a >>> natural >>> feature to include in ``get_array_function`` and ``__array_module__``, >>> e.g., >>> >>> .. code:: python >>> >>> # array_module is only guaranteed to contain "minimal" NumPy >>> array_module = np.get_array_module(*arrays, request='minimal') >>> >>> To facilitate testing with NumPy and use with any valid duck array >>> library, >>> NumPy itself would return restricted versions of the ``numpy`` module >>> when >>> ``get_array_module`` is called only on NumPy arrays. Omitted functions >>> would >>> simply not exist. >>> >>> Unfortunately, we have not yet figured out what these restricted subsets >>> should >>> be, so it doesn't make sense to do this yet. 
When/if we do, we could >>> either add >>> new keyword arguments to ``get_array_module`` or add new top level >>> functions, >>> e.g., ``get_minimal_array_module``. We would also need to add either a >>> new >>> protocol patterned off of ``__array_module__`` (e.g., >>> ``__array_module_minimal__``), or could add an optional second argument >>> to >>> ``__array_module__`` (catching errors with ``try``/``except``). >>> >>> A new namespace for implicit dispatch >>> ===================================== >>> >>> Instead of supporting overrides in the main `numpy` namespace with >>> ``__array_function__``, we could create a new opt-in namespace, e.g., >>> ``numpy.api``, with versions of NumPy functions that support >>> dispatching. These >>> overrides would need new opt-in protocols, e.g., >>> ``__array_function_api__`` >>> patterned off of ``__array_function__``. >>> >>> This would resolve the biggest limitations of ``__array_function__`` by >>> being >>> opt-in and would also allow for unambiguously overriding functions like >>> ``asarray``, because ``np.api.asarray`` would always mean "convert an >>> array-like object." But it wouldn't solve all the dispatching needs met >>> by >>> ``__array_module__``, and would leave us supporting a considerably >>> more >>> complex protocol for both array users and implementors. >>> >>> We could potentially implement such a new namespace *via* the >>> ``__array_module__`` protocol. Certainly some users would find this >>> convenient, >>> because it is slightly less boilerplate. But this would leave users with >>> a >>> confusing choice: when should they use `get_array_module` vs. >>> `np.api.something`? Also, we would have to add and maintain a whole new >>> module, >>> which is considerably more expensive than merely adding a function.
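To make the trade-off above concrete, here is a rough sketch of how an ``np.api``-style dispatching namespace could be layered on top of the proposed protocol. ``get_array_module`` below is a simplified stand-in for the NEP's reference implementation (it ignores the subclass-before-superclass ordering), and ``_DispatchingNamespace`` is a hypothetical name, not part of the proposal:

```python
import numpy

def get_array_module(*arrays, default=numpy):
    # Simplified stand-in for the proposed numpy.get_array_module
    # (ignores the subclass-before-superclass ordering from the NEP).
    implementing = [a for a in arrays if hasattr(type(a), '__array_module__')]
    types = {type(a) for a in implementing}
    if not implementing and default is not None:
        return default
    for array in implementing:
        module = array.__array_module__(types)
        if module is not NotImplemented:
            return module
    raise TypeError("no common array module found")

class _DispatchingNamespace:
    # Hypothetical np.api-style namespace: api.func(*args) resolves the
    # array module from the arguments, then forwards to module.func.
    def __getattr__(self, name):
        def wrapper(*args, **kwargs):
            module = get_array_module(*args)
            return getattr(module, name)(*args, **kwargs)
        return wrapper

api = _DispatchingNamespace()
# With plain lists/ndarrays this falls back to numpy:
result = api.concatenate([numpy.ones(2), numpy.zeros(2)])
```

Every attribute access on ``api`` returns a fresh dispatching wrapper, which is exactly the per-call overhead and maintenance burden the paragraph above cautions against.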
>>> Dispatching on both types and arrays instead of only types >>> ========================================================== >>> >>> Instead of supporting dispatch only via unique array types, we could also >>> support dispatch via array objects, e.g., by passing an ``arrays`` >>> argument as >>> part of the ``__array_module__`` protocol. This could potentially be >>> useful for >>> dispatch for arrays with metadata, such as that provided by Dask and Pint, but >>> would >>> impose costs in terms of type safety and complexity. >>> >>> For example, a library that supports arrays on both CPUs and GPUs might >>> decide >>> on which device to create new arrays in functions like ``ones``, >>> based on >>> input arguments: >>> >>> .. code:: python
>>>
>>>     class Array:
>>>         def __array_module__(self, types, arrays):
>>>             useful_arrays = tuple(a for a in arrays if isinstance(a, Array))
>>>             if not useful_arrays:
>>>                 return NotImplemented
>>>             prefer_gpu = any(a.prefer_gpu for a in useful_arrays)
>>>             return ArrayModule(prefer_gpu)
>>>
>>>     class ArrayModule:
>>>         def __init__(self, prefer_gpu):
>>>             self.prefer_gpu = prefer_gpu
>>>
>>>         def __getattr__(self, name):
>>>             import base_module
>>>             base_func = getattr(base_module, name)
>>>             return functools.partial(base_func, prefer_gpu=self.prefer_gpu)
>>>
>>> This might be useful, but it's not clear if we really need it. Pint >>> seems to >>> get along OK without any explicit array creation routines (favoring >>> multiplication by units, e.g., ``np.ones(5) * ureg.m``), and for the >>> most part >>> Dask is also OK with existing ``__array_function__`` style overrides >>> (e.g., >>> favoring ``np.ones_like`` over ``np.ones``). Choosing whether to place >>> an array >>> on the CPU or GPU could be solved by `making array creation lazy >>> <https://github.com/google/jax/pull/1668>`_. >>> >>> ..
_appendix-design-choices: >>> >>> Appendix: design choices for API overrides >>> ------------------------------------------ >>> >>> There is a large range of possible design choices for overriding NumPy's >>> API. >>> Here we discuss three major axes of the design decision that guided our >>> design >>> for ``__array_module__``. >>> >>> Opt-in vs. opt-out for users >>> ============================ >>> >>> The ``__array_ufunc__`` and ``__array_function__`` protocols provide a >>> mechanism for overriding NumPy functions *within NumPy's existing >>> namespace*. >>> This means that users need to explicitly opt-out if they do not want any >>> overridden behavior, e.g., by casting arrays with ``np.asarray()``. >>> >>> In theory, this approach lowers the barrier for adopting these protocols >>> in >>> user code and libraries, because code that uses the standard NumPy >>> namespace is >>> automatically compatible. But in practice, this hasn't worked out. For >>> example, >>> most well-maintained libraries that use NumPy follow the best practice of >>> casting all inputs with ``np.asarray()``, which they would have to >>> explicitly >>> relax to use ``__array_function__``. Our experience has been that making >>> a >>> library compatible with a new duck array type typically requires at >>> least a >>> small amount of work to accommodate differences in the data model and >>> operations >>> that can be implemented efficiently. >>> >>> These opt-out approaches also considerably complicate backwards >>> compatibility >>> for libraries that adopt these protocols, because by opting in as a >>> library >>> they also opt-in their users, whether they expect it or not. For winning >>> over >>> libraries that have been unable to adopt ``__array_function__``, an >>> opt-in >>> approach seems like a must. >>> >>> Explicit vs. 
implicit choice of implementation >>> ============================================== >>> >>> Both ``__array_ufunc__`` and ``__array_function__`` have implicit >>> control over >>> dispatching: the dispatched functions are determined via the appropriate >>> protocols in every function call. This generalizes well to handling many >>> different types of objects, as evidenced by its use for implementing >>> arithmetic >>> operators in Python, but it has two downsides: >>> >>> 1. *Speed*: it imposes additional overhead in every function call, >>> because each >>> function call needs to inspect each of its arguments for overrides. >>> This is >>> why arithmetic on builtin Python numbers is slow. >>> 2. *Readability*: it is no longer immediately evident to readers of >>> code what >>> happens when a function is called, because the function's >>> implementation >>> could be overridden by any of its arguments. >>> >>> In contrast, importing a new library (e.g., ``import dask.array as >>> da``) with >>> an API matching NumPy is entirely explicit. There is no overhead from >>> dispatch >>> or ambiguity about which implementation is being used. >>> >>> Explicit and implicit choice of implementations are not mutually >>> exclusive >>> options. Indeed, most implementations of NumPy API overrides via >>> ``__array_function__`` that we are familiar with (namely, dask, CuPy and >>> sparse, but not Pint) also include an explicit way to use their version >>> of >>> NumPy's API by importing a module directly (``dask.array``, ``cupy`` or >>> ``sparse``, respectively). >>> >>> Local vs. non-local vs. global control >>> ====================================== >>> >>> The final design axis is how users control the choice of API: >>> >>> - **Local control**, as exemplified by multiple dispatch and Python >>> protocols for >>> arithmetic, determines which implementation to use either by checking >>> types >>> or calling methods on the direct arguments of a function.
>>> - **Non-local control** such as `np.errstate >>> <https://docs.scipy.org/doc/numpy/reference/generated/numpy.errstate.html>`_ >>> overrides behavior with global state via function decorators or >>> context-managers. Control is determined hierarchically, via the >>> inner-most >>> context. >>> - **Global control** provides a mechanism for users to set default >>> behavior, >>> either via function calls or configuration files. For example, >>> matplotlib >>> allows setting a global choice of plotting backend. >>> >>> Local control is generally considered a best practice for API design, >>> because >>> control flow is entirely explicit, which makes it the easiest to >>> understand. >>> Non-local and global control are occasionally used, but generally either >>> due to >>> ignorance or a lack of better alternatives. >>> >>> In the case of duck typing for NumPy's public API, we think non-local or >>> global >>> control would be mistakes, mostly because they **don't compose well**. >>> If one >>> library sets/needs one set of overrides and then internally calls a >>> routine >>> that expects another set of overrides, the resulting behavior may be very >>> surprising. Higher order functions are especially problematic, because >>> the >>> context in which functions are evaluated may not be the context in which >>> they >>> are defined. >>> >>> One class of override use cases where we think non-local and global >>> control are >>> appropriate is for choosing a backend system that is guaranteed to have >>> an >>> entirely consistent interface, such as a faster alternative >>> implementation of >>> ``numpy.fft`` on NumPy arrays. However, these are out of scope for the >>> current >>> proposal, which is focused on duck arrays.
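For concreteness, the kind of consistent-interface backend control described above could be sketched as a context manager. All names below are hypothetical (SciPy's ``scipy.fft`` ships a related ``set_backend`` mechanism, though its details differ):

```python
import contextlib
import numpy

_fft_backend = numpy.fft  # module-level default backend

@contextlib.contextmanager
def set_fft_backend(backend):
    # Non-local control: everything inside the `with` block uses `backend`,
    # which is assumed to expose an interface consistent with numpy.fft.
    global _fft_backend
    previous, _fft_backend = _fft_backend, backend
    try:
        yield
    finally:
        _fft_backend = previous

def fft(x):
    # Library code consults the (dynamically scoped) backend, not numpy.fft.
    return _fft_backend.fft(x)

with set_fft_backend(numpy.fft):  # swap in any numpy.fft-compatible module
    out = fft(numpy.ones(4))
```

Because the backend is guaranteed API-compatible, swapping it cannot change semantics, only speed, which is why this is the one case where non-local control composes safely.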
>>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion@python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion