Re: [Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

2020-04-09 Thread Ralf Gommers
On Wed, Mar 4, 2020 at 1:22 AM Sebastian Berg 
wrote:

> On Sun, 2020-02-23 at 22:44 -0800, Stephan Hoyer wrote:
> > On Sun, Feb 23, 2020 at 3:59 PM Ralf Gommers 
> > wrote:
> > >
> > > On Sun, Feb 23, 2020 at 3:31 PM Stephan Hoyer 
> > > wrote:
> > > > On Thu, Feb 6, 2020 at 12:20 PM Sebastian Berg <
> > > > sebast...@sipsolutions.net> wrote:
> 
> > > >
> > > > I don't think NumPy needs to do anything about warnings. It is
> > > > straightforward for libraries that want to use
> > > > get_array_module() to issue their own warnings before calling
> > > > get_array_module(), if desired.
> > > >
> > > > Or alternatively, if a library is about to add a new
> > > > __array_module__ method, it is straightforward to issue a warning
> > > > inside the new __array_module__ method before returning the NumPy
> > > > functions.
> > > >
> > >
> > > I don't think this is quite enough. Sebastian points out a fairly
> > > important issue. One of the main rationales for the whole NEP, and
> > > the argument in multiple places (
> > >
> https://numpy.org/neps/nep-0037-array-module.html#opt-in-vs-opt-out-for-users
> > > ) is that it's now opt-in while __array_function__ was opt-out.
> > > This isn't really true - the problem is simply *moved*, from the
> > > duck array libraries to the array-consuming libraries. The end user
> > > will still see the backwards incompatible change, with no way to
> > > turn it off. It will be easier with __array_module__ to warn users,
> > > but this should be expanded on in the NEP.
> > >
> >
> > Ralf, thanks for sharing your thoughts.
>

Sorry, this never made it back to the top of my todo list.

>
> > I'm not quite sure I understand the concerns about backwards
> > incompatibility:
> > 1. The intention is that implementing a __array_module__ method
> > should be backwards compatible with all current uses of NumPy. This
> > satisfies backwards compatibility concerns for an array-implementing
> > library like JAX.
> > 2. In contrast, calling get_array_module() offers no guarantees about
> > backwards compatibility. This seems nearly impossible, because the
> > entire point of the protocol is to make it possible to opt-in to new
> > behavior.


Indeed, it is nearly impossible. Except if there's a context manager or
some other control mechanism exposed to the end user. Hence that should be
a part of the design I think. Otherwise you're just solving something for
the JAX devs, but not for the scikit-learn/scipy/etc devs who will then
each have to invent their own wheel for backwards compat.
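
A minimal sketch of the kind of end-user control mechanism meant here,
assuming a context-local flag that an array-consuming library consults
before it dispatches. The names `allow_dispatch`, `dispatch_allowed` and
`library_function` are hypothetical and not part of NEP 37 or NumPy, and
plain strings stand in for real array modules:

```python
import contextlib
import contextvars

# Hypothetical context-local switch; the default (False) keeps legacy behaviour.
_dispatch_allowed = contextvars.ContextVar("dispatch_allowed", default=False)


@contextlib.contextmanager
def allow_dispatch(enabled=True):
    """Let the end user opt in (or back out) for a block of code."""
    token = _dispatch_allowed.set(enabled)
    try:
        yield
    finally:
        # Restore whatever was active before, so nested blocks compose.
        _dispatch_allowed.reset(token)


def dispatch_allowed():
    """Query function a library could call instead of inventing its own flag."""
    return _dispatch_allowed.get()


def library_function(array_module_name):
    """Stand-in for a scikit-learn-style function choosing a backend."""
    if dispatch_allowed():
        return array_module_name  # new behaviour: use the duck array's module
    return "numpy"                # legacy behaviour: coerce to NumPy
```

Because the flag lives in a `contextvars.ContextVar`, setting it in one
thread does not leak into other threads, which is one way such a switch
could stay out of the way of code that never opted in.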

> > So backwards compatibility isn't solved for Scikit-Learn
> > switching to use get_array_module(), and after Scikit-Learn does so,
> > adding __array_module__ to new types of arrays could potentially have
> > backwards incompatible consequences for Scikit-Learn (unless sklearn
> > uses default=None).
> >
> > Are you suggesting just adding something like what I'm writing here
> > into the NEP? Perhaps along with advice to consider issuing warnings
> > inside __array_module__  and falling back to legacy behavior when
> > first implementing it on a new type?
>
> I think that should be sufficient, personally. We could mention that
> scikit-learn will likely use a context manager to do this.
> We can also think about providing a global default (which sklearn can
> use as its own default if they wish so, but that is reserved to the
> end-user).
>

+1

> That would be a small amendment, and I think we could add it even after
> accepting the NEP as it is.
>
> >
> > We could also potentially make a few changes to make backwards
> > compatibility even easier, by making the protocol less aggressive
> > about assuming that NumPy is a safe fallback. Some non-exclusive
> > options:
> > a. We could switch the default value of "default" on
> > get_array_module() to None, so an exception is raised if nothing
> > implements __array_module__.
>
> I am not sure that I feel switching the default to None makes much of a
> difference to be honest. Unless we use it to signal a super strict mode
> similar to b. below.
>

I agree, that doesn't make a difference.


> > b. We could include *all* argument types in "types", not just types
> > that implement __array_module__. NumPy's ndarray.__array_module__
> > could then recognize and refuse to return an implementation if there
> > are other arguments that might implement __array_module__ in the
> > future (e.g., anything outside the standard library?).
>
> That is a good point, anything that is not NumPy recognized could
> simply be rejected. It does mean that you have to call
> `module.asarray()` manually more often though.
> For `list`, it could also make sense to just add np.ndarray to types.
>
> If we want to be conservative, maybe we could also just error before
> calling `__array_module__`.  Whenever there is something that we do not
> know how to interpret force the user to clarify?
>
> >
> > The downside of making either of these choices is that it would
> > potentially make get_

Re: [Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

2020-04-09 Thread Ralf Gommers
On Thu, Apr 9, 2020 at 12:02 AM Sebastian Berg 
wrote:

> On Wed, 2020-04-08 at 17:04 -0400, Andreas Mueller wrote:
> > Hey all.
> > Is there any update on this? Is there any input we can provide as
> > users?
> > I'm not entirely sure where you are in the decision making process
> > right
> > now :)
> >
>
> Hey,
>
> thanks for the ping. Things are a bit stuck right now. I think what we
> need is some clarity on the implications and alternatives.
> I was thinking about organizing a small conference call with the main
> people interested in the next weeks.
>
> There are also still some alternatives to this NEP in the race, and we
> may need to clarify which ones are actually still in the race...
>
>
> Maybe to see some of the possible sticking points:
>
> 1. What do we do about SciPy, have it under this umbrella? And how
> would we want to design that.
>

Current feeling: best to ignore it for now. It's quite a bit of work to fix
API incompatibilities for linalg that no one currently seems interested in
tackling. We can revisit once that's done.


> 2. Context managers have some composition issues, maybe less so if they
> are in the downstream package. Or should we have global defaults as
> well?
>

+1 for adding this right next to get_array_module().


> 3. How do we ensure safe transitions for users as much as possible.
>    * If you use this, can functions suddenly return a different type
>      in the future?
>    * Should we force you to cast to NumPy arrays in a transition
>      period, or force you to somehow silence a transition warning?
>
> 4. Is there a serious push to have a "reduced" API or even a versioned
> API?
>

There is, it'll take a few months.

>
> But I am probably forgetting some other things.
>
>
> In my personal opinion, I think NEP 37 with minor modifications is
> still the best duck in the race. I feel we should be able to find a
> reasonable solution for SciPy.
> Point 2. about Context managers may be true, but this is much smaller
> in scope from the ones uarray proposed IIRC, and I could not figure out
> major scoping issues with it yet (the sklearn draft).
>
> About the safe transition, that may be the stickiest point. But e.g. if
> you enable `get_array_module` sklearn could limit a certain function to
> error out if it finds something other than NumPy?
> The main problem is how to do opt-in into future behaviour. A context
> manager can do that, although the danger is that someone just uses that
> everywhere...
>
> On the reduced/versioned API front, I would hope that we can defer that
> as a semi-orthogonal issue, basically saying that for now you have to
> provide a NumPy API that faithfully reproduces whatever NumPy version
> is installed on the system.
>

I think it would be nice to have a separate NEP 37 implementation outside
of NumPy to play with. Unlike __array_function__, I don't think it has to
go into NumPy immediately. This avoids the whole "experimental API" issue,
it would be quite useful to test this with, e.g., CuPy + scikit-learn
without being stuck with any decisions in a released NumPy version. Also
makes switching on/off very easy for users, just (don't) `pip install
numpy-array-module`.
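
A rough sketch of how a consuming library could treat such a separate
package as optional. The import name `numpy_array_module` and the
attribute `get_array_module` are assumptions made for illustration; no
such published API is implied here:

```python
import importlib
import importlib.util


def load_get_array_module(package="numpy_array_module"):
    """Return the package's get_array_module if available, else None.

    Both the import name and the attribute name are assumptions.
    """
    if importlib.util.find_spec(package) is None:
        return None  # not installed: the library keeps its NumPy-only path
    module = importlib.import_module(package)
    return getattr(module, "get_array_module", None)


get_array_module = load_get_array_module()
# Dispatch is simply off unless the user installed the optional package.
DISPATCH_AVAILABLE = get_array_module is not None
```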

Cheers,
Ralf


> Cheers,
>
> Sebastian
>
>
> > Cheers,
> > Andy
> >
> > On 3/3/20 6:34 PM, Sebastian Berg wrote:
> > > On Fri, 2020-02-28 at 11:28 -0500, Allan Haldane wrote:
> > > > On 2/23/20 6:59 PM, Ralf Gommers wrote:
> > > > > One of the main rationales for the whole NEP, and the argument
> > > > > in
> > > > > multiple places
> > > > > (
> > > > >
> https://numpy.org/neps/nep-0037-array-module.html#opt-in-vs-opt-out-for-users
> > > > > )
> > > > > is that it's now opt-in while __array_function__ was opt-out.
> > > > > This
> > > > > isn't
> > > > > really true - the problem is simply *moved*, from the duck
> > > > > array
> > > > > libraries to the array-consuming libraries. The end user will
> > > > > still
> > > > > see
> > > > > the backwards incompatible change, with no way to turn it off.
> > > > > It
> > > > > will
> > > > > be easier with __array_module__ to warn users, but this should
> > > > > be
> > > > > expanded on in the NEP.
> > > > Might it be possible to flip this NEP back to opt-out while
> > > > keeping
> > > > the
> > > > nice simplifications and configurable array-creation routines,
> > > > relative
> > > > to __array_function__?
> > > >
> > > > That is, what if we define two modules, "numpy" and
> > > > "numpy_strict".
> > > > "numpy_strict" would raise an exception on duck-arrays defining
> > > > __array_module__ (as numpy currently does). "numpy" would be a
> > > > wrapper
> > > > around "numpy_strict" that decorates all numpy methods with a
> > > > call to
> > > > "get_array_module(inputs).func(inputs)".
> > > This would be possible, but I think we strongly leaned against the
> > > idea. Basically, if you have to opt-out, from a library perspective
> > > there may be `np.asarray` calls, which for example later call into
> > > C
> > > and e

Re: [Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

2020-04-09 Thread Sebastian Berg
On Thu, 2020-04-09 at 13:52 +0200, Ralf Gommers wrote:
> On Wed, Mar 4, 2020 at 1:22 AM Sebastian Berg <
> sebast...@sipsolutions.net>
> wrote:
> 
> > On Sun, 2020-02-23 at 22:44 -0800, Stephan Hoyer wrote:
> > > On Sun, Feb 23, 2020 at 3:59 PM Ralf Gommers <
> > > ralf.gomm...@gmail.com>
> > > wrote:
> > > > > On Sun, Feb 23, 2020 at 3:31 PM Stephan Hoyer
> > > > wrote:
> > > > > On Thu, Feb 6, 2020 at 12:20 PM Sebastian Berg <
> > > > > sebast...@sipsolutions.net> wrote:
> > 
> > > > > I don't think NumPy needs to do anything about warnings. It
> > > > > is
> > > > > > straightforward for libraries that want to use
> > > > > get_array_module() to issue their own warnings before calling
> > > > > get_array_module(), if desired.
> > > > > 
> > > > > Or alternatively, if a library is about to add a new
> > > > > __array_module__ method, it is straightforward to issue a
> > > > > warning
> > > > > inside the new __array_module__ method before returning the
> > > > > NumPy
> > > > > functions.
> > > > > 
> > > > 
> > > > I don't think this is quite enough. Sebastian points out a
> > > > fairly
> > > > important issue. One of the main rationales for the whole NEP,
> > > > and
> > > > the argument in multiple places (
> > > > 
> > https://numpy.org/neps/nep-0037-array-module.html#opt-in-vs-opt-out-for-users
> > > > ) is that it's now opt-in while __array_function__ was opt-out.
> > > > This isn't really true - the problem is simply *moved*, from
> > > > the
> > > > duck array libraries to the array-consuming libraries. The end
> > > > user
> > > > will still see the backwards incompatible change, with no way
> > > > to
> > > > turn it off. It will be easier with __array_module__ to warn
> > > > users,
> > > > but this should be expanded on in the NEP.
> > > > 
> > > 
> > > Ralf, thanks for sharing your thoughts.
> 
> Sorry, this never made it back to the top of my todo list.
> 
> > > > I'm not quite sure I understand the concerns about backwards
> > > incompatibility:
> > > 1. The intention is that implementing a __array_module__ method
> > > should be backwards compatible with all current uses of NumPy.
> > > This
> > > satisfies backwards compatibility concerns for an array-
> > > implementing
> > > library like JAX.
> > > 2. In contrast, calling get_array_module() offers no guarantees
> > > about
> > > backwards compatibility. This seems nearly impossible, because
> > > the
> > > entire point of the protocol is to make it possible to opt-in to
> > > new
> > > behavior.
> 
> Indeed, it is nearly impossible. Except if there's a context manager
> or
> some other control mechanism exposed to the end user. Hence that
> should be
> a part of the design I think. Otherwise you're just solving something
> for
> the JAX devs, but not for the scikit-learn/scipy/etc devs who will
> then
> each have to invent their own wheel for backwards compat.
> 
> > > So backwards compatibility isn't solved for Scikit-Learn
> > > switching to use get_array_module(), and after Scikit-Learn does
> > > so,
> > > adding __array_module__ to new types of arrays could potentially
> > > have
> > > backwards incompatible consequences for Scikit-Learn (unless
> > > sklearn
> > > uses default=None).
> > > 
> > > Are you suggesting just adding something like what I'm writing
> > > here
> > > into the NEP? Perhaps along with advice to consider issuing
> > > warnings
> > > inside __array_module__  and falling back to legacy behavior when
> > > first implementing it on a new type?
> > 
> > I think that should be sufficient, personally. We could mention
> > that
> > scikit-learn will likely use a context manager to do this.
> > We can also think about providing a global default (which sklearn
> > can
> > use as its own default if they wish so, but that is reserved to the
> > end-user).
> > 
> 
> +1
> 
> > That would be a small amendment, and I think we could add it even
> > after
> > accepting the NEP as it is.
> > 
> > > We could also potentially make a few changes to make backwards
> > > compatibility even easier, by making the protocol less aggressive
> > > about assuming that NumPy is a safe fallback. Some non-exclusive
> > > options:
> > > a. We could switch the default value of "default" on
> > > get_array_module() to None, so an exception is raised if nothing
> > > implements __array_module__.
> > 
> > I am not sure that I feel switching the default to None makes much
> > of a
> > difference to be honest. Unless we use it to signal a super strict
> > mode
> > similar to b. below.
> > 
> 
> I agree, that doesn't make a difference.
> 
> 
> > > > b. We could include *all* argument types in "types", not just
> > > types
> > > that implement __array_module__. NumPy's ndarray.__array_module__
> > > could then recognize and refuse to return an implementation if
> > > there
> > > are other arguments that might implement __array_module__ in the
> > > future (e.g., anything outside the standard library?).
> > 
> > That is a good point, anything that is not Nu

Re: [Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

2020-04-09 Thread Sebastian Berg
On Thu, 2020-04-09 at 13:52 +0200, Ralf Gommers wrote:
> On Thu, Apr 9, 2020 at 12:02 AM Sebastian Berg <
> sebast...@sipsolutions.net>
> wrote:
> 

> > 
> 
> I think it would be nice to have a separate NEP 37 implementation
> outside
> of NumPy to play with. Unlike __array_function__, I don't think it
> has to
> go into NumPy immediately. This avoids the whole "experimental API"
> issue,

Fair enough, I have created a hopefully working start here:

https://github.com/seberg/numpy_dispatch

(this is not tested much at all yet, so it could be very buggy).

There are a couple of additional features that I added.

1. A global opt-in (it is impossible to opt-out once opted in!)
2. A local opt-in (to guarantee opt-in if global flag is not set)
3. I added features to allow transitioning::

  get_array_module(*arrays, modules="numpy",
                   future_modules=("dask.array", "cupy"), fallback="warn")

   Will give FutureWarning/DeprecationWarning where necessary, in the
   above "numpy" is supported, dask and cupy are supported but not
   enabled by default. `None` works to say "all modules".
   Once the transition is done, just move dask and cupy into `modules`
   and remove `fallback=None`.
4. If there are FutureWarnings/DeprecationWarnings the user needs to be
   able to opt-in to future behaviour. Opting out can be done by
   casting inputs. Opting-in is done using::

  with future_dispatch_behavior():
      call_library_function()

Obviously, we may not want these features, but I was curious how we
could provide the tools to allow clean transitions.
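
To make the transition logic above concrete, here is one possible
reading of it as a self-contained sketch. This is not the code in the
numpy_dispatch repository: `choose_module` is a hypothetical helper, the
`requested` argument stands in for whatever the arrays'
`__array_module__` would return, and falling back to "numpy" while
warning is an assumption about what `fallback="warn"` means:

```python
import contextlib
import contextvars
import warnings

# Context-local opt-in flag (illustrative re-implementation).
_future = contextvars.ContextVar("future_dispatch", default=False)


@contextlib.contextmanager
def future_dispatch_behavior():
    """User-facing opt-in to the behaviour announced by the warning."""
    token = _future.set(True)
    try:
        yield
    finally:
        _future.reset(token)


def choose_module(requested, modules=("numpy",), future_modules=(),
                  fallback="warn"):
    """Pick the module name a library function should use."""
    if requested in modules:
        return requested  # already fully supported
    if requested in future_modules:
        if _future.get():
            return requested  # the user opted in to the future behaviour
        if fallback == "warn":
            warnings.warn(
                f"{requested!r} will be used here in the future; "
                "falling back to 'numpy' for now",
                FutureWarning,
                stacklevel=2,
            )
            return "numpy"
    raise TypeError(f"array module {requested!r} is not supported")
```

Under these assumptions, finishing the transition amounts to moving the
names from `future_modules` into `modules`, after which the warning
branch is never reached.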

Both context managers should be thread-safe, but I did not test that.

The best try would probably be cupy and sklearn again, so I will give a
ping on the sklearn PR. To make that easier, I tried to hack a bit of a
"util" to allow testing (please scroll down on the readme on github).

Best,

Sebastian



> it would be quite useful to test this with, e.g., CuPy + scikit-learn
> without being stuck with any decisions in a released NumPy version.
> Also
> makes switching on/off very easy for users, just (don't) `pip install
> numpy-array-module`.
> 
> Cheers,
> Ralf



___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

2020-04-09 Thread Sebastian Berg
On Thu, 2020-04-09 at 22:11 -0500, Sebastian Berg wrote:
> On Thu, 2020-04-09 at 13:52 +0200, Ralf Gommers wrote:
> > On Thu, Apr 9, 2020 at 12:02 AM Sebastian Berg <
> > sebast...@sipsolutions.net>
> > wrote:
> > 
> 
> > 
> > I think it would be nice to have a separate NEP 37 implementation
> > outside
> > of NumPy to play with. Unlike __array_function__, I don't think it
> > has to
> > go into NumPy immediately. This avoids the whole "experimental API"
> > issue,
> 
> Fair enough, I have created a hopefully working start here:
> 
> https://github.com/seberg/numpy_dispatch
> 
> (this is not tested much at all yet, so it could be very buggy).
> 
> There are a couple of additional features that I added.
> 
> 1. A global opt-in (it is impossible to opt-out once opted in!)
> 2. A local opt-in (to guarantee opt-in if global flag is not set)
> 3. I added features to allow transitioning::
> 
>   get_array_module(*arrays, modules="numpy",
>                    future_modules=("dask.array", "cupy"), fallback="warn")


There is no immediate need to put modules, future_modules and
fallback in there. The main convenience it gives is that we can more
easily provide the user with a context manager to opt in to the new
behaviour.
Without that, libraries will have to do these checks themselves, which
is not difficult. But if we wish to provide a context manager to opt all
of that in, the library will need additional API to query our context
manager state. Or every library needs its own solution, which does
not seem desirable (although it means you cannot opt-in internal
functions accidentally to newer behaviour).

- Sebastian

> 
>    Will give FutureWarning/DeprecationWarning where necessary, in the
>    above "numpy" is supported, dask and cupy are supported but not
>    enabled by default. `None` works to say "all modules".
>    Once the transition is done, just move dask and cupy into
>    `modules` and remove `fallback=None`.
> 4. If there are FutureWarnings/DeprecationWarnings the user needs to
>    be able to opt-in to future behaviour. Opting out can be done by
>    casting inputs. Opting-in is done using::
> 
>   with future_dispatch_behavior():
>       call_library_function()
> 
> Obviously, we may not want these features, but I was curious how we
> could provide the tools to allow clean transitions.
> 
> Both context managers should be thread-safe, but I did not test that.
> 
> The best try would probably be cupy and sklearn again, so I will give
> a
> ping on the sklearn PR. To make that easier, I tried to hack a bit of
> a
> "util" to allow testing (please scroll down on the readme on github).
> 
> Best,
> 
> Sebastian
> 
> 
> 
> > it would be quite useful to test this with, e.g., CuPy + scikit-
> > learn
> > without being stuck with any decisions in a released NumPy version.
> > Also
> > makes switching on/off very easy for users, just (don't) `pip
> > install
> > numpy-array-module`.
> > 
> > Cheers,
> > Ralf
> 


