Hi all,

This email covers two topics related to the 2.0 release:
1. Advice/guidance for downstream library authors and end users
2. Strategy for development work around public API changes that will
break backwards compatibility.

I also just created https://github.com/numpy/numpy/issues/24300 as a
tracking issue that we can post announcements on and anyone can subscribe
to. I imagine many folks will want some way to follow along and be notified
of important changes around the release, but not subscribe to this mailing
list.

Some of the content of that issue overlaps with this email. In case of
questions/comments about the content of that issue, let's discuss it here.
Please keep that tracking issue for announcements, not for technical
discussion.


**Advice for downstream package authors and end users**

1. If you rely on the NumPy C API (e.g. via direct use in C/C++, or via
Cython code that uses NumPy), please add a `numpy<2.0` requirement in your
package's dependency metadata. Rationale: the NumPy C ABI will change in
2.0, so any compiled extension modules that rely on NumPy are likely to
break and will need to be recompiled.

2. If you rely on a large API surface from NumPy's Python API, also
consider adding the same `numpy<2.0` requirement to your metadata.
Rationale: we will do a significant cleanup (see NEP 52), so unless you
only use modern/recommended functions and objects, your code is likely to
require at least some adjustments.

3. Consider cleaning up your code. E.g. remove `from numpy import *` and
any imports of private modules like `numpy.core`. See
https://github.com/numpy/numpy/blob/main/numpy/tests/test_public_api.py#L114-L126
for what we consider public/private. If it's not in the NumPy docs or in
the list of public modules there, don't use it!

4. Plan to do a release of your own packages which depend on `numpy`
shortly after the first NumPy 2.0 release candidate is released (probably
in Dec 2023). Rationale: at that point, you can release packages that will
work with both 2.0 and 1.X, and hence your own end users will not be seeing
much/any disruption (you want `pip install mypackage` to continue working
on the day NumPy 2.0 is released).

5. Consider testing against NumPy nightlies in your own CI. We publish
those at https://anaconda.org/scientific-python-nightly-wheels/numpy, and
have documented that as a stable location at
https://numpy.org/devdocs/dev/depending_on_numpy.html. Rationale: this will
detect potential issues in your code so you can fix them well ahead of the
NumPy 2.0 release.
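
As a rough illustration of the cleanup in point 3, here is a minimal
sketch of a checker that flags `from numpy import *` and imports of
`numpy.core` in a piece of source code. It uses only the standard library
`ast` module. The function name `find_numpy_import_problems` and the
`PRIVATE_PREFIXES` tuple are made up for this example, and the tuple is not
an official list of private modules (see the test file linked above for
that).

```python
import ast

# Illustrative only: module prefixes treated as private in this sketch.
# This is an assumption for the example, not NumPy's official list.
PRIVATE_PREFIXES = ("numpy.core",)


def _is_private(module_name):
    """True if `module_name` is one of the private prefixes or a submodule."""
    return any(
        module_name == prefix or module_name.startswith(prefix + ".")
        for prefix in PRIVATE_PREFIXES
    )


def find_numpy_import_problems(source):
    """Return sorted (line number, message) pairs for risky NumPy imports."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. `import numpy.core.multiarray`
            for alias in node.names:
                if _is_private(alias.name):
                    problems.append((node.lineno, f"private import: {alias.name}"))
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if _is_private(node.module):
                # e.g. `from numpy.core import umath`
                problems.append((node.lineno, f"private import: {node.module}"))
            elif node.module == "numpy":
                for alias in node.names:
                    if alias.name == "*":
                        problems.append((node.lineno, "star import from numpy"))
                    elif _is_private("numpy." + alias.name):
                        # e.g. `from numpy import core`
                        problems.append(
                            (node.lineno, f"private import: numpy.{alias.name}")
                        )
    return sorted(problems)
```

This only looks at import statements, so attribute access such as
`np.core.something` after a plain `import numpy as np` would need a
separate check.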



**Strategy for public API changes for 2.0**

Based on experience over the past weeks with adding deprecations and making
breaking changes, I think it'd be good to articulate a strategy for
Python/C API changes. We are not yet collectively used to the change of
pace that the run-up to a major release is. I think we want to use and
balance these two principles:

1. Make the API and behavior changes that we want to see for 2.0, in a way
that doesn't incur unreasonable amounts of effort or get completely blocked
by backwards compatibility constraints which we'd apply for a regular minor
release.
2. Mitigate the inevitable disturbances for downstream projects and end
users as best as we can.

To start with (2), issuing good guidance (like in the section above and the
tracking issue gh-24300) is one way to help. Another is, when you are
making a change, to use code search tools and either proactively fix
downstream code that will break or at least notify downstream project
authors. The former can be done by sending PRs to at least the largest
projects (SciPy, Pandas, scikit-learn, scikit-image, Matplotlib) when there
are easy changes to make, and the latter by filing issues on the issue
trackers of those projects. Yet another way, in case of mechanical
changes like removal of aliases, would be to provide a sed script that
others can run to automatically update their code as much as possible.
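
To make the "sed script" idea concrete, here is a minimal sketch of the
same kind of mechanical rewrite done in Python with the standard library
`re` module. The `RENAMES` mapping and the function names are assumptions
chosen for illustration; which aliases actually get removed, and what
replaces them, has to come from the release notes rather than from this
example.

```python
import re

# Hypothetical old-name -> new-name mapping, for illustration only.
RENAMES = {
    "np.float_": "np.float64",
    "np.Inf": "np.inf",
}


def build_pattern(renames):
    """Compile one regex that matches any old name as a whole identifier."""
    # Longest names first, so a longer alias wins over a shorter prefix.
    names = sorted(renames, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookbehind/lookahead stop us from touching names that merely
    # contain an alias, such as `np.float_power` or `xnp.Inf`.
    return re.compile(rf"(?<![\w.])(?:{alternatives})(?!\w)")


def rewrite_source(text, renames=RENAMES):
    """Return `text` with every old name replaced by its new name."""
    pattern = build_pattern(renames)
    return pattern.sub(lambda match: renames[match.group(0)], text)
```

Like sed, this works on plain text, so it will also rewrite matches inside
strings and comments; reviewing the resulting diff is still needed.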

For (1), I think it's important to understand that 2.0 is a one-off change
of pace where our regular backwards compatibility policy does not apply. If
particular functions are desirable to touch but widely used, it may of
course be wise to leave them in or only deprecate them; this is a
case-by-case decision. However, it doesn't have to be done like that. Every single
object in the NumPy API is used somewhere, and removing it is going to
affect some users/packages. This is inevitable, and we can't achieve our
2.0 goals if every little niche API change is going to require following
our regular backwards compatibility strategy. The only thing that would do
is to make 2.2 the breaking release.

Many downstream libraries have CI setups that turn deprecation warnings
into hard errors, hence often it doesn't matter whether we deprecate
something in `main`, or remove it straight away. Apply good judgement here
I'd say (how widely is something used, is the replacement trivial or does
it require some thought, etc.).
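
For anyone unfamiliar with that kind of CI setup, here is a minimal sketch
of why a deprecation behaves like a removal there. It uses only the
standard library `warnings` module. `old_alias` is a hypothetical function
standing in for a deprecated API, and the two helper names are made up for
this example.

```python
import warnings


def old_alias():
    """Hypothetical function standing in for a deprecated API."""
    warnings.warn(
        "old_alias is deprecated; use new_name instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


def call_with_strict_warnings(func):
    """Call `func` the way a strict CI job would: DeprecationWarning raises."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return func()


def call_with_recorded_warnings(func):
    """Call `func` normally and return its result plus any warning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func()
    return result, [str(w.message) for w in caught]
```

In the recorded mode the call still returns its value and only emits a
warning; in the strict mode the same call raises `DeprecationWarning`, so
the downstream test run fails just as it would after a removal. Test suites
often get the strict behavior from a filter such as
`-W error::DeprecationWarning` rather than from code like this.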

Here are four recent examples where we had downstream breakage and either
did extra work to revert a change or discussed doing so:
- Adding `np.min`, `np.max` and `np.round` to the `__all__` list: this already
happened in 1.25.0, and we reverted it after discussion in
https://github.com/numpy/numpy/pull/24234 (but left it in the `main` branch
for 2.0).
- Removing `np.cast` in `main`: this broke SciPy and as a result also
JAX/MNE-Python/AstroPy CI. We left the removal in, but discussed in
https://github.com/numpy/numpy/pull/24144 whether or not to revert the
removal.
- Moving PyArray_MIN/PyArray_MAX to a different header file: this broke
SciPy's CI. We left it as is and fixed up the issue in SciPy in
https://github.com/numpy/numpy/pull/24234
- Removal of `np.byte_bounds`: we removed it from the main namespace in
https://github.com/numpy/numpy/pull/23830, and after discussion on that
PR are bringing it back for now (first in its old location in
https://github.com/numpy/numpy/pull/24154, and then moving it to a new
namespace under `np.lib` once we sort that out).

This is going to happen a lot more I'm sure. We have to be careful about
what we do, but also we need downstream library authors to be proactive,
audit their code and use the nightlies we publish. For very niche APIs like
`np.cast` we should not be expected to justify their removal in detail or
to entertain reverting changes made in preparation for 2.0.

Final thought: we should get a 1.26.0b1 beta release out shortly, but will
likely have a couple of weeks before 1.26.0rc1. So any deprecations we want
to put into 1.26.0 can go in until the RC1 release.

Cheers,
Ralf
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/