[Numpy-discussion] NEP 50: Promotion rules for Python scalars
Hi all,

I would like to share the first formal draft of NEP 50: Promotion rules for Python scalars with everyone. The full text can be found here:

    https://numpy.org/neps/nep-0050-scalar-promotion.html

NEP 50 is an attempt to remove value-based casting/promotion. We wish to replace it with clearer rules for the resulting dtype when mixing NumPy arrays and Python scalars. As a brief example, the proposal allows the following (unchanged):

    >>> np.array([1, 2, 3], dtype=np.int8) + 100
    np.array([101, 102, 103], dtype=np.int8)

While clearing up confusion caused by the value-inspecting behavior that we see sometimes, such as:

    >>> np.array([1, 2, 3], dtype=np.int8) + 300
    np.array([301, 302, 303], dtype=np.int16)  # note the int16

Where 300 is too large to fit an ``int8``. As well as removing the special behavior of 0-D arrays or NumPy scalars:

    >>> res = np.array(1, dtype=np.int8) + 100
    >>> res.dtype
    dtype('int64')

This is the continuation of a long discussion (see the "Discussion" section), including the poll I once posted:

    https://discuss.scientific-python.org/t/poll-future-numpy-behavior-when-mixing-arrays-numpy-scalars-and-python-scalars/202

I would be happy for any feedback, be it just editorial or fundamental discussion. There are many alternatives which I have tried to capture in the NEP. So let's discuss here, or on discuss:

    https://discuss.scientific-python.org/t/nep-50-promotion-rules-for-python-scalars/280

For smaller edits, don't hesitate to open a NumPy PR, or propose edits on my branch (you can use the edit button to create a PR):

    https://github.com/seberg/numpy/blob/nep50/doc/neps/nep-0050-scalar-promotion.rst

An important part of moving forward will be assessing the real world impact. To start that process, I have created a branch as a draft PR (at this time):

    https://github.com/numpy/numpy/pull/21626

It is missing some parts, but should allow preliminary testing.
The main missing part is that the integer warnings and errors are less strict than proposed in the NEP.

It would be invaluable to get a better idea of the extent to which existing code, especially end-user code, is affected by the proposed changes.

Thanks in advance for any input! This is a big, complicated proposal, but finding a way forward will hopefully clear up a source of confusion and inconsistencies that makes life harder for both maintainers and users.

Cheers,

Sebastian

___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
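The three cases from the announcement can be checked against whatever NumPy is installed. The following is only a minimal sketch, not part of the NEP: the variable names are made up for illustration, and which branch is taken depends on whether the installed NumPy uses value-based promotion or the NEP 50 rules (where an out-of-range Python integer is expected to raise instead of upcasting).

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int8)

# Case 1: the Python int fits into int8. The result dtype is int8 both
# with value-based promotion and under the NEP 50 proposal.
small = arr + 100
print(small, small.dtype)

# Case 2: the Python int does not fit into int8. Value-based promotion
# upcasts to int16. Under NEP 50 the Python int does not influence the
# dtype, so the operation is expected to fail rather than upcast.
try:
    large_dtype = (arr + 300).dtype
except OverflowError as exc:
    large_dtype = None
    print("OverflowError:", exc)
print(large_dtype)

# Case 3: a 0-D array combined with a Python int. With value-based
# promotion the 0-D array is treated like a scalar and the result is the
# default integer; under NEP 50 the int8 dtype is kept.
zero_d_dtype = (np.array(1, dtype=np.int8) + 100).dtype
print(zero_d_dtype)
```

Running the same script before and after switching NumPy versions (or branches) is a quick way to see which of these results change.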
[Numpy-discussion] Re: NEP 50: Promotion rules for Python scalars
On Wed, Jun 1, 2022 at 5:51 PM Sebastian Berg wrote:

> An important part of moving forward will be assessing the real world
> impact. To start that process, I have created a branch as a draft PR
> (at this time):
>
> https://github.com/numpy/numpy/pull/21626
>
> It is missing some parts, but should allow preliminary testing. The
> main missing part is that the integer warnings and errors are less
> strict than proposed in the NEP.
> It would be invaluable to get a better idea to what extent existing
> code, especially end-user code, is affected by the proposed changes.

Thanks Sebastian! For testing, did you already try with some of the usual suspects, or would it be helpful to use this branch on SciPy, Pandas, etc.?

Also, do you expect it's useful to do platform-specific testing? I can imagine there's some Windows-specific behavior; adapting a SciPy CI job to work from your branch is easy to do if that would be helpful.

Cheers,
Ralf
[Numpy-discussion] Re: NEP 50: Promotion rules for Python scalars
On Wed, 2022-06-01 at 20:23 +0200, Ralf Gommers wrote:
> Thanks Sebastian! For testing, did you already try with some of the
> usual suspects, or would it be helpful to use this branch on SciPy,
> Pandas, etc.?
> Also, do you expect it's useful to do platform-specific testing? I can
> imagine there's some Windows-specific behavior; adapting a SciPy CI job
> to work from your branch is easy to do if that would be helpful.

Yes, I have for SciPy. As noted in the PR, the failures look "mostly harmless" at first sight (not that it won't mean quite a bit of work, but I think it is manageable work). I would be more scared if there is a need to systematically vet all places where behavior (may have) changed. For example, in NumPy:

    np.median(np.float32([1, 2, 3, 4]))

did return a float64 before and will now return a float32. I assume because somewhere we write: `(np.float64(3) + np.float32(2)) / 2`.

There are a few places that I suspect just need an updated test or a bit of thought. And at least one or two that need to use the correct integer types (IIRC `scipy.io.idl` seems to be using some low precision or unsigned integer type internally and that leads to failures).

I thought pandas would fail much harder, but it seems to have had only 150-200 failures (many probably clustered).
One larger annoyance there is that one parametrized test runs into an infinite recursion, which makes it run excruciatingly slowly.

In any case, I believe that it would be far more helpful if those more familiar with the libraries have a look at the failures. Not only do they know better how much impact they have; it also helps to get a feel for how painful the transition will be.

One problem I see is that I still expect that libraries are not the main issue. Using a SciPy integrator may end up with a float32 rather than a float64 result. In the SciPy test suite, that probably just means tweaking the test a bit. But that same change will also break someone's script out there, somewhere. So the people really affected (who may occasionally get less precise/breaking results) are likely the end-users rather than the libraries.

Cheers,

Sebastian
[Numpy-discussion] Small API addition: unique now has `equal_nan=False` keyword argument
Hi all,

this has been discussed before, so this is mainly a brief announcement that we merged a PR to add the `equal_nan` kwarg to `np.unique`. If set to False, multiple `NaN`s will be reported multiple times (which was the behavior prior to NumPy 1.21).

The keyword argument name was chosen to match that of `np.array_equal` and the testing functions.

Since the actual release is a bit off and this has no backwards compatibility concerns, there is a chance that this may be backported into NumPy 1.23 (this is at Chuck's discretion as the release manager).

Cheers,

Sebastian
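As a small illustration of the new keyword, the sketch below compares the default behavior with `equal_nan=False`. It assumes a NumPy version that already ships the `equal_nan` argument; the array contents and variable names are made up for the example.

```python
import numpy as np

values = np.array([1.0, np.nan, 2.0, np.nan, 1.0])

# Default (equal_nan=True): all NaNs are collapsed into a single entry.
collapsed = np.unique(values)
print(collapsed)

# equal_nan=False: every NaN is reported separately, as was the case
# before NumPy 1.21.
separate = np.unique(values, equal_nan=False)
print(separate)
```

The non-NaN values are deduplicated in both calls; only the number of NaN entries in the result differs.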
[Numpy-discussion] API Addition: Polynomial classes now have a "symbol" denoting the variable
Hi all,

just another small API announcement, that I merged:

    https://github.com/numpy/numpy/pull/16154

which adds `symbol="x"` to the polynomial classes. Ross' more detailed explanation is copied below.

Cheers,

Sebastian


New attribute ``symbol`` added to polynomial classes

The polynomial classes in the ``numpy.polynomial`` package have a new ``symbol`` attribute which is used to represent the indeterminate of the polynomial. This can be used to change the value of the variable when printing::

    >>> P_y = np.polynomial.Polynomial([1, 0, -1], symbol="y")
    >>> print(P_y)
    1.0 + 0.0·y¹ - 1.0·y²

Note that the polynomial classes only support 1D polynomials, so operations that involve polynomials with different symbols are disallowed when the result would be multivariate::

    >>> P = np.polynomial.Polynomial([1, -1])  # default symbol is "x"
    >>> P_z = np.polynomial.Polynomial([1, 1], symbol="z")
    >>> P * P_z
    Traceback (most recent call last):
       ...
    ValueError: Polynomial symbols differ

The symbol can be any valid Python identifier. The default is ``symbol="x"``, consistent with existing behavior.
[Numpy-discussion] Re: NEP 50: Promotion rules for Python scalars
> For example, in NumPy:
>
>     np.median(np.float32([1, 2, 3, 4]))
>
> did return a float64 before and will now return a float32. I assume
> because somewhere we write: `(np.float64(3) + np.float32(2)) / 2`.

Sorry, I missed this part of the discussion. I know the discussion centered around Python literals being weak, but for NumPy dtypes, I thought the larger dtype would always win?

Indeed, reading the NEP I see:

    Expression: array([1.], float32) + array(1., float64)
    Old result: array([2.], float32)
    New result: array([2.], float64)

which seems to contradict your statement above?
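The NEP row quoted above can be reproduced directly. This is a minimal sketch with made-up variable names; the dtype printed for the first sum depends on whether the installed NumPy still uses value-based promotion (float32) or follows NEP 50 (float64).

```python
import numpy as np

one_d = np.array([1.0], dtype=np.float32)
zero_d = np.array(1.0, dtype=np.float64)

# 1-D float32 array plus 0-D float64 array: value-based promotion treats
# the 0-D operand like a scalar, NEP 50 uses its dtype as is.
result = one_d + zero_d
print(result, result.dtype)

# Two 1-D arrays are never subject to value-based promotion, so the
# larger dtype wins under both sets of rules.
both_1d = one_d + np.array([1.0], dtype=np.float64)
print(both_1d.dtype)
```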
[Numpy-discussion] Re: NEP 50: Promotion rules for Python scalars
On Wed, 2022-06-01 at 18:37 -0500, Juan Nunez-Iglesias wrote:
> > For example, in NumPy:
> >
> >     np.median(np.float32([1, 2, 3, 4]))
> >
> > did return a float64 before and will now return a float32. I assume
> > because somewhere we write: `(np.float64(3) + np.float32(2)) / 2`.
>
> Sorry, I missed this part of the discussion. I know the discussion
> centered around Python literals being weak, but for NumPy dtypes, I
> thought the larger dtype would always win?

Good job reading carefully enough to notice :)! Sorry... my bad, the float64 is a typo. That should have read:

    (float32(3) + float32(2)) / 2

Which does show the change in behavior as described/discussed. If there was a float64 involved, of course the result would be/remain float64.

- Sebastian
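The corrected expression and the `np.median` call can be compared side by side. The snippet is only a sketch with illustrative names; the dtypes noted in the comments are the ones described in this thread, and which one is printed depends on the promotion rules of the installed NumPy.

```python
import numpy as np

# Corrected expression: two float32 scalars divided by a Python int.
# Described above as float64 with the old rules and float32 under NEP 50.
mean_of_two = (np.float32(3) + np.float32(2)) / 2
print(mean_of_two, mean_of_two.dtype)

# With an explicit float64 involved the result is float64 either way.
with_float64 = (np.float64(3) + np.float32(2)) / 2
print(with_float64, with_float64.dtype)

# The median of a float32 array follows the first pattern.
median = np.median(np.float32([1, 2, 3, 4]))
print(median, median.dtype)
```

The values themselves (2.5 in all three cases) are exactly representable in float32, so only the result dtype differs between the two sets of rules here.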