[Numpy-discussion] Representation of NumPy scalars
TL;DR: A NumPy scalar's representation is, e.g., `34.3` instead of `float32(34.3)`, so the representation is missing the type information. What are your thoughts on changing that?

Hi all,

I am thinking about the next steps for NEP 50 (the NEP wants to fix the NumPy promotion rules, especially with respect to scalars):

    https://numpy.org/neps/nep-0050-scalar-promotion.html

In relation to that, there was one point that Stéfan brought up previously.

The NumPy scalars (representation) currently print as numbers:

    >>> np.float32(34.3)
    34.3
    >>> np.uint8(5)
    5

That can already be confusing now. However, it gets more problematic if NEP 50 is introduced, since the behavior of a Python `34.3` and of `np.float32(34.3)` would differ more than it does now (please refer to the NEP).

The change would be that we should print as:

    float64(34.3)  (or similar?)

This email is mainly to ask for any feedback or concerns about such a change. I suspect we may have to write a very brief NEP about it.

If there is little concern, maybe we could move forward with such a change promptly. Otherwise it could be moved forward together with NEP 50 and take effect in a "major" release [1].

Cheers,

Sebastian

[1] Note that for me, even a major release would hopefully not affect the majority of users or be very disruptive.

___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
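To make the confusion described above concrete, here is a minimal sketch. `typed_repr` is a hypothetical helper written for this illustration, not a NumPy function; it only mimics the `float32(34.3)` style floated in the proposal, and the exact format was still open at this point in the thread.

```python
import numpy as np


def typed_repr(scalar):
    """Hypothetical helper (not part of NumPy): format as 'typename(value)'."""
    return f"{type(scalar).__name__}({scalar!s})"


x = np.float32(34.3)
y = np.uint8(5)

# str() shows only the value, so the scalar type is invisible.
print(str(x), str(y))

# The proposed style carries the type along with the value.
print(typed_repr(x), typed_repr(y))

# Although it prints like a plain number, x is not a Python float.
print(isinstance(x, float))
```

The helper builds the string from `type(scalar).__name__` and `str(scalar)`, so it does not depend on what `repr()` returns in any particular NumPy version.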
[Numpy-discussion] Re: Representation of NumPy scalars
Hello Sebastian,

I rarely use NumPy scalars directly, but the repr change could have an impact on the documentation of assorted downstream projects.

For clarity, this idea would not alter how NumPy arrays print, would it, since they already include the type information?

    >>> np.array([34.3, 10.1, -0.5], np.float32)
    array([34.3, 10.1, -0.5], dtype=float32)
    >>> np.array([5, 10, 0], np.uint8)
    array([ 5, 10,  0], dtype=uint8)

Thanks,

Peter

On Thu, Sep 8, 2022 at 10:42 AM Sebastian Berg wrote:
> TL;DR: NumPy scalars representation is e.g. `34.3` instead of
> `float32(34.3)`. So the representation is missing the type
> information. What are your thoughts on changing that?
> [...]
[Numpy-discussion] Re: Representation of NumPy scalars
On Thu, 2022-09-08 at 10:53 +0100, Peter Cock wrote:
> Hello Sebastian,
>
> I rarely use NumPy scalars directly, but the repr change could
> have impact in assorted downstream projects' documentation.
>
> For clarity, this idea would not alter how NumPy arrays print,
> would it - since they already include the type information?

Yes, arrays would print as they do now. The array representation is not confusing in the same way.

You are right, of course. Documentation would be affected quite heavily, and unfortunately a lot of docs would need to be fixed up. My hope would be that there is little impact besides documentation, but I am not certain.

- Sebastian
[Numpy-discussion] Re: Representation of NumPy scalars
On Thu, 8 Sept 2022, 19:42 Sebastian Berg wrote:
> TL;DR: NumPy scalars representation is e.g. `34.3` instead of
> `float32(34.3)`. So the representation is missing the type
> information. What are your thoughts on changing that?

From the Python documentation on repr:

"this should look like a valid Python expression that could be used to recreate an object with the same value"

I think we should definitely have:

    repr(np.float32(34.3)) == 'float32(34.3)'

and

    str(np.float32(34.3)) == '34.3'

It seems buglike not to have that.
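The repr guideline quoted above can be checked mechanically. The sketch below assumes the `float32(34.3)` format suggested in this message and builds that string by hand (`proposed_repr` is a hypothetical name), so it makes no claim about what `repr()` returns in any given NumPy release.

```python
import numpy as np

value = np.float32(34.3)

# Build the suggested repr string by hand; this is an assumed format,
# not the output of repr() in any particular NumPy version.
proposed_repr = f"float32({value!s})"

# Evaluate the string with only the one needed name in scope.
rebuilt = eval(proposed_repr, {"float32": np.float32})

print(proposed_repr)
print(type(rebuilt).__name__, rebuilt == value)
```

The string evaluates back to a `float32` with the same value, which is the property the Python documentation asks for.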
[Numpy-discussion] Re: Representation of NumPy scalars
On Thu, Sep 8, 2022 at 3:41 AM Stefano Miccoli wrote:
> On 8 Sep 2022, at 11:39, numpy-discussion-requ...@python.org wrote:
>
> TL;DR: NumPy scalars representation is e.g. `34.3` instead of
> `float32(34.3)`. So the representation is missing the type
> information. What are your thoughts on changing that?
>
> This would be a VERY welcome change!

+1 this would be very welcome!

The current behavior is a major source of confusion. Users end up using NumPy scalars accidentally all over the place without even realizing it, leading to all sorts of surprising and hard-to-debug bugs.
[Numpy-discussion] Re: Representation of NumPy scalars
+1 from me. That would be really helpful.

On Fri, 9 Sept 2022 at 05:18, Stephan Hoyer wrote:
> +1 this would be very welcome!
>
> The current behavior is a major source of confusion. Users end up using
> NumPy scalars accidentally all over the place without even realizing it,
> leading to all sorts of surprising and challenging to debug bugs.
[Numpy-discussion] Re: Representation of NumPy scalars
On 9/8/22, Andrew Nelson wrote:
> On Thu, 8 Sept 2022, 19:42 Sebastian Berg wrote:
>
>> TL;DR: NumPy scalars representation is e.g. `34.3` instead of
>> `float32(34.3)`. So the representation is missing the type
>> information. What are your thoughts on changing that?

I like the idea, but as others have noted, this could result in a lot of churn in the docs of many projects.

> From the Python documentation on repr:
>
> "this should look like a valid Python expression that could be used to
> recreate an object with the same value"

To quote from https://docs.python.org/3/library/functions.html#repr:

> For many types, this function makes an attempt to return a string
> that would yield an object with the same value when passed to eval();

Sebastian, is this an explicit goal of the change? (Personally, I've gotten used to not taking this too seriously, but my world view is biased by the long-term use of NumPy, which has never followed this guideline.)

If that is a goal, then the floating point types with precision greater than double precision will need to display the argument of the type as a string. For example, the following is run on a platform where numpy.longdouble is extended precision (80 bits):

```
In [161]: longpi = np.longdouble('3.14159265358979323846')

In [162]: longpi
Out[162]: 3.1415926535897932385

In [163]: np.longdouble(3.1415926535897932385)  # Argument is parsed as 64 bit float
Out[163]: 3.141592653589793116

In [164]: np.longdouble('3.1415926535897932385')  # Correctly reproduces the longdouble
Out[164]: 3.1415926535897932385
```

Warren
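The pitfall in the session above comes from the Python parser: a float literal is rounded to a 64-bit double before any constructor sees it. The sketch below shows the same effect using only the standard library, with `decimal.Decimal` standing in for an extended-precision type. This is an analogy chosen for illustration, since `numpy.longdouble` precision is platform dependent. `Decimal` already quotes its argument in its repr, which is the approach suggested in this message.

```python
from decimal import Decimal

digits = "3.1415926535897932385"

# Constructing from a string keeps every digit.
from_string = Decimal(digits)

# A float literal is rounded to a 64-bit double before Decimal sees it.
from_literal = Decimal(3.1415926535897932385)

print(repr(from_string))
print(from_string == from_literal)

# Because the repr carries a string argument, eval() recreates the value.
rebuilt = eval(repr(from_string), {"Decimal": Decimal})
print(rebuilt == from_string)
```

A repr of the form `longdouble('...')` would avoid the precision loss in the same way, at the cost of a slightly noisier display.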
[Numpy-discussion] Ufuncs and dtypes: New possibilities in NumPy - video
A video recording of Sebastian Berg’s presentation “Ufuncs and dtypes: New possibilities in NumPy” has been posted on the NumPy YouTube channel: https://youtu.be/u9qU6cy5JkE.

On Tue, Sep 6, 2022 at 10:58 PM Inessa Pawson wrote:
> The next NumPy Newcomers Hour will be held this Thursday, September 8th at
> 4 pm UTC.
>
> Sebastian Berg, a senior software engineer at Nvidia and long time NumPy
> core developer, will present about his work on refactoring NumPy core
> functionalities, such as universal functions, casting, and dtypes. We will
> discuss what has been accomplished and what new applications are now
> available.
>
> Join us via Zoom: https://us02web.zoom.us/j/87192457898

--
Cheers,
Inessa

Inessa Pawson
Contributor Experience Lead | NumPy
https://numpy.org/
GitHub: inessapawson
[Numpy-discussion] Re: Representation of NumPy scalars
On 9/9/22 04:15, Warren Weckesser wrote:
> ...
> If that is a goal, then the floating point types with precision
> greater than double precision will need to display the argument of the
> type as a string.
> ...

As others have mentioned, the change will greatly enhance UX at the cost of documentation cleanups. While the representation may not be perfectly roundtrip-able, I think it is still an improvement and worthwhile.

Elsewhere I have suggested we need more documentation around array/scalar printing; perhaps that would be a place to mention the limitations of string representations.

Matti
[Numpy-discussion] Re: Representation of NumPy scalars
I am in favor of such a change. It will make what is returned more transparent to users (and reduce confusion for newcomers).

With NEP 50, we're already adopting a philosophy of explicit scalar usage anyway: no longer pretending that Python floats and NumPy floats are the same, or trying to hide the difference.

No one *actually* round-trips objects via repr, but if a user could look at a result and know how to construct the object, that is an improvement.

Stéfan

On Thu, Sep 8, 2022, at 22:26, Matti Picus wrote:
> As others have mentioned, the change will greatly enhance UX at the cost
> of documentation cleanups. While the representation may not be perfectly
> roundtrip-able, I think it still is an improvement and worthwhile.