[Numpy-discussion] Re: NEP 50 and cast safety for scalar assignment/conversions

Aaron Meurer Fri, 21 Oct 2022 16:18:33 -0700

I'm probably not understanding all the subtleties here. In the
documentation for can_cast (and other places), it says, "'safe' means
only casts which can preserve values are allowed." So by that
definition, I think 'safe' casting should disallow 5000 to be cast to
int8, because it would not preserve the value. IMO the definition of
"value" is more vague when considering whether 100.0 can be cast to
int8. Personally I don't think float -> int should ever be considered
'safe' even if the numeric value technically is preserved. It's also
ambiguous whether "preserve value" applies to the inputs or just the
outputs, i.e., is np.add(100, 100, dtype=np.int8) safe?
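
For reference, here is a minimal sketch of how I understand the
dtype-to-dtype form of can_cast to behave today. It passes only dtypes,
never Python scalars, since the scalar case is exactly what is under
discussion; the variable names are just for illustration.

```python
import numpy as np

# Dtype-to-dtype queries look only at the types, never at any value.
widening = np.can_cast(np.int8, np.int64, casting="safe")
narrowing_safe = np.can_cast(np.int64, np.int8, casting="safe")
narrowing_same_kind = np.can_cast(np.int64, np.int8, casting="same_kind")
float_to_int_same_kind = np.can_cast(np.float64, np.int8, casting="same_kind")
float_to_int_unsafe = np.can_cast(np.float64, np.int8, casting="unsafe")

print("int8 -> int64, safe:        ", widening)
print("int64 -> int8, safe:        ", narrowing_safe)
print("int64 -> int8, same_kind:   ", narrowing_same_kind)
print("float64 -> int8, same_kind: ", float_to_int_same_kind)
print("float64 -> int8, unsafe:    ", float_to_int_unsafe)
```

Note that none of these answers depend on whether a particular int64
happens to hold 100 or 5000.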


The can_cast() documentation says, "Returns True if cast between data
types can occur according to the casting rule." 'safe' is by
definition a value-based cast. So my expectation is that for a
non-value-based cast there should be a new type of casting rule that
is non-value-based only.

Unless your entire suggestion is to change the definition of 'safe' to
not be value-based. I wasn't completely clear about that.
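
To make concrete what I mean by a value-based check, here is a rough
sketch for Python integers only. The helper int_value_fits is
hypothetical (it is not a NumPy function and not a proposal for one);
it only compares the value against the range reported by np.iinfo.

```python
import numpy as np

def int_value_fits(value, dtype):
    """Hypothetical helper, not a NumPy API.

    Return True if the Python int `value` is exactly representable
    in the integer dtype `dtype`.
    """
    info = np.iinfo(dtype)  # range of the target integer dtype
    return info.min <= value <= info.max

fits_100_int8 = int_value_fits(100, np.int8)
fits_5000_int8 = int_value_fits(5000, np.int8)
fits_5000_uint8 = int_value_fits(5000, np.uint8)
fits_5000_int64 = int_value_fits(5000, np.int64)

print(fits_100_int8, fits_5000_int8, fits_5000_uint8, fits_5000_int64)
```

Under this reading, 100 -> int8 preserves the value while 5000 -> int8
and 5000 -> uint8 do not, regardless of which casting mode is named.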

Aaron Meurer

On Thu, Oct 20, 2022 at 7:30 AM Sebastian Berg
<sebast...@sipsolutions.net> wrote:
>
> Hi all,
>
> I am happy that we have the correct integer handling for NEP 50 merged,
> so the relevant parts of the proposal can now be tested. [1]
>
> However, this has highlighted that NumPy has problems with applying the
> "cast safety" logic to scalars.  We had discussed this a bit yesterday,
> and this is an attempt to summarize the issue and thoughts on how to
> "resolve" it.
>
> This mainly affects Python int, float, and complex due to their special
> handling with NEP 50.
>
>
> NumPy has the cast safety concept for converting between different
> dtypes:
>   https://numpy.org/doc/stable/reference/generated/numpy.can_cast.html
>
> It uses "same_kind" in ufuncs (users do not usually notice this unless
> `out=` or `dtype=` is used).
> NumPy otherwise tends to use "unsafe" for casts and assignments by
> default, which can lead to undefined/strange results at times.
>
>
> Since casts/assignment use "unsafe" casting, scalars are often
> converted in a non-safe way.  However, there are certain exceptions:
>
>     np.arange(5)[3] = np.nan  # Errors (an unsafe cast would not)
>
> More importantly, NEP 50 requires the following to error:
>
>     np.uint8(3) + 5000  # 5000 cannot be converted to uint8
>
> And we just put in a deprecation that would always disallow the above!
> But what would the answer to:
>
>     np.can_cast(5000, np.uint8, casting="safe/same_kind/unsafe")
>
> be?  And how to resolve the fact that casting scalars and arrays has a
> different notion of "safety"?
>
> I could imagine two main approaches:
>
> * cast-safety doesn't apply to scalar conversions, they are whatever
>   they currently are (sometimes unsafe, sometimes same-kind, but
>   strictly safe for integer assignments).
>   `np.can_cast(5000, np.uint8)` just errors out.  We have an assignment
>   "safety" that is independent of casting safety.
>
>   For `np.add(np.uint8(5), 100, casting="safe")` the "safe" (or
>   other modes) simply doesn't make sense for the `100` since
>   effectively the assignment "safety" is used.
>
> * Scalar conversions also have a cast-safety and it may inspect the
>   value.
>
> The problem with defining cast-safety for scalar conversion is not
> implementing it, but rather how to (not?) resolve the inconsistencies.
>
> Even if we change the default casting for assignments to "same kind" (a
> deprecation also applied to arrays):
>
>     int8_arr[3] = 5000
>
> should presumably be an error (not even "unsafe"), but:
>
>     np.can_cast(np.int64, np.int8, casting="same_kind")
>
> returns `True` (an int64 could be 5000 as well), and `same_kind` is
> what ufuncs also use.
>
>
> I don't have a clear plan on this right now, my best thought is that we
> live with the inconsistency:
>
>     np.can_cast(100, np.int8)
>
> would be "safe" while:
>
>     np.can_cast(100., np.int8)
>
> would be "unsafe" (and other conversions through `__int__`).  And:
>
>     np.can_cast(1000, np.int8)
>
> would always return `False` (the assignment would fail), even though
> that is not what would happen when casting integers.
> More confusingly, maybe:
>
>     np.can_cast(1000., np.int8)
>
> is "unsafe" and making it an error (completely unsafe) might be a
> future deprecation.
>
> That would add a cast-safety that is slightly inconsistent between
> Python integers and NumPy integers.
>
> Cheers,
>
> Sebastian
>
>
>
> [1] The NEP: https://numpy.org/neps/nep-0050-scalar-promotion.html
>
> The new part is mainly that `np.uint8(5) + 300` will now give the
> proposed error (when opting in).
> Calls that use `casting=` or `can_cast()` may not have the fully
> correct future behavior, but these should be very niche.
>
>
> [2] A bit tricky to define, but right now:
>
>       arr.astype(new_dtype, casting="safe").astype(arr.dtype)
>
>     should always round-trip correctly.
>
> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: asmeu...@gmail.com