On Fri, 2022-10-21 at 17:17 -0600, Aaron Meurer wrote:
> I'm probably not understanding all the subtleties here. In the
> documentation for can_cast (and other places), it says, "'safe' means
> only casts which can preserve values are allowed." So by that
> definition, I think 'safe' casting should disallow 5000 to be cast to

Yes, but we never look at the actual value normally (NumPy does
currently for 0-D arrays, but I doubt we want to continue that [1]).
So casting:

    np.array([100, 100], dtype=np.int64) -> int8

is unsafe even though it is safe when you look at the values.

But for (Python) scalar assignment, we want to look at the value.
In particular:

    np.add(np.int8(1), 500)

should error (in the future) because we need to convert the 500 to
`int8`. But the default casting for ufuncs is "same kind" and if you
write:

    np.add(np.int8(1), np.int64(500), casting="unsafe", dtype="int8")

NumPy would happily do the operation.

- Sebastian


> int8, because it would not preserve the value. IMO the definition of
> "value" is more vague when considering whether 100.0 can be cast to
> int8. Personally I don't think float -> int should ever be considered
> 'safe' even if the numeric value technically is preserved. It's also
> ambiguous whether "preserve value" applies to the inputs or just the
> outputs, i.e., is np.add(100, 100, dtype=int8) safe?
>
> The can_cast() documentation says, "Returns True if cast between data
> types can occur according to the casting rule." 'safe' is by
> definition a value-based cast. So my expectation is that for a
> non-value based cast there should be a new type of casting rule that
> is non-value based only.
>
> Unless your entire suggestion is to change the definition of 'safe' to
> not be value-based. I wasn't completely clear about that.
>
> Aaron Meurer
>
> On Thu, Oct 20, 2022 at 7:30 AM Sebastian Berg
> <sebast...@sipsolutions.net> wrote:
> >
> > Hi all,
> >
> > I am happy that we have the correct integer handling for NEP 50
> > merged, so the relevant parts of the proposal can now be tested. [1]
> >
> > However, this has highlighted that NumPy has problems with applying
> > the "cast safety" logic to scalars. We had discussed this a bit
> > yesterday, and this is an attempt to summarize the issue and thoughts
> > on how to "resolve" it.
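
To make the dtype-level behaviour from the reply above concrete, here is a minimal sketch (an editorial illustration, not part of the original messages). It only uses `np.can_cast`, `ndarray.astype`, and `np.add`; the variable names `arr`, `safe_cast_rejected`, and `result` are illustrative assumptions.

```python
import numpy as np

# Dtype-level cast safety never inspects the stored values: both
# elements fit into int8, yet int64 -> int8 is not a "safe" cast.
arr = np.array([100, 100], dtype=np.int64)

print(np.can_cast(np.int64, np.int8, casting="safe"))       # False
print(np.can_cast(np.int64, np.int8, casting="same_kind"))  # True

try:
    arr.astype(np.int8, casting="safe")
    safe_cast_rejected = False
except TypeError:
    # Rejected purely on the dtypes, regardless of the values.
    safe_cast_rejected = True

# With NumPy scalars on both sides and an explicit unsafe cast, the
# ufunc is allowed to run in int8 and does not raise.
result = np.add(np.int8(1), np.int64(500), casting="unsafe", dtype="int8")
print(result.dtype)
```

The point of the sketch is only that all three answers depend on the dtypes involved, never on whether 100 or 500 happens to fit.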
> >
> > This mainly affects Python int, float, and complex due to their
> > special handling with NEP 50.
> >
> >
> > NumPy has the cast safety concept for converting between different
> > dtypes:
> >
> >     https://numpy.org/doc/stable/reference/generated/numpy.can_cast.html
> >
> > It uses "same-kind" in ufuncs (users do not usually notice this
> > unless `out=` or `dtype=` is used).
> > NumPy otherwise tends to use "unsafe" for casts and assignments by
> > default which can lead to undefined/strange results at times.
> >
> >
> > Since casts/assignment use "unsafe" casting, scalars are often
> > converted in a non-safe way. However, there are certain exceptions:
> >
> >     np.arange(5)[3] = np.nan  # Errors (an unsafe cast would not)
> >
> > More importantly, NEP 50 requires the following to error:
> >
> >     np.uint8(3) + 5000  # 5000 cannot be converted to uint8
> >
> > And we just put in a deprecation that would always disallow the
> > above!
> > But what would the answer to:
> >
> >     np.can_cast(5000, np.uint8, casting="safe/same_kind/unsafe")
> >
> > be? And how to resolve the fact that casting scalars and arrays has
> > a different notion of "safety"?
> >
> > I could imagine two main approaches:
> >
> > * cast-safety doesn't apply to scalar conversions, they are whatever
> >   they currently are (sometimes unsafe, sometimes same-kind, but
> >   strictly safe for integer assignments).
> >   `np.can_cast(5000, np.uint8)` just errors out. We have an
> >   assignment "safety" that is independent of casting safety.
> >
> >   For `np.add(np.uint8(5), 100, casting="safe")` the "safe" (or
> >   other modes) simply doesn't make sense for the `100` since
> >   effectively the assignment "safety" is used.
> >
> > * Scalar conversions also have a cast-safety and it may inspect the
> >   value.
> >
> > The problem with defining cast-safety for scalar conversion is not
> > implementing it, but rather how to (not?) resolve the
> > inconsistencies.
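
A rough sketch of what "inspecting the value" could mean for a Python integer, next to the existing NaN assignment error mentioned above. This is an editorial illustration under stated assumptions: `python_int_fits` is a hypothetical helper, not a NumPy function and not a proposed API; it only relies on `np.iinfo`.

```python
import numpy as np

def python_int_fits(value, dtype):
    """Hypothetical value-based check: is `value` representable in the
    integer `dtype`? This is NOT an existing NumPy function."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max

print(python_int_fits(100, np.uint8))    # True
print(python_int_fits(5000, np.uint8))   # False

# Assignment already applies a value-based rule in one case: NaN
# cannot be stored into an integer array.
int_arr = np.arange(5)
try:
    int_arr[3] = np.nan
    nan_assignment_rejected = False
except ValueError:
    nan_assignment_rejected = True
print(nan_assignment_rejected)           # True
```

Note that the helper answers a different question than `np.can_cast(np.int64, np.uint8, ...)`: it looks at one concrete value, while the dtype-level rule has to cover every value an int64 could hold.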
> >
> > Even if we change the default casting for assignments to "same
> > kind" (a deprecation also applied to arrays):
> >
> >     int8_arr[3] = 5000
> >
> > should presumably be an error (not even "unsafe"), but:
> >
> >     np.can_cast(np.int64, np.int8, casting="same_kind")
> >
> > returns `True` (an int64 could be 5000 as well), and `same_kind` is
> > what ufuncs also use.
> >
> >
> > I don't have a clear plan on this right now, my best thought is
> > that we live with the inconsistency:
> >
> >     np.can_cast(100, np.int8)
> >
> > would be "safe" while:
> >
> >     np.can_cast(100., np.int8)
> >
> > would be "unsafe" (and other conversions through `__int__`). And:
> >
> >     np.can_cast(1000, np.int8)
> >
> > would always return `False` (the assignment would fail), even
> > though that is not what would happen when casting integers.
> > More confusingly, maybe:
> >
> >     np.can_cast(1000., np.int8)
> >
> > is "unsafe" and making it an error (completely unsafe) might be a
> > future deprecation.
> >
> > That would add a cast-safety that is slightly inconsistent between
> > Python integers and NumPy integers.
> >
> > Cheers,
> >
> > Sebastian
> >
> >
> >
> > [1] The NEP: https://numpy.org/neps/nep-0050-scalar-promotion.html
> >
> >     The new part is mainly that `np.uint8(5) + 300` will now give the
> >     proposed error (when opting in).
> >     Calls that use `casting=` or `can_cast()` may not have the fully
> >     correct future behavior, but these should be very niche.
> >
> >
> > [2] A bit tricky to define, but right now:
> >
> >     arr.astype(new_dtype, casting="safe").astype(arr.dtype)
> >
> >     should always round-trip correctly.
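
The round-trip property in footnote [2] can be checked with a short sketch (an editorial illustration, not part of the original messages). `round_trips_safely` is a hypothetical helper name; the calls it uses are `ndarray.astype` and `np.array_equal`.

```python
import numpy as np

def round_trips_safely(arr, new_dtype):
    """Hypothetical helper: cast with casting="safe", cast back, and
    compare with the input. Raises TypeError when the dtype-level rule
    does not consider the cast "safe"."""
    converted = arr.astype(new_dtype, casting="safe")
    return np.array_equal(converted.astype(arr.dtype), arr)

int8_arr = np.array([-128, 0, 127], dtype=np.int8)
print(round_trips_safely(int8_arr, np.int64))   # True

try:
    round_trips_safely(int8_arr.astype(np.int64), np.int8)
    narrowing_rejected = False
except TypeError:
    # int64 -> int8 is not "safe", even though these values would fit.
    narrowing_rejected = True
print(narrowing_rejected)                       # True
```

This is only a spot check for particular arrays; it does not prove the property for every dtype pair.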
> >
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: asmeu...@gmail.com