[Numpy-discussion] Re: NEP 50 and cast safety for scalar assignment/conversions

Sebastian Berg Sat, 22 Oct 2022 01:04:07 -0700

On Fri, 2022-10-21 at 17:17 -0600, Aaron Meurer wrote:
> I'm probably not understanding all the subtleties here. In the
> documentation for can_cast (and other places), it says, "'safe' means
> only casts which can preserve values are allowed." So by that
> definition, I think 'safe' casting should disallow 5000 to be cast to



Yes, but we never look at the actual value normally (NumPy does
currently for 0-D arrays, but I doubt we want to continue that [1]).

So casting:

   np.array([100, 100], dtype=np.int64) -> int8

is unsafe even though it is safe when you look at the values.
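
A minimal sketch of that point, assuming only `np.can_cast` and `ndarray.astype` (the variable names are just for illustration): the safety check is decided from the dtypes alone, so the values 100 never enter into it.

```python
import numpy as np

arr = np.array([100, 100], dtype=np.int64)

# Cast safety is decided from the dtypes, not from the stored values.
safe_ok = np.can_cast(np.int64, np.int8, casting="safe")
same_kind_ok = np.can_cast(np.int64, np.int8, casting="same_kind")

# A "safe" astype is refused even though both values would fit into int8.
try:
    arr.astype(np.int8, casting="safe")
    refused = False
except TypeError:
    refused = True

# The default ("unsafe") astype goes through; here the values happen to survive.
as_int8 = arr.astype(np.int8)
```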

But for (Python) scalar assignment, we want to look at the value.  In
particular:

    np.add(np.int8(1), 500)

should error (in the future) because we need to convert the 500 to
`int8`.  But the default casting for ufuncs is "same kind" and if you
write:

    np.add(np.int8(1), np.int64(500), casting="unsafe", dtype="int8")

NumPy would happily do the operation.
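
As a rough illustration of the difference (a sketch, not a statement about future behavior): with a NumPy `int64` scalar the cast to `int8` is governed by the `casting=` argument, so "unsafe" lets it through while "safe" refuses it. The name `safe_refused` is only used for this example.

```python
import numpy as np

# dtype="int8" forces the int8 loop, so the int64 operand must be cast.
# casting="unsafe" permits that cast without inspecting the value 500.
result = np.add(np.int8(1), np.int64(500), casting="unsafe", dtype="int8")

# With casting="safe" the same call is rejected with a TypeError subclass.
try:
    np.add(np.int8(1), np.int64(500), casting="safe", dtype="int8")
    safe_refused = False
except TypeError:
    safe_refused = True

print(result.dtype, safe_refused)
```

The value stored in `result` is whatever the unsafe integer cast produces, so it should not be relied on.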

- Sebastian



> int8, because it would not preserve the value. IMO the definition of
> "value" is more vague when considering whether 100.0 can be cast to
> int8. Personally I don't think float -> int should ever be considered
> 'safe' even if the numeric value technically is preserved. It's also
> ambiguous whether "preserve value" applies to the inputs or just the
> outputs, i.e., is np.add(100, 100, dtype=int8) safe?
> 
> The can_cast() documentation says, "Returns True if cast between data
> types can occur according to the casting rule." 'safe' is by
> definition a value-based cast. So my expectation is that for a
> non-value based cast there should be a new type of casting rule that
> is non-value based only.
> 
> Unless your entire suggestion is to change the definition of 'safe'
> to not be value-based. I wasn't completely clear about that.
> 
> Aaron Meurer
> 
> On Thu, Oct 20, 2022 at 7:30 AM Sebastian Berg
> <sebast...@sipsolutions.net> wrote:
> > 
> > Hi all,
> > 
> > I am happy that we have the correct integer handling for NEP 50
> > merged, so the relevant parts of the proposal can now be tested. [1]
> > 
> > However, this has highlighted that NumPy has problems with applying
> > the "cast safety" logic to scalars.  We had discussed this a bit
> > yesterday, and this is an attempt to summarize the issue and
> > thoughts on how to "resolve" it.
> > 
> > This mainly affects Python int, float, and complex due to their
> > special handling with NEP 50.
> > 
> > 
> > NumPy has the cast safety concept for converting between different
> > dtypes:
> >   
> > https://numpy.org/doc/stable/reference/generated/numpy.can_cast.html
> > 
> > It uses "same-kind" in ufuncs (users do not usually notice this
> > unless `out=` or `dtype=` is used).
> > NumPy otherwise tends to use "unsafe" for casts and assignments by
> > default which can lead to undefined/strange results at times.
> > 
> > 
> > Since casts/assignment use "unsafe" casting, scalars are often
> > converted in a non-safe way.  However, there are certain
> > exceptions:
> > 
> >     np.arange(5)[3] = np.nan  # Errors (an unsafe cast would not)
> > 
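A small sketch of the two assignment behaviors mentioned above, using only plain item assignment (the names `arr` and `nan_raised` are illustrative): a Python float is truncated on assignment, which is not a "safe" conversion, yet NaN is rejected outright.

```python
import numpy as np

arr = np.arange(5)

# A Python float is silently truncated when assigned into an integer array.
arr[1] = 2.7

# NaN has no integer representation, so this assignment raises instead.
try:
    arr[3] = np.nan
    nan_raised = False
except ValueError:
    nan_raised = True

print(arr.tolist(), nan_raised)
```
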
> > More importantly, NEP 50 requires the following to error:
> > 
> >     np.uint8(3) + 5000  # 5000 cannot be converted to uint8
> > 
> > And we just put in a deprecation that would always disallow the
> > above!
> > But what would the answer to:
> > 
> >     np.can_cast(5000, np.uint8, casting="safe/same_kind/unsafe")
> > 
> > be?  And how to resolve the fact that casting scalars and arrays
> > has a different notion of "safety"?
> > 
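Since the answers to these questions depend on the NumPy version and on whether the NEP 50 behavior is active, the sketch below only reports what the installed version does rather than asserting one outcome. The helper `describe` is hypothetical and exists only for this example.

```python
import numpy as np

def describe(fn):
    """Run fn and return repr(result), or the exception class name if it raises."""
    try:
        return repr(fn())
    except Exception as exc:
        return type(exc).__name__

print("NumPy version:", np.__version__)
# Python int operand: the outcome differs between legacy and NEP 50 promotion.
print("np.uint8(3) + 5000 ->", describe(lambda: np.uint8(3) + 5000))
# Python int passed to can_cast: also version dependent.
print("can_cast(5000, np.uint8) ->", describe(lambda: np.can_cast(5000, np.uint8)))
# dtype-to-dtype query: never looks at a value.
print("can_cast(np.int64, np.uint8) ->", describe(lambda: np.can_cast(np.int64, np.uint8)))
```
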
> > I could imagine two main approaches:
> > 
> > * cast-safety doesn't apply to scalar conversions, they are
> >   whatever they currently are (sometimes unsafe, sometimes
> >   same-kind, but strictly safe for integer assignments).
> >   `np.can_cast(5000, np.uint8)` just errors out.  We have an
> >   assignment "safety" that is independent of casting safety.
> > 
> >   For `np.add(np.uint8(5), 100, casting="safe")` the "safe" (or
> >   other modes) simply doesn't make sense for the `100` since
> >   effectively the assignment "safety" is used.
> > 
> > * Scalar conversions also have a cast-safety and it may inspect the
> >   value.
> > 
> > The problem with defining cast-safety for scalar conversion is not
> > implementing it, but rather how to (not?) resolve the
> > inconsistencies.
> > 
> > Even if we change the default casting for assignments to "same
> > kind" (a deprecation also applied to arrays):
> > 
> >     int8_arr[3] = 5000
> > 
> > should presumably be an error (not even "unsafe"), but:
> > 
> >     np.can_cast(np.int64, np.int8, casting="same_kind")
> > 
> > returns `True` (an int64 could be 5000 as well), and `same_kind` is
> > what ufuncs also use.
> > 
> > 
> > I don't have a clear plan on this right now, my best thought is
> > that we live with the inconsistency:
> > 
> >     np.can_cast(100, np.int8)
> > 
> > would be "safe" while:
> > 
> >     np.can_cast(100., np.int8)
> > 
> > would be "unsafe" (and other conversions through `__int__`).  And:
> > 
> >     np.can_cast(1000, dtype=np.int8)
> > 
> > would always return `False` (the assignment would fail), even
> > though that is not what would happen when casting integers.
> > More confusingly, maybe:
> > 
> >     np.can_cast(1000., dtype=np.int8)
> > 
> > is "unsafe" and making it an error (completely unsafe) might be a
> > future deprecation.
> > 
> > That would add a cast-safety that is slightly inconsistent between
> > Python integers and NumPy integers.
> > 
> > Cheers,
> > 
> > Sebastian
> > 
> > 
> > 
> > [1] The NEP: https://numpy.org/neps/nep-0050-scalar-promotion.html
> > 
> > The new part is mainly that `np.uint8(5) + 300` will now give the
> > proposed error (when opting in).
> > Calls that use `casting=` or `can_cast()` may not have the fully
> > correct future behavior, but these should be very niche.
> > 
> > 
> > [2] A bit tricky to define, but right now:
> > 
> >       arr.astype(new_dtype, casting="safe").astype(arr.dtype)
> > 
> >     should always round-trip correctly.
> > 
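A minimal sketch of the round-trip property from footnote [2], using a widening integer cast as the example (the dtype pair and variable names are chosen only for illustration):

```python
import numpy as np

arr = np.array([-32768, 0, 32767], dtype=np.int16)

# int16 -> int32 is accepted as a "safe" cast ...
widened = arr.astype(np.int32, casting="safe")

# ... and casting back to the original dtype recovers the input exactly.
round_tripped = widened.astype(arr.dtype)

print(np.array_equal(round_tripped, arr))
```
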
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: asmeu...@gmail.com
