On Wed, 2019-06-05 at 21:35 -0400, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> Tricky! It seems a balance between unexpected memory blow-up and
> unexpected wrapping (the latter mostly for integers).
>
> Some comments specifically on your message first, then some more
> general related ones.
>
> 1. I'm very much against letting `a + b` do anything else than
> `np.add(a, b)`.
> 2. For python values, an argument for casting by value is that a
> python int can be arbitrarily long; the only reasonable course of
> action for those seems to be to make them float, and once you do
> that one might as well cast to whatever type can hold the value
> (at least approximately).
Just to throw it in: in the long run, instead of trying to find a
minimal dtype (which is a bit random), simply ignoring the value of
the scalar may actually be the better option. The reason for this
would be code like:

```
arr = np.zeros(5, dtype=np.int8)
for i in range(200):
    res = arr + i
    print(res.dtype)  # switches from int8 to int16!
```

Instead, we would try `np.int8(i)` in the loop and raise an error if
it fails. Or, if that is a bit nasty – especially for interactive
usage – we could go with a warning.

This is nothing we need to decide soon, since I think some of the
complexity will remain either way (i.e. you still need to know whether
the scalar is a floating point number or an integer and change the
logic accordingly).

Best,

Sebastian

> 3. Not necessarily preferred, but for casting of scalars, one can
> get more consistent behaviour also by extending the casting by value
> to any array that has size=1.
>
> Overall, just on the narrow question, I'd be quite happy with your
> suggestion of using type information if available, i.e., only cast
> python values to a minimal dtype. If one uses numpy types, those
> mostly will have come from previous calculations with the same
> arrays, so things will work as expected. And in most memory-limited
> applications, one would do calculations in-place anyway (or, as
> Tyler noted, for power users one can assume awareness of memory and
> thus the incentive to tell explicitly what dtype is wanted – just
> `np.add(a, b, dtype=...)`, no need to create `out`).
>
> More generally, I guess what I don't like about the casting rules
> is that there is a presumption that if the value can be cast, the
> operation will generally succeed. For `np.add` and `np.subtract`,
> this perhaps is somewhat reasonable (though for unsigned a bit more
> dubious), but for `np.multiply` or `np.power` it is much less so.
> (Indeed, we had a long discussion about what to do with
> `int ** power` – now special-casing negative integer powers.)
> Changing this, however, probably really is a bridge too far!
>
> Finally, somewhat related: I think the largest confusion actually
> results from the `uint64 + int64 -> float64` casting. Should this
> cast to int64 instead?
>
> All the best,
>
> Marten
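As a minimal sketch of the check suggested above (try to fit the
python scalar into the array's dtype, and error out rather than let
the result silently upcast): `checked_scalar` is a hypothetical helper
name used for illustration, not an existing NumPy API.

```python
import numpy as np

def checked_scalar(value, dtype):
    """Hypothetical helper: convert a python int to `dtype`,
    raising if the value does not fit instead of letting the
    operation upcast the result."""
    info = np.iinfo(dtype)
    if not (info.min <= value <= info.max):
        raise OverflowError(f"{value} does not fit in {np.dtype(dtype)}")
    return dtype(value)

arr = np.zeros(5, dtype=np.int8)
res = arr + checked_scalar(100, np.int8)  # fits: result stays int8
print(res.dtype)  # int8
# checked_scalar(200, np.int8) raises OverflowError (int8 max is 127)
```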
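For reference, the casting Marten mentions can be reproduced directly:
since no 64-bit integer dtype can hold both the `uint64` and the
`int64` range, NumPy promotes the pair to `float64`.

```python
import numpy as np

a = np.array([1], dtype=np.uint64)
b = np.array([2], dtype=np.int64)

# Neither int64 nor uint64 covers the other's range, so the result
# falls back to float64 (losing integer precision for large values).
print((a + b).dtype)                        # float64
print(np.result_type(np.uint64, np.int64))  # float64
```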
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion