On Wed, 2019-06-05 at 21:35 -0400, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> Tricky! It seems a balance between unexpected memory blow-up and
> unexpected wrapping (the latter mostly for integers).
>
> Some comments specifically on your message first, then some more
> general related ones.
>
> 1. I'm very much against letting `a + b` do anything else than
> `np.add(a, b)`.

Well, I tend to agree. But just to put it out there:

    [1] + [2] == [1, 2]
    np.add([1], [2]) == np.array([3])

So that is already far from true, since coercion has to occur. Of
course it is true that:

    arr + something_else

will at some point force coercion of `something_else`, so that point
is only half valid if either `a` or `b` is already a numpy
array/scalar.
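To make that concrete, a small runnable sketch (output in the
comments; exact reprs may vary a bit between NumPy versions):

    import numpy as np

    # Python lists concatenate under `+`:
    print([1] + [2])            # [1, 2]

    # np.add first coerces both operands to arrays, then adds
    # elementwise:
    print(np.add([1], [2]))     # [3]

    # Once one operand is already an array, `+` dispatches to np.add
    # and the list operand is coerced as well:
    print(np.array([1]) + [2])  # [3]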
> 2. For python values, an argument for casting by value is that a
> python int can be arbitrarily long; the only reasonable course of
> action for those seems to make them float, and once you do that one
> might as well cast to whatever type can hold the value (at least
> approximately).

To be honest, the "arbitrarily long" thing is another issue, which is
the silent conversion to "object" dtype. That is also on the list of
things we have not dealt with yet: maybe we should deprecate it. In
other words, we would freeze python int to one clear type; if you
have an arbitrarily large int, you would need to use `object` dtype
(or preferably a new `pyint/arbitrary_precision_int` dtype)
explicitly.
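A short sketch of that silent conversion (behaviour as observed
around NumPy 1.16; later versions may change this):

    import numpy as np

    # A Python int that fits a fixed-width integer coerces normally:
    print(np.array(2**62).dtype)   # int64

    # An arbitrarily large Python int silently becomes object dtype:
    print(np.array(2**100).dtype)  # object

    # The explicit fixed-width constructor refuses it instead:
    try:
        np.int64(2**100)
    except OverflowError:
        print("OverflowError")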
> 3. Not necessarily preferred, but for casting of scalars, one can get
> more consistent behaviour also by extending the casting by value to
> any array that has size=1.

That sounds just as horrible as the current mismatch to me, to be
honest.

> Overall, just on the narrow question, I'd be quite happy with your
> suggestion of using type information if available, i.e., only cast
> python values to a minimal dtype. If one uses numpy types, those
> mostly will have come from previous calculations with the same
> arrays, so things will work as expected. And in most memory-limited
> applications, one would do calculations in-place anyway (or, as Tyler
> noted, for power users one can assume awareness of memory and thus
> the incentive to tell explicitly what dtype is wanted - just
> `np.add(a, b, dtype=...)`, no need to create `out`).
>
> More generally, I guess what I don't like about the casting rules
> generally is that there is a presumption that if the value can be
> cast, the operation will generally succeed. For `np.add` and
> `np.subtract`, this perhaps is somewhat reasonable (though for
> unsigned a bit more dubious), but for `np.multiply` or `np.power` it
> is much less so. (Indeed, we had a long discussion about what to do
> with `int ** power` - now special-casing negative integer powers.)
> Changing this, however, probably really is a bridge too far!

Indeed that is right, but that is a different point. E.g. there would
be nothing wrong with `np.power` deciding that `int ** power` should
always _promote_ (not cast) `int` to some larger integer type if one
is available.

The only place where we seriously have such logic right now is
np.add.reduce (sum) and np.multiply.reduce (prod), which always use
at least `long` precision (and actually upcast bool -> int, although
np.add(True, True) does not; another difference from True + True...).
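A sketch of that asymmetry (output from a 64-bit NumPy of this era;
the default integer is the platform `long`, so the dtype may differ
elsewhere):

    import numpy as np

    # The binary ufunc stays in bool (and also differs from Python's
    # True + True == 2):
    print(np.add(True, True))              # True

    # The reduction upcasts bool -> int and actually counts:
    b = np.array([True, True])
    print(np.add.reduce(b), np.sum(b))     # 2 2

    # sum/prod use at least `long` precision for small int dtypes:
    a = np.array([100, 100], dtype=np.int8)
    print(np.sum(a), np.sum(a).dtype)      # 200 int64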
> Finally, somewhat related: I think the largest confusion actually
> results from the `uint64 + int64 -> float64` casting. Should this
> cast to int64 instead?

Not sure, but yes, it is the other quirk in our casting that should
be discussed...
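For anyone who has not run into that quirk, a quick sketch (observed
behaviour; reprs may vary):

    import numpy as np

    u = np.array([1], dtype=np.uint64)
    i = np.array([1], dtype=np.int64)

    # No integer dtype holds both ranges, so promotion falls through
    # to float64 -- silently losing precision for large values:
    print((u + i).dtype)                   # float64
    print(np.uint64(2**63) + np.int64(1))  # 9.223372036854776e+18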
- Sebastian

> All the best,
>
> Marten

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion