Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

Charles R Harris Tue, 18 Sep 2012 12:26:04 -0700

On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root <ben.r...@ou.edu> wrote:


>
>
> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root <ben.r...@ou.edu> wrote:
>>
>>>
>>>
>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <
>>> charlesr.har...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant 
>>>> <tra...@continuum.io>wrote:
>>>>
>>>>>
>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>>>>
>>>>> > Consider the following code:
>>>>> >
>>>>> > import numpy as np
>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>> > a *= float(255) / 15
>>>>> >
>>>>> > In v1.6.x, this yields:
>>>>> > array([17, 34, 51, 68, 85], dtype=int16)
>>>>> >
>>>>> > But in master, this throws an exception about failing to cast via
>>>>> same_kind.
>>>>> >
>>>>> > Note that numpy was smart about this operation before, consider:
>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>> > a *= float(128) / 256
>>>>>
>>>>> > yields:
>>>>> > array([0, 1, 1, 2, 2], dtype=int16)
>>>>> >
>>>>> > Of course, this is different than if one does it in a non-in-place
>>>>> manner:
>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>>>>> >
>>>>> > which yields an array with floating point dtype in both versions.  I
>>>>> can appreciate the arguments for preventing this kind of implicit casting
>>>>> between non-same_kind dtypes, but I argue that because the operation is
>>>>> in-place, then I (as the programmer) am explicitly stating that I desire 
>>>>> to
>>>>> utilize the current array to store the results of the operation, dtype and
>>>>> all.  Obviously, we can't completely turn off this rule (for example, an
>>>>> in-place addition between integer array and a datetime64 makes no sense),
>>>>> but surely there is some sort of happy medium that would allow these sort
>>>>> of operations to take place?
>>>>> >
>>>>> > Lastly, if it is determined that it is desirable to allow in-place
>>>>> operations to continue working like they have before, I would like to see
>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
>>>>> as matplotlib, where this issue was first found) would have to change 
>>>>> their
>>>>> code anyway just to be compatible with numpy.
>>>>>
>>>>> I agree that in-place operations should allow different casting rules.
>>>>>  There are different opinions on this, of course, but generally this is 
>>>>> how
>>>>> NumPy has worked in the past.
>>>>>
>>>>> We did decide to change the default casting rule to "same_kind" but
>>>>> making an exception for in-place seems reasonable.
>>>>>
>>>>
>>>> I think that in these cases same_kind will flag what are most likely
>>>> programming errors and sloppy code. It is easy to be explicit and doing so
>>>> will make the code more readable because it will be immediately obvious
>>>> what the multiplicand is without the need to recall what the numpy casting
>>>> rules are in this exceptional case. IISTR several mentions of this before
>>>> (Gael?), and in some of those cases it turned out that bugs were being
>>>> turned up. Catching bugs with minimal effort is a good thing.
>>>>
>>>> Chuck
>>>>
>>>>
>>> True, it is quite likely to be a programming error, but then again,
>>> there are many cases where it isn't.  Is the problem strictly that we are
>>> trying to downcast the float to an int, or is it that we are trying to
>>> downcast to a lower precision?  Is there a way for one to explicitly relax
>>> the same_kind restriction?
>>>
>>
>> I think the problem is down casting across kinds, with the result that
>> floats are truncated and the imaginary parts of imaginaries might be
>> discarded. That is, the value, not just the precision, of the rhs changes.
>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an
>> integer.
>>
>> It is true that this forces downstream to code up to a higher standard,
>> but I don't see that as a bad thing, especially if it exposes bugs. And it
>> isn't difficult to fix.
>>
>> Chuck
>>
>>
> Mind you, in my case, casting the rhs as an integer before doing the
> multiplication would be a bug, since our value for the rhs is usually
> between zero and one.  Multiplying first by the integer numerator before
> dividing by the integer denominator would likely cause issues with
> overflowing the 16 bit integer.
>
>
For the case in point I'd do

In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16)

In [2]: a //= 2

In [3]: a
Out[3]: array([0, 1, 1, 2, 2], dtype=int16)

Although I expect you would want something different in practice. But the
current code already looks fragile to me and I think it is a good thing you
are taking a closer look at it. If you really intend going through a float,
then it should be something like

a = (a*(float(128)/256)).astype(int16)

Chuck

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

Reply via email to