On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote:

> 
> 
> On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> 
> 
> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris <charlesr.har...@gmail.com> 
> wrote:
> 
> 
> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> 
> 
> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris <charlesr.har...@gmail.com> 
> wrote:
> 
> 
> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root <ben.r...@ou.edu> wrote:
> 
> 
> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <charlesr.har...@gmail.com> 
> wrote:
> 
> 
> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant <tra...@continuum.io> wrote:
> 
> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
> 
> > Consider the following code:
> >
> > import numpy as np
> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> > a *= float(255) / 15
> >
> > In v1.6.x, this yields:
> > array([17, 34, 51, 68, 85], dtype=int16)
> >
> > But in master, this throws an exception about failing to cast via same_kind.
> >
> > Note that numpy was smart about this operation before, consider:
> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> > a *= float(128) / 256
> >
> > yields:
> > array([0, 1, 1, 2, 2], dtype=int16)
> >
> > Of course, this is different from doing the operation out of place:
> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
> >
> > which yields an array with floating-point dtype in both versions.  I can 
> > appreciate the arguments for preventing this kind of implicit casting 
> > between non-same_kind dtypes, but I would argue that because the operation 
> > is in-place, I (as the programmer) am explicitly stating that I want the 
> > current array to store the results of the operation, dtype and all.  
> > Obviously, we can't turn this rule off completely (for example, an 
> > in-place addition between an integer array and a datetime64 makes no 
> > sense), but surely there is some happy medium that would allow these sorts 
> > of operations to take place?
> >
> > Lastly, if it is determined that it is desirable to allow in-place 
> > operations to continue working like they have before, I would like to see 
> > such a fix in v1.7 because if it isn't in 1.7, then other libraries (such 
> > as matplotlib, where this issue was first found) would have to change their 
> > code anyway just to be compatible with numpy.
> 
> I agree that in-place operations should allow different casting rules.  There 
> are different opinions on this, of course, but generally this is how NumPy 
> has worked in the past.
> 
> We did decide to change the default casting rule to "same_kind" but making an 
> exception for in-place seems reasonable.
> 
> I think that in these cases same_kind will flag what are most likely 
> programming errors and sloppy code. It is easy to be explicit, and doing so 
> makes the code more readable because it is immediately obvious what the 
> multiplicand is, without needing to recall what the numpy casting rules are 
> in this exceptional case. ISTR several mentions of this before (Gael?), and 
> in some of those cases bugs were indeed turned up. Catching bugs with 
> minimal effort is a good thing.
> 
> Chuck 
> 
> 
> True, it is quite likely to be a programming error, but then again, there are 
> many cases where it isn't.  Is the problem strictly that we are trying to 
> downcast the float to an int, or is it that we are trying to downcast to a 
> lower precision?  Is there a way for one to explicitly relax the same_kind 
> restriction?
> 
> I think the problem is downcasting across kinds, with the result that floats 
> are truncated and the imaginary parts of complex values might be discarded. 
> That is, the value, not just the precision, of the rhs changes. So I'd favor 
> an explicit cast in code like this, i.e., cast the rhs to an integer.
> 
> It is true that this forces downstream projects to code to a higher 
> standard, but I don't see that as a bad thing, especially if it exposes 
> bugs. And it isn't difficult to fix.
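> 
> For the original example, the explicit version would be something like this 
> (a sketch; it only works because the scale happens to be a whole number):
> 
> import numpy as np
> a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> a *= int(float(255) / 15)  # rhs is now an int, so same_kind is satisfied
> # a -> array([17, 34, 51, 68, 85], dtype=int16), matching the 1.6.x result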
> 
> Chuck 
> 
> 
> Mind you, in my case, casting the rhs to an integer before doing the 
> multiplication would be a bug, since our value for the rhs is usually 
> between zero and one.  Multiplying first by the integer numerator before 
> dividing by the integer denominator would likely overflow the 16-bit 
> integers.
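> 
> To illustrate (a sketch with made-up values; int16 tops out at 32767):
> 
> import numpy as np
> a = np.array([300, 400, 500], dtype=np.int16)
> a *= 128   # multiplying by the numerator first wraps: 300*128 = 38400
> a //= 256  # so the subsequent divide just produces garbage
> # a -> array([-106, -56, -6], dtype=int16) instead of [150, 200, 250]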
> 
> 
> For the case in point I'd do
> 
> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> 
> In [2]: a //= 2 
> 
> In [3]: a
> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) 
> 
> Although I expect you would want something different in practice. But the 
> current code already looks fragile to me, and I think it is a good thing you 
> are taking a closer look at it. If you really intend to go through a float, 
> then it should be something like
> 
> a = (a * (float(128) / 256)).astype(np.int16)
> 
> Chuck
> 
> 
> And thereby losing the memory benefit of an in-place multiplication?
> 
> What makes you think you are getting that? I'd have to check the numpy C 
> source, but I expect the multiplication is handled just as I wrote it out. I 
> don't recall any loops that handle mixed types like that. I'd like to see 
> some, though; scaling integers is a common problem.

>   
> That is sort of the point of all this.  We are using 16-bit integers because 
> we wanted to be as efficient as possible and didn't need anything larger.  
> Note that this is what we changed the code to; I am just wondering if we are 
> being too cautious.  The casting kwarg looks to be what I might want, though 
> it isn't as clean as just writing an "*=" statement.
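> 
> Something like this, I think (an untested sketch using the ufunc casting 
> kwarg; 'unsafe' restores the old truncating behavior):
> 
> import numpy as np
> a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> # in-place multiply, explicitly allowing the float -> int16 downcast
> np.multiply(a, float(255) / 15, out=a, casting='unsafe')
> # a -> array([17, 34, 51, 68, 85], dtype=int16)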
> 
> 
> I think even there you will have an intermediate float array followed by a 
> cast.

This is true, but it is done in chunks of a fixed size (controllable by a 
thread-local variable or keyword argument to the ufunc).    
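
If I remember right, the thread-local control is exposed as np.setbufsize / 
np.getbufsize (a sketch; sizes are in elements, the default is 8192, and 
setbufsize returns the previous value):

import numpy as np
old = np.setbufsize(16 * 8192)  # enlarge the ufunc working buffer
print(np.getbufsize())          # -> 131072
np.setbufsize(old)              # restore the previous setting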

How difficult would it be to change in-place operations back to the "unsafe" 
default?   

-Travis


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
