Re: [Numpy-discussion] Odd numerical difference between Numpy 1.5.1 and Numpy > 1.5.1

Travis Oliphant Mon, 11 Apr 2011 20:49:01 -0700

On Apr 11, 2011, at 3:55 PM, Charles R Harris wrote:

> 
> 
> On Mon, Apr 11, 2011 at 2:31 PM, Mark Wiebe <mwwi...@gmail.com> wrote:
> On Mon, Apr 11, 2011 at 12:37 PM, Robert Kern <robert.k...@gmail.com> wrote:
> On Mon, Apr 11, 2011 at 13:54, Skipper Seabold <jsseab...@gmail.com> wrote:
> > All,
> >
> > We noticed some failing tests for statsmodels between numpy 1.5.1 and
> > numpy >= 1.6.0. These are the versions where I noticed the change. It
> > seems that when you divide a float array and multiply by a boolean
> > array the answers are different (unless the others are also off by
> > some floating point). Can anyone replicate the following using this
> > script or point out what I'm missing?
> >
> > import numpy as np
> > print np.__version__
> > np.random.seed(12345)
> >
> > test = np.random.randint(0,2,size=10).astype(bool)
> > t = 1.345
> > arr = np.random.random(size=10)
> >
> > arr # okay
> > t/arr # okay
> > (1-test)*arr # okay
> > (1-test)*t/arr # huh?
> 
> [~]
> |12> 1-test
> array([1, 0, 0, 0, 1, 0, 1, 1, 0, 1], dtype=int8)
> 
> [~]
> |13> (1-test)*t
> array([ 1.34472656,  0.        ,  0.        ,  0.        ,
> 1.34472656,  0.        ,
>        1.34472656,  1.34472656,  0.        ,  1.34472656], dtype=float16)
> 
> Some of the recent ufunc changes or the introduction of the float16
> type must have changed the way the dtype is chosen for the
> "int8-array*float64-scalar" case. Previously, the result was a float64
> array, as expected.
> 
> Mark, I expect this is a result of one of your changes. Can you take a
> look at this?
> 
> It's the type promotion rules change that is causing this, and it's 
> definitely a 1.6.0 release blocker. Good catch!
> 
> I can think of an easy, reasonable way to fix it in the result_type function, 
> but since the ufuncs themselves use a poor method of resolving the types, it 
> will take some thought to apply the same idea there. Maybe detecting that the 
> ufunc is a binary op at its creation time, setting a flag, and using the 
> result_type function in that case. This would have the added bonus of being 
> faster, too.
> 
> Going back to the 1.5 code isn't an option, since adding the new data types 
> while maintaining ABI compatibility required major surgery to this part of 
> the system.
> 
> 
> There isn't a big rush here, so take the time work it through.


I agree with Charles.   Let's take the needed time and work this through.   
This is the sort of thing I was a bit nervous about with the changes made to 
the casting rules.    Right now, I'm not confident that the scalar-array 
casting rules are being handled correctly. 

From your explanation, I don't understand how you intend to solve the problem.  
 Surely there is a better approach than special casing the binary op and this 
work-around flag (and why is the result_type even part of the discussion)? 

It would be good to see a simple test case and understand why the boolean 
multiplied by the scalar double is becoming a float16.     In other words,  why 
does

(1-test)*t 

return a float16 array

This does not sound right at all and it would be good to understand why this 
occurs, now.   How are you handling scalars multiplied by arrays in general?  


-Travis

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Odd numerical difference between Numpy 1.5.1 and Numpy > 1.5.1

Reply via email to