On Apr 11, 2011, at 3:55 PM, Charles R Harris wrote: > > > On Mon, Apr 11, 2011 at 2:31 PM, Mark Wiebe <mwwi...@gmail.com> wrote: > On Mon, Apr 11, 2011 at 12:37 PM, Robert Kern <robert.k...@gmail.com> wrote: > On Mon, Apr 11, 2011 at 13:54, Skipper Seabold <jsseab...@gmail.com> wrote: > > All, > > > > We noticed some failing tests for statsmodels between numpy 1.5.1 and > > numpy >= 1.6.0. These are the versions where I noticed the change. It > > seems that when you divide a float array and multiply by a boolean > > array the answers are different (unless the others are also off by > > some floating point). Can anyone replicate the following using this > > script or point out what I'm missing? > > > > import numpy as np > > print np.__version__ > > np.random.seed(12345) > > > > test = np.random.randint(0,2,size=10).astype(bool) > > t = 1.345 > > arr = np.random.random(size=10) > > > > arr # okay > > t/arr # okay > > (1-test)*arr # okay > > (1-test)*t/arr # huh? > > [~] > |12> 1-test > array([1, 0, 0, 0, 1, 0, 1, 1, 0, 1], dtype=int8) > > [~] > |13> (1-test)*t > array([ 1.34472656, 0. , 0. , 0. , > 1.34472656, 0. , > 1.34472656, 1.34472656, 0. , 1.34472656], dtype=float16) > > Some of the recent ufunc changes or the introduction of the float16 > type must have changed the way the dtype is chosen for the > "int8-array*float64-scalar" case. Previously, the result was a float64 > array, as expected. > > Mark, I expect this is a result of one of your changes. Can you take a > look at this? > > It's the type promotion rules change that is causing this, and it's > definitely a 1.6.0 release blocker. Good catch! > > I can think of an easy, reasonable way to fix it in the result_type function, > but since the ufuncs themselves use a poor method of resolving the types, it > will take some thought to apply the same idea there. Maybe detecting that the > ufunc is a binary op at its creation time, setting a flag, and using the > result_type function in that case. This would have the added bonus of being > faster, too. > > Going back to the 1.5 code isn't an option, since adding the new data types > while maintaining ABI compatibility required major surgery to this part of > the system. > > > There isn't a big rush here, so take the time work it through.
I agree with Charles. Let's take the needed time and work this through. This is the sort of thing I was a bit nervous about with the changes made to the casting rules. Right now, I'm not confident that the scalar-array casting rules are being handled correctly. From your explanation, I don't understand how you intend to solve the problem. Surely there is a better approach than special casing the binary op and this work-around flag (and why is the result_type even part of the discussion)? It would be good to see a simple test case and understand why the boolean multiplied by the scalar double is becoming a float16. In other words, why does (1-test)*t return a float16 array This does not sound right at all and it would be good to understand why this occurs, now. How are you handling scalars multiplied by arrays in general? -Travis
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion